Gemini image generation and editing overview
Gemini image generation and editing are both handled through the Gemini-native generateContent endpoint. Text prompts go in contents[].parts[].text, and reference images go in parts[].inline_data.
Endpoint Path
| Method | Path | Purpose |
|---|---|---|
| POST | /v1beta/models/{model}:generateContent | Gemini image generation and image editing |
curl -X POST "https://api.routescope.ai/v1beta/models/gemini-2.5-flash-image:generateContent" \
  -H "Authorization: Bearer $ROUTESCOPE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "parts": [
          { "text": "Write a four-word Chinese short poem on the theme of urban night rain." }
        ]
      }
    ],
    "generationConfig": {
      "temperature": 0.7,
      "maxOutputTokens": 100000
    }
  }'

{
  "candidates": [],
  "usageMetadata": {},
  "modelVersion": "string"
}

Authorization
BearerAuth
Authentication for the model relay API. Request header: Authorization: Bearer followed by your API key.
In: header
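A minimal sketch of how the endpoint path and the Bearer header might be assembled in Python. The base URL and the `ROUTESCOPE_API_KEY` variable name are taken from the curl examples on this page; the function names are hypothetical.

```python
BASE_URL = "https://api.routescope.ai"  # base URL as used in this page's curl examples


def generate_content_url(model: str, base_url: str = BASE_URL) -> str:
    """Return the generateContent URL for a given Gemini model name."""
    return f"{base_url.rstrip('/')}/v1beta/models/{model}:generateContent"


def auth_headers(api_key: str) -> dict:
    """Return the request headers: Bearer authorization plus JSON content type."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

In practice the key would typically be read from the environment, for example `auth_headers(os.environ["ROUTESCOPE_API_KEY"])`, rather than hard-coded.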
Path Parameters
model: Gemini model name.
Request Body
application/json
contents: Input content array carrying one or more turns of messages between users, models, or tools. Each element is a Content object, usually consisting of a role and parts. Applies to scenarios such as text conversations, image/audio/video/document understanding, function calls, and multimodal generation. Array length and media sizes are subject to upstream model and operational configuration limits.
systemInstruction: Gemini system instruction.
generationConfig: Generation configuration, such as temperature, topK, topP, and maximum output length.
safetySettings: List of safety policy settings. Array length is subject to upstream or business configuration.
tools: Gemini tool definitions. Array length and schema complexity are subject to upstream limits.
Response Body
application/json
Model Selection
| Model ID | Capability | Typical Use |
|---|---|---|
| gemini-2.5-flash-image | Image generation, image editing | Low-latency image generation, local edits, text-image mixed responses, and multi-turn visual creation. |
| gemini-3.1-flash-image-preview | Image generation, image editing | Nano Banana 2 efficient generation, wide-aspect assets, and batch creative exploration. |
| gemini-3-pro-image-preview | Image generation, image editing | Professional assets, complex instructions, multi-turn editing, text rendering, and high-resolution output. |
Common Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| contents | array | Yes | Gemini content array. |
| contents[].parts[].text | string | Yes | Generation prompt or editing instruction. |
| contents[].parts[].inline_data | object | No | Input reference image, used for image editing. |
| generationConfig.responseModalities | string[] | No | Return modalities. For image generation, use ["IMAGE"] or ["TEXT","IMAGE"]. |
| generationConfig.imageConfig.aspectRatio | string | No | Aspect ratio. |
| generationConfig.candidateCount | integer | No | Candidate count. Keep it at 1 in most cases. |
| safetySettings | array | No | Gemini safety settings. |
Model-Specific Parameters
| Field | Applicable Models | Default / Range | Description |
|---|---|---|---|
| contents[].parts[].inline_data.mime_type / data | gemini-2.5-flash-image | Supports image/png, image/jpeg, image/webp, image/heic, image/heif | For 2.5 Flash, the MIME type and the Base64 data are documented as separate fields. |
| contents[].parts[].inline_data | gemini-3.1-flash-image-preview, gemini-3-pro-image-preview | Object containing mime_type and Base64 data | 3.x image models pass reference images as inline_data objects. |
| generationConfig.imageConfig.aspectRatio | gemini-2.5-flash-image, gemini-3-pro-image-preview | Default 1:1; supports 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 | Standard aspect-ratio set. |
| generationConfig.imageConfig.aspectRatio | gemini-3.1-flash-image-preview | Default 1:1; supports 1:1, 1:4, 1:8, 2:3, 3:2, 3:4, 4:1, 4:3, 4:5, 5:4, 8:1, 9:16, 16:9, 21:9 | Supports wider and narrower special aspect ratios. |
| generationConfig.imageConfig.imageSize | gemini-3-pro-image-preview | Default 1K; supports up to 4K depending on the channel | Listed only for Gemini 3 Pro Image. |
Generation Example
curl "https://api.routescope.ai/v1beta/models/gemini-3.1-flash-image-preview:generateContent" \
  -H "Authorization: Bearer $ROUTESCOPE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{ "role": "user", "parts": [{ "text": "Generate a 21:9 AI model routing banner for a technical documentation homepage" }] }],
    "generationConfig": {
      "responseModalities": ["IMAGE"],
      "imageConfig": { "aspectRatio": "21:9" }
    }
  }'
Editing Example
curl "https://api.routescope.ai/v1beta/models/gemini-2.5-flash-image:generateContent" \
  -H "Authorization: Bearer $ROUTESCOPE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [
          { "text": "Convert this image into a flat illustration style while keeping the main composition" },
          {
            "inline_data": {
              "mime_type": "image/png",
              "data": "BASE64_IMAGE_DATA"
            }
          }
        ]
      }
    ],
    "generationConfig": { "responseModalities": ["IMAGE"] }
  }'
Response Structure
Gemini image responses use the Gemini-native structure. Images are usually returned in candidates[].content.parts[].inlineData, and the response may also include usageMetadata. Do not rewrite it as OpenAI-style data[].url.
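A hedged sketch of pulling images out of a parsed response in Python. It assumes the image part carries Base64 `data` and a `mimeType` key inside `inlineData`; accepting the snake_case spellings as well is a defensive assumption, not something this page states. `extract_images` is a hypothetical helper.

```python
import base64


def extract_images(response: dict) -> list[tuple[str, bytes]]:
    """Collect (mime_type, raw_bytes) pairs from candidates[].content.parts[].inlineData."""
    images = []
    for candidate in response.get("candidates", []):
        content = candidate.get("content") or {}
        for part in content.get("parts", []):
            # Assumption: tolerate both camelCase and snake_case key spellings.
            inline = part.get("inlineData") or part.get("inline_data")
            if not inline or "data" not in inline:
                continue  # text parts and empty parts are skipped
            mime = inline.get("mimeType") or inline.get("mime_type") or "application/octet-stream"
            images.append((mime, base64.b64decode(inline["data"])))
    return images
```

Each returned tuple can be written straight to disk, for example with `open("out.png", "wb").write(raw_bytes)`; text parts in a mixed ["TEXT","IMAGE"] response are ignored by this helper.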
Notes
- Billing follows the model and channel ratios configured in the backend.
- High resolution significantly increases latency and consumption.
- Preview model availability may change. Configure fallback models for production.
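The fallback advice in the notes above could be implemented on the client side roughly as follows. This is a sketch under assumptions: `call_model` is a caller-supplied function (hypothetical) that sends the request for one model and raises on failure, and any exception is treated as a reason to try the next model. The relay may also offer its own fallback configuration, which this page does not describe.

```python
from typing import Any, Callable, Sequence


def generate_with_fallback(models: Sequence[str],
                           call_model: Callable[[str], Any]) -> tuple[str, Any]:
    """Try each model in order and return (model, result) for the first success."""
    errors = []
    for model in models:
        try:
            return model, call_model(model)
        except Exception as exc:  # assumption: every failure triggers fallback
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

A production caller would likely narrow the caught exception types (for example, to HTTP errors indicating that the model is unavailable) so that invalid requests fail fast instead of being retried on every model.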