Image generation

The image_generate tool generates PNG images from text prompts using cloud providers (OpenAI DALL-E 3, Replicate Flux). It writes the image to disk via Storage.writeAtomic, reports cost on ToolResult.cost_usd, and surfaces the provider's revised prompt when applicable.

Source

extensions/tools-image/src/index.ts. Factory: createImageTools(): Tool[].

Tool surface

| Field | Value |
| --- | --- |
| Name | image_generate |
| Toolset | image |
| maxResultChars | 1,000 |

Parameters

| Param | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| prompt | string | yes | | Text description of the image to generate. |
| output_path | string | no | ~/.ethos/generated/<timestamp>.png | File path to save the PNG. |
| size | string | no | 1024x1024 | One of 512x512 (Flux only), 1024x1024, 1024x1792, 1792x1024. |
| quality | string | no | standard | standard or hd. |
| provider | string | no | auto | openai-dalle, replicate-flux, or auto. |
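
As an illustration of the parameter table, the sketch below models the inputs as a TypeScript type and fills in the documented defaults. The names ImageGenerateParams and withDefaults are hypothetical and not part of the tool's source; using Date.now() for the timestamp and leaving "~" unexpanded are assumptions.

```typescript
// Hypothetical types mirroring the parameter table; names are illustrative only.
type ImageSize = '512x512' | '1024x1024' | '1024x1792' | '1792x1024';
type ImageQuality = 'standard' | 'hd';
type ProviderChoice = 'openai-dalle' | 'replicate-flux' | 'auto';

interface ImageGenerateParams {
  prompt: string;
  output_path?: string;
  size?: ImageSize;
  quality?: ImageQuality;
  provider?: ProviderChoice;
}

// Fill in the documented defaults for every optional parameter.
// Assumption: the timestamp is milliseconds since the epoch and "~" is left
// unexpanded here; the real tool may resolve the home directory itself.
function withDefaults(
  params: ImageGenerateParams,
  now: number = Date.now(),
): Required<ImageGenerateParams> {
  return {
    prompt: params.prompt,
    output_path: params.output_path ?? `~/.ethos/generated/${now}.png`,
    size: params.size ?? '1024x1024',
    quality: params.quality ?? 'standard',
    provider: params.provider ?? 'auto',
  };
}
```

Only prompt is required, so a call such as withDefaults({ prompt: 'A red bicycle' }) yields a fully populated parameter object.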

Return shape

On success, ToolResult.value is a JSON string:

{
  "path": "/home/user/.ethos/generated/1715600000000.png",
  "dimensions": { "width": 1024, "height": 1024 },
  "cost_usd": 0.04,
  "provider": "openai-dalle",
  "prompt_used": "A photorealistic cat sitting on a windowsill..."
}

ToolResult.cost_usd is set at the top level so AgentLoop aggregates it into session spend.

prompt_used contains the prompt the provider actually used. DALL-E 3 rewrites prompts for quality, so for openai-dalle this is the revised prompt. For Replicate Flux, prompt_used echoes the input prompt unchanged.
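
Because ToolResult.value is a JSON string rather than an object, a caller has to parse it before reading any field. A minimal sketch, assuming the shape shown above; the interface name ImageGenerateValue and the helper parseImageResult are hypothetical:

```typescript
// Hypothetical type for the parsed success payload, based on the sample JSON.
interface ImageGenerateValue {
  path: string;
  dimensions: { width: number; height: number };
  cost_usd: number;
  provider: string;
  prompt_used: string;
}

// Parse ToolResult.value and do a light sanity check on the fields we rely on.
// This is not a full schema validation, only a guard against obvious mismatches.
function parseImageResult(value: string): ImageGenerateValue {
  const parsed = JSON.parse(value) as Partial<ImageGenerateValue>;
  if (
    typeof parsed.path !== 'string' ||
    typeof parsed.cost_usd !== 'number' ||
    typeof parsed.prompt_used !== 'string'
  ) {
    throw new Error('Unexpected image_generate result shape');
  }
  return parsed as ImageGenerateValue;
}
```

Comparing prompt_used with the original prompt is one way to detect whether the provider revised it.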

Provider matrix

| Provider | Env var | Model | Supported sizes | Quality | Cost model |
| --- | --- | --- | --- | --- | --- |
| openai-dalle | OPENAI_API_KEY | DALL-E 3 | 1024x1024, 1024x1792, 1792x1024 | standard, hd | Per-size/quality table below |
| replicate-flux | REPLICATE_API_TOKEN | Flux Schnell | All sizes (including 512x512) | quality param ignored | $0.003 flat per image |
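
The size constraints in the matrix can be expressed as a lookup that is checked before any provider call. The sketch below is one way to do it, not the tool's actual implementation; SUPPORTED_SIZES and validateSize are hypothetical names, while the error code and ToolResult.code pairing come from the error table in this page.

```typescript
type Provider = 'openai-dalle' | 'replicate-flux';

// Sizes per provider, copied from the provider matrix.
const SUPPORTED_SIZES: Record<Provider, readonly string[]> = {
  'openai-dalle': ['1024x1024', '1024x1792', '1792x1024'],
  'replicate-flux': ['512x512', '1024x1024', '1024x1792', '1792x1024'],
};

interface SizeError {
  code: 'INVALID_SIZE_FOR_PROVIDER';
  resultCode: 'input_invalid';
  message: string;
}

// Return null when the size is acceptable, otherwise a structured error.
function validateSize(provider: Provider, size: string): SizeError | null {
  if (SUPPORTED_SIZES[provider].includes(size)) return null;
  return {
    code: 'INVALID_SIZE_FOR_PROVIDER',
    resultCode: 'input_invalid',
    message: `${provider} does not support size ${size}`,
  };
}
```

For example, validateSize('openai-dalle', '512x512') returns an error object, while the same size with 'replicate-flux' returns null.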

DALL-E 3 pricing

| Size | Standard | HD |
| --- | --- | --- |
| 1024x1024 | $0.04 | $0.08 |
| 1024x1792 | $0.08 | $0.12 |
| 1792x1024 | $0.08 | $0.12 |
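
To estimate spend before calling the tool, the DALL-E 3 prices can be encoded as a nested lookup. This is a sketch using the figures listed on this page; DALLE3_PRICE_USD and dalle3CostUsd are hypothetical names, and the prices may drift from the provider's current rates.

```typescript
type DalleSize = '1024x1024' | '1024x1792' | '1792x1024';
type Quality = 'standard' | 'hd';

// Per-image price in USD, keyed by size and then quality (values from this page).
const DALLE3_PRICE_USD: Record<DalleSize, Record<Quality, number>> = {
  '1024x1024': { standard: 0.04, hd: 0.08 },
  '1024x1792': { standard: 0.08, hd: 0.12 },
  '1792x1024': { standard: 0.08, hd: 0.12 },
};

// Look up the expected cost of a single DALL-E 3 image.
function dalle3CostUsd(size: DalleSize, quality: Quality = 'standard'): number {
  return DALLE3_PRICE_USD[size][quality];
}
```

An HD portrait (1024x1792, hd) looks up to 0.12, which matches the cost_usd reported in the HD example further down.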

Auto-pick logic

When provider is auto (the default) or omitted, the tool selects the first available provider in this order: openai-dalle, then replicate-flux. A provider is available when its env var is set. If neither OPENAI_API_KEY nor REPLICATE_API_TOKEN is set, the tool returns the IMAGE_GEN_NO_PROVIDER error.
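
The selection order described above can be sketched as a walk over an ordered list of provider and env var pairs. The function name pickProvider and the env argument are hypothetical; treating an empty string as "not set" is also an assumption, since the page only says the variable must be set.

```typescript
type Provider = 'openai-dalle' | 'replicate-flux';

// Priority order from the auto-pick rule: DALL-E first, then Flux.
const PROVIDER_ENV: ReadonlyArray<readonly [Provider, string]> = [
  ['openai-dalle', 'OPENAI_API_KEY'],
  ['replicate-flux', 'REPLICATE_API_TOKEN'],
];

// Return the first provider whose env var is present, or the documented
// error code when none is configured.
// Assumption: an empty string counts as unset.
function pickProvider(
  env: Record<string, string | undefined>,
): Provider | 'IMAGE_GEN_NO_PROVIDER' {
  for (const [provider, envVar] of PROVIDER_ENV) {
    if (env[envVar]) return provider;
  }
  return 'IMAGE_GEN_NO_PROVIDER';
}
```

In a Node.js process this could be called as pickProvider(process.env). When both variables are set, openai-dalle wins because it comes first in the list.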

Error codes

| Code | ToolResult.code | Meaning |
| --- | --- | --- |
| IMAGE_GEN_NO_PROVIDER | not_available | Neither OPENAI_API_KEY nor REPLICATE_API_TOKEN is set. |
| INVALID_SIZE_FOR_PROVIDER | input_invalid | The chosen provider does not support the size/quality combination. |
| IMAGE_GEN_REJECTED | execution_failed | Provider refused the prompt (content policy or safety filter). |
| IMAGE_GEN_QUOTA_EXCEEDED | execution_failed | Rate limit or quota exceeded (HTTP 429 or equivalent). |
| IMAGE_GEN_PROVIDER_UNAVAILABLE | execution_failed | Provider returned a server error or timed out. |
| OUTPUT_PATH_DENIED | execution_failed | Output path is outside the personality's fs_reach allowlist. |
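
Callers that branch on these errors may find it convenient to keep the code-to-category mapping in one place. The sketch below copies the pairs from the table; IMAGE_ERROR_CATEGORY and isImageErrorCode are hypothetical names, and how the specific code is exposed on a ToolResult is not covered by this page.

```typescript
type ToolResultCode = 'not_available' | 'input_invalid' | 'execution_failed';

// Specific error code -> general ToolResult.code, as listed in the table.
const IMAGE_ERROR_CATEGORY = {
  IMAGE_GEN_NO_PROVIDER: 'not_available',
  INVALID_SIZE_FOR_PROVIDER: 'input_invalid',
  IMAGE_GEN_REJECTED: 'execution_failed',
  IMAGE_GEN_QUOTA_EXCEEDED: 'execution_failed',
  IMAGE_GEN_PROVIDER_UNAVAILABLE: 'execution_failed',
  OUTPUT_PATH_DENIED: 'execution_failed',
} as const satisfies Record<string, ToolResultCode>;

type ImageErrorCode = keyof typeof IMAGE_ERROR_CATEGORY;

// Type guard: narrows an arbitrary string to a known image error code.
function isImageErrorCode(code: string): code is ImageErrorCode {
  return Object.prototype.hasOwnProperty.call(IMAGE_ERROR_CATEGORY, code);
}
```

Note that four distinct codes share execution_failed, so the general category alone is not enough to tell a policy rejection from a quota or availability problem.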

Examples

Basic generation

// In a personality's toolset.yaml, include:
// - image_generate

// The agent calls:
image_generate({ prompt: 'A watercolor painting of a mountain lake at dawn' })
// → { path: '~/.ethos/generated/1715600000000.png', dimensions: { width: 1024, height: 1024 },
//     cost_usd: 0.04, provider: 'openai-dalle', prompt_used: 'A serene watercolor...' }

HD portrait with explicit provider

image_generate({
  prompt: 'Professional headshot, studio lighting, neutral background',
  size: '1024x1792',
  quality: 'hd',
  provider: 'openai-dalle',
  output_path: '/tmp/headshot.png',
})
// → { path: '/tmp/headshot.png', dimensions: { width: 1024, height: 1792 },
//     cost_usd: 0.12, provider: 'openai-dalle', prompt_used: '...' }

Known limitations

  • No editing, inpainting, or variations. Generation from text prompt only.
  • No local Stable Diffusion. Cloud providers (OpenAI, Replicate) only.
  • No auto-retry on policy rejection. A rejected prompt fails; the agent must rephrase.
  • PNG only. Output is always PNG regardless of the output_path extension.
  • Replicate polling. Flux uses HTTP polling with a 120 s timeout.

See also