Multimodal & Free Model Routing
Multimodal Support
Files: assistant/multimodal.go (74 lines), assistant/multimodal_query.go (251 lines)
The multimodal system enables agents to process and generate content containing text, images, and audio through a unified interface.
Content Types
| Type | Constant | Usage |
|---|---|---|
| Text | ContentTypeText | Plain text messages |
| Image | ContentTypeImage | Image via URL (ImageURL struct) |
| Audio | ContentTypeAudio | Audio via raw bytes (AudioData struct with format + data) |
Key Structs
type ContentPart struct {
	Type     ContentType `json:"type"`
	Text     string      `json:"text,omitempty"`
	ImageURL *ImageURL   `json:"image_url,omitempty"`
	Audio    *AudioData  `json:"audio,omitempty"`
}
Helper constructors: NewTextPart(), NewImagePart(url), NewAudioPart(format, data)
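As a rough illustration of how the helper constructors might populate ContentPart, here is a minimal sketch. The constant string values, the fields of ImageURL and AudioData, and the text parameter of NewTextPart are assumptions made for the example, not taken from assistant/multimodal.go.

```go
package main

import "fmt"

// ContentType identifies the kind of payload carried by a ContentPart.
type ContentType string

// Constant names come from the table above; the string values are assumed.
const (
	ContentTypeText  ContentType = "text"
	ContentTypeImage ContentType = "image_url"
	ContentTypeAudio ContentType = "audio"
)

// ImageURL field name is an assumption for illustration.
type ImageURL struct {
	URL string `json:"url"`
}

// AudioData holds a format and raw bytes; field names are assumptions.
type AudioData struct {
	Format string `json:"format"`
	Data   []byte `json:"data"`
}

type ContentPart struct {
	Type     ContentType `json:"type"`
	Text     string      `json:"text,omitempty"`
	ImageURL *ImageURL   `json:"image_url,omitempty"`
	Audio    *AudioData  `json:"audio,omitempty"`
}

// NewTextPart builds a plain-text part (the text parameter is assumed).
func NewTextPart(text string) ContentPart {
	return ContentPart{Type: ContentTypeText, Text: text}
}

// NewImagePart builds an image part that references the image by URL.
func NewImagePart(url string) ContentPart {
	return ContentPart{Type: ContentTypeImage, ImageURL: &ImageURL{URL: url}}
}

// NewAudioPart builds an audio part from a format label and raw bytes.
func NewAudioPart(format string, data []byte) ContentPart {
	return ContentPart{Type: ContentTypeAudio, Audio: &AudioData{Format: format, Data: data}}
}

func main() {
	parts := []ContentPart{
		NewTextPart("What is in this picture?"),
		NewImagePart("https://example.com/cat.png"),
		NewAudioPart("wav", []byte{0x52, 0x49, 0x46, 0x46}),
	}
	for _, p := range parts {
		fmt.Println(p.Type)
	}
}
```

Each constructor sets exactly one payload field, so the omitempty tags keep the unused fields out of the serialized JSON.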
MultimodalQuery
MultimodalQuery(opts) sends multimodal content to OpenRouter's chat completions API:
- Default model: moonshotai/kimi-k2.5 (for images)
- Timeout: 120 seconds
- Encoding: Images are sent as URLs; audio is base64-encoded
QueryMultimodal() is the high-level wrapper that:
- Loads agent config to get model assignment
- Uses moonshotai/kimi-k2.5 as the default model for images
- Encodes media content as base64 where needed
- Calls MultimodalQuery
- Persists to chat history if ChatID is provided
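Two of the wrapper's decisions, model selection and conditional persistence, can be sketched as below. Both helpers are hypothetical; in particular, treating an empty configured model as "no assignment" and an empty ChatID as "not provided" are assumptions made for this example.

```go
package main

import "fmt"

// defaultImageModel is the default named in the list above.
const defaultImageModel = "moonshotai/kimi-k2.5"

// resolveModel is a hypothetical helper: it prefers the model assigned in the
// agent config and falls back to the default when none is set.
func resolveModel(configured string) string {
	if configured != "" {
		return configured
	}
	return defaultImageModel
}

// shouldPersist reports whether the exchange should be written to chat
// history, assuming an empty ChatID means "not provided".
func shouldPersist(chatID string) bool {
	return chatID != ""
}

func main() {
	fmt.Println(resolveModel(""))
	fmt.Println(shouldPersist("chat-123"))
}
```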
Free Model Routing
File: assistant/free-models.go (391 lines)
The FreeModelsProvider manages a JSON config (free-models.json) that controls model selection for free-tier, greeting, and fallback scenarios.
Free Model Struct
{
  "id": "glm45-air",
  "name": "GLM 4.5 Air",
  "model": "openrouter/z-ai/glm-4.5-air:free",
  "provider": "openrouter",
  "priority": 10,
  "use_for_interim": true,
  "use_for_greeting": true,
  "use_for_fallback": true,
  "is_local": false,
  "description": "Free-tier model for greetings and interim responses"
}
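A Go struct that decodes an entry of this shape might look like the sketch below. The JSON keys are taken from the example above; the struct name FreeModel, its Go field names, and the parseFreeModel helper are assumptions, not the definitions in assistant/free-models.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FreeModel mirrors one entry of free-models.json (Go field names are assumed).
type FreeModel struct {
	ID             string `json:"id"`
	Name           string `json:"name"`
	Model          string `json:"model"`
	Provider       string `json:"provider"`
	Priority       int    `json:"priority"`
	UseForInterim  bool   `json:"use_for_interim"`
	UseForGreeting bool   `json:"use_for_greeting"`
	UseForFallback bool   `json:"use_for_fallback"`
	IsLocal        bool   `json:"is_local"`
	Description    string `json:"description"`
}

// sample is the entry shown above, inlined so the example is self-contained.
const sample = `{
  "id": "glm45-air",
  "name": "GLM 4.5 Air",
  "model": "openrouter/z-ai/glm-4.5-air:free",
  "provider": "openrouter",
  "priority": 10,
  "use_for_interim": true,
  "use_for_greeting": true,
  "use_for_fallback": true,
  "is_local": false,
  "description": "Free-tier model for greetings and interim responses"
}`

// parseFreeModel is a hypothetical helper that decodes a single entry.
func parseFreeModel(raw []byte) (FreeModel, error) {
	var m FreeModel
	err := json.Unmarshal(raw, &m)
	return m, err
}

func main() {
	m, err := parseFreeModel([]byte(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(m.ID, m.Model, m.Priority)
}
```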
Routing Logic
| Scenario | Function | Fallback |
|---|---|---|
| Interim responses | GetInterimFreeModel() | GreetingModel constant |
| Greeting messages | GetGreetingFreeModel() | GreetingModel constant |
| Fallback chain | GetFreeModelsFallbackChain() | [FallbackFreeModel, FallbackFlashModel] |
| Tool-aware fallback | GetFreeModelsFallbackChainToolAware(hasTools) | Filters "flash" models when tools present |
Tool-Aware Filtering
When hasTools=true, GetFreeModelsFallbackChainToolAware excludes models with "flash" in their name from the fallback chain, because flash models (e.g., gemini-2.5-flash) often fail to return tool calls reliably. If filtering would remove every model, the original chain is returned unchanged.
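The filtering rule can be sketched as a small standalone function. This is an illustration under assumptions, not the body of GetFreeModelsFallbackChainToolAware: the name filterToolAware is hypothetical, the chain is modeled as a slice of model identifiers, and the match is made case-insensitive.

```go
package main

import (
	"fmt"
	"strings"
)

// filterToolAware drops entries containing "flash" when tools are in use.
// If that would leave nothing, it returns the original chain unchanged.
func filterToolAware(chain []string, hasTools bool) []string {
	if !hasTools {
		return chain
	}
	filtered := make([]string, 0, len(chain))
	for _, model := range chain {
		// Case-insensitive matching is an assumption of this sketch.
		if !strings.Contains(strings.ToLower(model), "flash") {
			filtered = append(filtered, model)
		}
	}
	if len(filtered) == 0 {
		return chain
	}
	return filtered
}

func main() {
	chain := []string{"openrouter/z-ai/glm-4.5-air:free", "gemini-2.5-flash"}
	fmt.Println(filterToolAware(chain, true))
	fmt.Println(filterToolAware(chain, false))
}
```

Returning the original chain when the filter empties it trades tool-call reliability for availability: a flash model that may mishandle tools is still preferred over having no fallback at all.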
Thread Safety
All operations use sync.RWMutex for concurrent access. The config is loaded from free-models.json with lazy initialization and 3-hour caching.
See also: LLM Providers