Multimodal And Free Models

Last updated: April 10, 2026

Multimodal & Free Model Routing

Multimodal Support

Files: assistant/multimodal.go (74 lines), assistant/multimodal_query.go (251 lines)

The multimodal system enables agents to process and generate content containing text, images, and audio through a unified interface.

Content Types

Type    Constant           Usage
Text    ContentTypeText    Plain text messages
Image   ContentTypeImage   Image via URL (ImageURL struct)
Audio   ContentTypeAudio   Audio via raw bytes (AudioData struct with format + data)

Key Structs

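// ContentPart is one piece of a multimodal message. Type indicates which
// of the optional fields below is populated.
type ContentPart struct {
    Type     ContentType `json:"type"`
    Text     string      `json:"text,omitempty"`      // set for ContentTypeText
    ImageURL *ImageURL   `json:"image_url,omitempty"` // set for ContentTypeImage
    Audio    *AudioData  `json:"audio,omitempty"`     // set for ContentTypeAudio
}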

Helper constructors: NewTextPart(), NewImagePart(url), NewAudioPart(format, data)

MultimodalQuery

MultimodalQuery(opts) sends multimodal content to OpenRouter's chat completions API:

  • Default model: moonshotai/kimi-k2.5 (for images)
  • Timeout: 120 seconds
  • Encoding: Images are sent as URLs; audio is base64-encoded

QueryMultimodal() is the high-level wrapper that:

  1. Loads agent config to get model assignment
  2. Uses moonshotai/kimi-k2.5 as the default model for images
  3. Encodes media content as base64 where needed
  4. Calls MultimodalQuery
  5. Persists to chat history if ChatID is provided

Free Model Routing

File: assistant/free-models.go (391 lines)

The FreeModelsProvider manages a JSON config (free-models.json) that controls model selection for free-tier, greeting, and fallback scenarios.

Free Model Struct

{
  "id": "glm45-air",
  "name": "GLM 4.5 Air",
  "model": "openrouter/z-ai/glm-4.5-air:free",
  "provider": "openrouter",
  "priority": 10,
  "use_for_interim": true,
  "use_for_greeting": true,
  "use_for_fallback": true,
  "is_local": false,
  "description": "Free-tier model for greetings and interim responses"
}
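
An entry like the one above can be decoded into a Go struct with matching JSON tags. The sketch below is illustrative: the JSON keys come from the example entry, while the Go type name FreeModel, its field names, and the parseFreeModel helper are assumptions rather than the definitions in assistant/free-models.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FreeModel mirrors one JSON entry; the Go field names are assumptions.
type FreeModel struct {
	ID             string `json:"id"`
	Name           string `json:"name"`
	Model          string `json:"model"`
	Provider       string `json:"provider"`
	Priority       int    `json:"priority"`
	UseForInterim  bool   `json:"use_for_interim"`
	UseForGreeting bool   `json:"use_for_greeting"`
	UseForFallback bool   `json:"use_for_fallback"`
	IsLocal        bool   `json:"is_local"`
	Description    string `json:"description"`
}

// sample reproduces the entry shown above.
const sample = `{
  "id": "glm45-air",
  "name": "GLM 4.5 Air",
  "model": "openrouter/z-ai/glm-4.5-air:free",
  "provider": "openrouter",
  "priority": 10,
  "use_for_interim": true,
  "use_for_greeting": true,
  "use_for_fallback": true,
  "is_local": false,
  "description": "Free-tier model for greetings and interim responses"
}`

// parseFreeModel is a hypothetical helper that decodes a single entry.
func parseFreeModel(raw []byte) (FreeModel, error) {
	var m FreeModel
	err := json.Unmarshal(raw, &m)
	return m, err
}

func main() {
	m, err := parseFreeModel([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s (priority %d, greeting=%t)\n", m.Model, m.Priority, m.UseForGreeting)
}
```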

Routing Logic

Scenario              Function                                       Fallback
Interim responses     GetInterimFreeModel()                          GreetingModel constant
Greeting messages     GetGreetingFreeModel()                         GreetingModel constant
Fallback chain        GetFreeModelsFallbackChain()                   [FallbackFreeModel, FallbackFlashModel]
Tool-aware fallback   GetFreeModelsFallbackChainToolAware(hasTools)  Filters "flash" models when tools present
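
The "configured model, else constant" pattern in the first two rows might look like the sketch below. Several things here are assumptions: the GreetingModel value is a placeholder, pickGreetingModel is a hypothetical stand-in for GetGreetingFreeModel(), and the rule that the highest priority value wins is a guess, since the source does not say how priority is ordered.

```go
package main

import "fmt"

// GreetingModel stands in for the constant named in the table; the value
// is a placeholder, not the project's real default.
const GreetingModel = "example/greeting-model"

// FreeModel is trimmed to the fields this sketch needs.
type FreeModel struct {
	Model          string
	Priority       int
	UseForGreeting bool
}

// pickGreetingModel returns the greeting-enabled model with the highest
// Priority (assumed ordering), or GreetingModel when none is flagged.
func pickGreetingModel(models []FreeModel) string {
	best := -1
	for i, m := range models {
		if !m.UseForGreeting {
			continue
		}
		if best == -1 || m.Priority > models[best].Priority {
			best = i
		}
	}
	if best == -1 {
		return GreetingModel
	}
	return models[best].Model
}

func main() {
	models := []FreeModel{
		{Model: "openrouter/z-ai/glm-4.5-air:free", Priority: 10, UseForGreeting: true},
	}
	fmt.Println(pickGreetingModel(models)) // the configured model
	fmt.Println(pickGreetingModel(nil))    // the constant fallback
}
```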

Tool-Aware Filtering

When hasTools=true, GetFreeModelsFallbackChainToolAware excludes models with "flash" in their name from the fallback chain, because flash models (e.g., gemini-2.5-flash) often fail to return tool calls reliably. If filtering would remove every model, the original chain is returned unchanged.
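
The filtering rule can be sketched as a small function over a slice of model names. filterToolAware is a hypothetical name, the chain is represented as plain strings for simplicity, and the case-insensitive match is an assumption; the source only says models with "flash" in their name are excluded.

```go
package main

import (
	"fmt"
	"strings"
)

// filterToolAware drops models whose name contains "flash" when tools are
// in use. If that would leave nothing, the original chain is returned.
func filterToolAware(chain []string, hasTools bool) []string {
	if !hasTools {
		return chain
	}
	filtered := make([]string, 0, len(chain))
	for _, model := range chain {
		// Case-insensitive matching is an assumption of this sketch.
		if strings.Contains(strings.ToLower(model), "flash") {
			continue
		}
		filtered = append(filtered, model)
	}
	if len(filtered) == 0 {
		return chain // never hand back an empty chain
	}
	return filtered
}

func main() {
	chain := []string{"openrouter/z-ai/glm-4.5-air:free", "gemini-2.5-flash"}
	fmt.Println(filterToolAware(chain, true))                        // flash model removed
	fmt.Println(filterToolAware([]string{"gemini-2.5-flash"}, true)) // original chain kept
}
```

Returning the unfiltered chain when everything is excluded trades tool-call reliability for availability: a flash model that may mishandle tools is still preferred over having no model at all.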

Thread Safety

All operations are guarded by a sync.RWMutex for safe concurrent access. The config is lazily loaded from free-models.json on first use and cached for 3 hours.


See also: LLM Providers