AI Model Configuration

When you send a message in Askimo, it might feel like a single AI model is handling everything - but that is not the case. Behind the scenes, Askimo uses several specialized models simultaneously, each optimized for a different job. The model you pick in the main chat settings is only one of them.

Think of it like a team at a company: there is a senior consultant (your main chat model) who handles the deep thinking, and a set of fast specialists who handle smaller jobs in the background - like filing paperwork, reviewing photos, or searching the knowledge base - without consuming the senior consultant’s time or budget.

Askimo ships with sensible defaults for every provider so things work out of the box. But you can override each model independently to match your own priorities - whether that is lower cost, faster responses, higher accuracy, or complete local privacy.

Different tasks call for very different model characteristics:

| Model Type | What it does | Why a dedicated model helps |
| --- | --- | --- |
| Main Chat Model | Answers your questions, reasons, writes code | You choose this in the main provider settings |
| Utility Model | Fast background tasks - generating a chat title, detecting your intent, routing to the right tool | A small, cheap model is 10–100× faster and costs a fraction of the main model |
| Vision Model | Understands images you attach to a conversation | Must support multimodal input; may differ from the chat model |
| Image Model | Generates images from text prompts | Completely separate generation pipeline |
| Embedding Model | Converts text to vectors for semantic search | Powers RAG (Project Knowledge) and MCP tool matching |

Imagine you open Askimo, type a question, and attach a screenshot:

  1. The utility model quickly reads your message and generates a short chat title in the sidebar.
  2. The vision model analyzes the screenshot you attached.
  3. The main chat model receives both the text and the image analysis, then writes the full response.
  4. If you have a project knowledge base open, the embedding model silently searches it for relevant context before the main model replies.

All of this happens in parallel, so you barely notice - but each step uses the model best suited for that job.
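
To make that flow concrete, here is a minimal sketch of the orchestration in Python. Everything in it is hypothetical - the function names, timings, and return values are stand-ins, not Askimo's actual code - but it shows why running the specialists concurrently keeps the main model from waiting:

```python
import asyncio

# A minimal sketch (not Askimo's actual code) of how the specialists can
# run concurrently. Every function and timing here is a hypothetical stand-in.

async def generate_title(message: str) -> str:
    await asyncio.sleep(0.1)           # utility model: fast, cheap call
    return message[:40]

async def analyze_image(image: bytes) -> str:
    await asyncio.sleep(0.5)           # vision model: multimodal input
    return "screenshot of a settings dialog"

async def search_knowledge(message: str) -> list[str]:
    await asyncio.sleep(0.2)           # embedding model: vector search
    return ["relevant document chunk"]

async def handle_message(message: str, image: bytes) -> str:
    # The background specialists run in parallel, not one after another.
    title, image_notes, context = await asyncio.gather(
        generate_title(message),
        analyze_image(image),
        search_knowledge(message),
    )
    # Only the main chat model produces the final answer, using everything above.
    return f"[{title}] answer using {image_notes!r} and {len(context)} context chunks"

print(asyncio.run(handle_message("Why is my build failing?", b"...")))
```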

Askimo’s defaults are chosen to work well for most users, but you may want to change them for reasons like:

  • Cost - swap the utility model to the cheapest tier; it only handles short background tasks, so top-tier quality is rarely needed
  • Speed - use a smaller, faster model for vision or utility tasks to reduce overall response time
  • Accuracy - use a larger embedding model for more precise RAG results
  • Privacy - point all models to a local Ollama instance so no data leaves your machine

To change these models:

  1. Open Settings (⌘, on macOS / Ctrl+, on Windows/Linux)
  2. Go to the AI Providers tab
  3. Your active provider’s Model Configuration card appears below the main provider settings

Each field saves automatically when you click away (you’ll see a ✓ checkmark confirming the save).


[Screenshot: The Model Configuration card in Askimo's AI Providers settings, showing editable fields for the utility, vision, image, and embedding models of the selected provider.]
Each provider has its own Model Configuration fields. For the OpenAI provider, the fields, defaults, and matching environment variables are:

| Field | Default | Environment Variable |
| --- | --- | --- |
| Available Models | (auto-detected) | ASKIMO_OPENAI_MODELS |
| Utility Model | | ASKIMO_OPENAI_UTILITY_MODEL |
| Utility Timeout | 45s | ASKIMO_OPENAI_UTILITY_TIMEOUT |
| Embedding Model | text-embedding-3-small | ASKIMO_OPENAI_EMBEDDING_MODEL |
| Vision Model | | ASKIMO_OPENAI_VISION_MODEL |
| Image Model | | ASKIMO_OPENAI_IMAGE_MODEL |
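
As a sketch of how a default-plus-override table like this typically resolves (the lookup code below is illustrative, not Askimo's implementation; the defaults mirror the table above):

```python
import os

# Defaults copied from the table above; the resolution logic itself is
# an illustration, not Askimo's actual code.
DEFAULTS = {
    "ASKIMO_OPENAI_EMBEDDING_MODEL": "text-embedding-3-small",
    "ASKIMO_OPENAI_UTILITY_TIMEOUT": "45s",
}

def resolve(var: str) -> str | None:
    # An explicit environment variable wins over the built-in default.
    return os.environ.get(var, DEFAULTS.get(var))

print(resolve("ASKIMO_OPENAI_EMBEDDING_MODEL"))  # text-embedding-3-small unless overridden
```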

The utility model handles fast, low-cost background tasks that don’t require the full power of your main chat model:

  • Generating chat titles automatically
  • Detecting user intent and routing to the correct tool
  • Summarizing conversation context
  • Running MCP tool classification

Best practice: Choose the smallest model that still produces coherent short text. For cloud providers this is typically a “mini” or “flash” tier model.
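
The Utility Timeout from the table above bounds how long these background calls may run. Here is a sketch of the idea, with a hypothetical generate_title standing in for the utility-model call:

```python
import asyncio

async def generate_title(message: str) -> str:
    # Hypothetical utility-model call; a real one would hit the provider API.
    await asyncio.sleep(0.1)
    return message[:40]

async def safe_title(message: str, timeout_s: float = 45.0) -> str:
    # If the utility model exceeds the timeout, fall back to a generic
    # title instead of blocking the conversation.
    try:
        return await asyncio.wait_for(generate_title(message), timeout_s)
    except asyncio.TimeoutError:
        return "New chat"

print(asyncio.run(safe_title("Why is my build failing?")))
```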


The vision model is called when you attach an image to a conversation. It must support multimodal (image + text) input. Typical tasks:

  • Analyzing screenshots and diagrams
  • Reading text from images (OCR-style)
  • Describing uploaded photos

The image model is used when Askimo generates images from a text prompt, for example:

  • Creating illustrations from descriptions
  • Generating UI mockups or diagrams on request

The embedding model converts text into vector representations used for semantic search. It powers:

  • RAG (Project Knowledge) - finding relevant document chunks for your question
  • Tool vector search - matching your intent to the right MCP tool
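
Both uses rely on the same principle: texts with similar meanings produce nearby vectors, and the closest vector wins. A toy illustration with made-up 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions, e.g. 1536 for text-embedding-3-small):

```python
import math

# Toy illustration of semantic search: the document chunk whose vector is
# closest (by cosine similarity) to the query vector is the best match.

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

chunks = {
    "installation guide": [0.9, 0.1, 0.0],
    "billing FAQ": [0.1, 0.8, 0.2],
}
query = [0.8, 0.2, 0.1]  # pretend this is the embedding of the user's question

best = max(chunks, key=lambda name: cosine(query, chunks[name]))
print(best)  # -> installation guide
```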

Embedding support by provider:

| Provider | Embedding Support | Notes |
| --- | --- | --- |
| OpenAI | ✅ Native | text-embedding-3-small (1536-d), text-embedding-3-large (3072-d) |
| Google Gemini | ✅ Native | gemini-embedding-001 (3072-d) |
| Ollama | ✅ Native | nomic-embed-text (768-d), mxbai-embed-large (1024-d) |
| Docker AI | ✅ Native | ai/qwen3-embedding:0.6B-F16 (1536-d) |
| LM Studio | ✅ Via server | Must load an embedding model in LM Studio |
| LocalAI | ✅ Via server | Configure embedding backend in LocalAI |
| Anthropic Claude | ❌ Not supported | Use another provider for embeddings |
| xAI (Grok) | ❌ Not supported | Use another provider for embeddings |

“Embedding dimension does not match store dimension”

This error means you changed the embedding model after a RAG index was already built: the old index was created with a different vector size. Rebuild the index:

  1. Go to Settings → RAG
  2. Select the affected project
  3. Click Rebuild Index
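
The rebuild is necessary because vectors of different lengths cannot be compared against each other. Using the OpenAI dimensions from the provider table above as an example:

```python
# Illustrative only: a vector store built at one dimension cannot accept
# or search vectors of another dimension.
old_index_dim = 1536   # index built with text-embedding-3-small
new_vector_dim = 3072  # vectors now produced by text-embedding-3-large

if new_vector_dim != old_index_dim:
    raise ValueError(
        f"Embedding dimension {new_vector_dim} does not match "
        f"store dimension {old_index_dim}"
    )
```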

Utility tasks are slow

Your utility model may be too large. Switch to a smaller, faster model (e.g., gemini-2.5-flash-lite, gpt-4o-mini, claude-haiku-3-5, or a sub-1B local model).

Vision model returns an error for image attachments

The configured vision model does not support multimodal input. Check the provider’s documentation and update the vision model to one that accepts images.

Image generation is not available

Image generation requires provider-side support. Verify that your provider account has access to image generation endpoints and that the image model name is correct.