Integrate New AI Providers in Askimo
This guide explains how to implement a new chat model provider in Askimo. By following these steps, you can integrate any chat model API with the Askimo CLI and Desktop.
Architecture Overview
Askimo uses a modular architecture for chat models with the following key components:
- ChatClient: Interface that defines the contract for all chat models (created by LangChain4j’s AiServices)
- ChatClientImpl: Wrapper that adds session management and memory persistence to ChatClient
- ChatModelFactory<T : ProviderSettings>: Generic interface for creating chat model instances, parameterized by the provider's settings type
- ProviderSettings: Interface for model-specific configuration with methods for validation, field management, and deep copying
- ModelProvider: Enum that identifies different model providers (OpenAI, XAI, Gemini, Ollama, Anthropic, LocalAI, LMStudio, Docker)
- ProviderRegistry: Central registry that manages all model factories using a map-based structure
- AiServiceBuilder: Centralized builder that assembles the full ChatClient stack (memory, tools, RAG, directives)
- TokenAwareSummarizingMemory: Advanced memory implementation that automatically summarizes conversation history when approaching token limits
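At runtime these pieces compose in one direction: the registry resolves a factory, the factory produces raw models and settings, and AiServiceBuilder assembles them into a ChatClient. A minimal sketch, assuming ExecutionMode.CHAT as a plausible enum value (the real values are not shown in this guide):

```kotlin
// Illustrative composition only — ExecutionMode.CHAT is an assumed value.
val factory = OpenAiModelFactory()       // normally resolved via ProviderRegistry
val settings = factory.defaultSettings() // provider-specific ProviderSettings

// create() hands the raw models to AiServiceBuilder, which layers on
// TokenAwareSummarizingMemory, tools, RAG, and directives, and wraps the
// result as ChatClientImpl for session persistence.
val client: ChatClient = factory.create(
    sessionId = "demo-session",
    settings = settings,
    toolProvider = null,
    retriever = null,
    executionMode = ExecutionMode.CHAT, // assumed value
    chatMemory = null,                  // null = start a fresh session
)
```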
Factory Responsibility
Each factory is responsible for creating the raw model objects only:
- createStreamingModel() — returns a StreamingChatModel for the main chat loop
- createSecondaryModel() / createModel() — returns a ChatModel for utility/structured tasks
- create() — delegates to AiServiceBuilder.buildChatClient(), passing the models above
Session management, tool wiring, RAG, directives, and memory are all handled centrally by AiServiceBuilder — you do not implement these in your factory.
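Put together, the contract a factory fulfills looks roughly like this. This is a paraphrase of the responsibilities above, not the literal interface source (see ChatModelFactory in io.askimo.core.providers for the authoritative definition):

```kotlin
// Approximate shape of the factory contract — names mirror the steps below.
interface ChatModelFactoryShape<T : ProviderSettings> {
    fun createStreamingModel(settings: T): StreamingChatModel // main chat loop
    fun createSecondaryModel(settings: T): ChatModel          // utility/structured tasks
    fun create(
        sessionId: String?,
        settings: T,
        toolProvider: ToolProvider?,
        retriever: ContentRetriever?,
        executionMode: ExecutionMode,
        chatMemory: ChatMemory?,
    ): ChatClient // delegates to AiServiceBuilder.buildChatClient()
}
```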
Implementation Steps
1. Add LangChain4j Dependency
Add the appropriate LangChain4j dependency to the build.gradle.kts file:
```kotlin
dependencies {
    implementation("dev.langchain4j:langchain4j:1.2.0")
    // Add your provider's LangChain4j module
    implementation("dev.langchain4j:langchain4j-your-provider:1.2.0")
}
```

Check the LangChain4j GitHub repository or Maven Central for available provider modules. If none exists, you can adapt OpenAiCompatibleModelFactory if your provider exposes an OpenAI-compatible API.
2. Add a New Provider Enum Value
Add your provider to the ModelProvider enum in io.askimo.core.providers.ModelProvider:
```kotlin
@Serializable
enum class ModelProvider {
    @SerialName("OPENAI") OPENAI,
    @SerialName("XAI") XAI,
    @SerialName("GEMINI") GEMINI,
    @SerialName("OLLAMA") OLLAMA,
    @SerialName("ANTHROPIC") ANTHROPIC,
    @SerialName("LOCALAI") LOCALAI,
    @SerialName("LMSTUDIO") LMSTUDIO,
    @SerialName("DOCKER") DOCKER,
    @SerialName("YOUR_PROVIDER") YOUR_PROVIDER, // Add here
    @SerialName("UNKNOWN") UNKNOWN,
}
```

3. Create Provider Settings
Create a settings class that implements ProviderSettings. Use marker interfaces like HasApiKey or HasBaseUrl for common configuration patterns:
```kotlin
@Serializable
data class YourProviderSettings(
    override var apiKey: String = "",
    override val defaultModel: String = "your-default-model",
) : ProviderSettings, HasApiKey {

    override fun describe(): List<String> {
        // Return human-readable description of settings (mask sensitive data)
    }

    override fun getFields(): List<SettingField> {
        // Return configurable fields for the UI
    }

    override fun updateField(fieldName: String, value: String): ProviderSettings {
        // Update a field and return a new settings instance
    }

    override fun validate(): Boolean {
        // Validate that settings are properly configured (e.g., API key is non-blank)
    }

    override fun getSetupHelpText(): String {
        // Return helpful guidance for first-time setup
    }

    override fun getConfigFields(): List<ProviderConfigField> {
        // Return fields for the provider setup wizard
    }

    override fun applyConfigFields(fields: Map<String, String>): ProviderSettings {
        // Apply configuration field values and return a new settings instance
    }

    override fun deepCopy(): ProviderSettings = copy()
}
```

For complete implementation examples, refer to:
- OpenAiSettings.kt — API key with secure keychain storage
- OllamaSettings.kt — base URL configuration
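To make the stubs concrete, here is one way the simpler methods could look for an API-key provider. This is a sketch, not Askimo's actual implementation; the field names api_key and model match the :set-param keys used in the testing section below:

```kotlin
override fun describe(): List<String> =
    listOf(
        // Mask the key so secrets never reach logs or the UI.
        "api_key: " + if (apiKey.isBlank()) "(not set)" else "****" + apiKey.takeLast(4),
        "model: $defaultModel",
    )

override fun validate(): Boolean = apiKey.isNotBlank()

override fun updateField(fieldName: String, value: String): ProviderSettings =
    when (fieldName) {
        "api_key" -> copy(apiKey = value)
        "model" -> copy(defaultModel = value)
        else -> this // unknown field: return the instance unchanged
    }
```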
4. Implement the Model Factory
Create a factory class that implements ChatModelFactory<T>. The factory creates the raw model objects; AiServiceBuilder handles everything else.
```kotlin
class YourProviderModelFactory : ChatModelFactory<YourProviderSettings> {

    private val log = logger<YourProviderModelFactory>()

    override fun getProvider(): ModelProvider = YOUR_PROVIDER

    override fun availableModels(settings: YourProviderSettings): List<ModelDTO> {
        val apiKey = settings.apiKey.takeIf { it.isNotBlank() } ?: return emptyList()
        return fetchModels(
            apiKey = apiKey,
            url = "https://api.yourprovider.com/v1/models",
            providerName = YOUR_PROVIDER,
        ).map { ModelDTO.of(YOUR_PROVIDER, it) }
    }

    override fun defaultSettings(): YourProviderSettings = YourProviderSettings()

    override fun getNoModelsHelpText(): String =
        """
        Make sure you have provided a valid API key in Settings.
        Get your API key from: https://yourprovider.com/api-keys
        """.trimIndent()

    override fun create(
        sessionId: String?,
        settings: YourProviderSettings,
        toolProvider: ToolProvider?,
        retriever: ContentRetriever?,
        executionMode: ExecutionMode,
        chatMemory: ChatMemory?,
    ): ChatClient =
        AiServiceBuilder.buildChatClient(
            sessionId = sessionId,
            settings = settings,
            provider = YOUR_PROVIDER,
            chatModel = createStreamingModel(settings),
            secondaryChatModel = createSecondaryModel(settings),
            chatMemory = chatMemory,
            toolProvider = toolProvider,
            retriever = retriever,
            executionMode = executionMode,
        )

    override fun createStreamingModel(settings: YourProviderSettings): StreamingChatModel {
        val httpClientBuilder = ProxyUtil.configureProxy(HttpClient.newBuilder())
        val jdkHttpClientBuilder = JdkHttpClient.builder().httpClientBuilder(httpClientBuilder)
        val telemetry = AppContext.getInstance().telemetry

        return YourProviderStreamingChatModel.builder()
            .httpClientBuilder(jdkHttpClientBuilder)
            .apiKey(safeApiKey(settings.apiKey))
            .modelName(settings.defaultModel)
            .timeout(Duration.ofSeconds(AppConfig.models.timeouts.defaultModelTimeoutSeconds))
            .logger(log)
            .logRequests(log.isDebugEnabled)
            .logResponses(log.isTraceEnabled)
            .listeners(listOf(TelemetryChatModelListener(telemetry, YOUR_PROVIDER.name.lowercase())))
            .build()
    }

    override fun createSecondaryModel(settings: YourProviderSettings): ChatModel {
        val httpClientBuilder = ProxyUtil.configureProxy(HttpClient.newBuilder())
        val jdkHttpClientBuilder = JdkHttpClient.builder().httpClientBuilder(httpClientBuilder)

        return YourProviderChatModel.builder()
            .httpClientBuilder(jdkHttpClientBuilder)
            .apiKey(safeApiKey(settings.apiKey))
            .modelName(AppConfig.models[YOUR_PROVIDER].utilityModel.ifBlank { settings.defaultModel })
            .timeout(Duration.ofSeconds(AppConfig.models.timeouts.utilityModelTimeoutSeconds))
            .build()
    }

    override fun createModel(settings: YourProviderSettings): ChatModel {
        val httpClientBuilder = ProxyUtil.configureProxy(HttpClient.newBuilder())
        val jdkHttpClientBuilder = JdkHttpClient.builder().httpClientBuilder(httpClientBuilder)
        val telemetry = AppContext.getInstance().telemetry

        return YourProviderChatModel.builder()
            .httpClientBuilder(jdkHttpClientBuilder)
            .apiKey(safeApiKey(settings.apiKey))
            .modelName(settings.defaultModel)
            .timeout(Duration.ofSeconds(AppConfig.models.timeouts.defaultModelTimeoutSeconds))
            .logger(log)
            .logRequests(log.isDebugEnabled)
            .logResponses(log.isTraceEnabled)
            .listeners(listOf(TelemetryChatModelListener(telemetry, YOUR_PROVIDER.name.lowercase())))
            .build()
    }

    override fun createUtilityClient(settings: YourProviderSettings): ChatClient =
        AiServices.builder(ChatClient::class.java)
            .chatModel(createSecondaryModel(settings))
            .build()
}
```

Key design points:
- No temperature/sampling — Do not set .temperature() on the model builder. Modern models (GPT-5, o-series, Gemini 2.5 Pro, etc.) only accept temperature=1.0 and throw invalid_request_error for any other value. Tone and style are controlled via Directives, which inject system messages — this works across all models, including reasoning models (see the sketch after this list).
- No manual memory construction — AiServiceBuilder creates and configures TokenAwareSummarizingMemory automatically.
- No tool wiring in factory — Tool registration and system prompts are handled centrally by AiServiceBuilder.
- Proxy support — Always use ProxyUtil.configureProxy() and pass the resulting JdkHttpClientBuilder to the model builder for HTTP proxy compatibility.
- Telemetry — Attach a TelemetryChatModelListener to both streaming and non-streaming models for usage tracking.
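To see what "injecting a system message" means in practice, here is a plain LangChain4j sketch, independent of Askimo's Directive implementation:

```kotlin
import dev.langchain4j.data.message.SystemMessage
import dev.langchain4j.data.message.UserMessage
import dev.langchain4j.model.chat.ChatModel

// Steer tone with a system message instead of temperature — this also
// works for reasoning models that reject temperature != 1.0.
fun formalAnswer(chatModel: ChatModel, question: String): String {
    val response = chatModel.chat(
        SystemMessage.from("Answer concisely and in a formal tone."),
        UserMessage.from(question),
    )
    return response.aiMessage().text()
}
```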
For complete working examples, refer to:
- OpenAiModelFactory.kt — API key, proxy support, telemetry
- OllamaModelFactory.kt — base URL, local process integration
- AnthropicModelFactory.kt — thinking mode probe with ModelCapabilitiesCache
- GeminiModelFactory.kt — thinking mode probe with ModelCapabilitiesCache
5. Register Your Factory
Add your factory to the ProviderRegistry in ProviderRegistry.kt:
```kotlin
object ProviderRegistry {
    private val factories: Map<ModelProvider, ChatModelFactory<*>> =
        mapOf(
            OPENAI to OpenAiModelFactory(),
            XAI to XAiModelFactory(),
            GEMINI to GeminiModelFactory(),
            OLLAMA to OllamaModelFactory(),
            ANTHROPIC to AnthropicModelFactory(),
            LOCALAI to LocalAiModelFactory(),
            LMSTUDIO to LmStudioModelFactory(),
            DOCKER to DockerAiModelFactory(),
            YOUR_PROVIDER to YourProviderModelFactory(), // Add here
        )
}
```

Once registered, your provider is available in both the CLI and Desktop.
Optional: Thinking Mode Support
If your provider supports extended thinking (like Anthropic and Gemini), probe for it once at model creation time and cache the result:
```kotlin
override fun create(...): ChatClient {
    if (!ModelCapabilitiesCache.hasTestedThinkingSupport(YOUR_PROVIDER, settings.defaultModel)) {
        val supportsThinking = probeThinkingSupport(settings)
        ModelCapabilitiesCache.setThinkingSupport(YOUR_PROVIDER, settings.defaultModel, supportsThinking)
    }
    return AiServiceBuilder.buildChatClient(...)
}

override fun createStreamingModel(settings: YourProviderSettings): StreamingChatModel {
    val supportsThinking = ModelCapabilitiesCache.supportsThinking(YOUR_PROVIDER, settings.defaultModel)
    return YourProviderStreamingChatModel.builder()
        .apply {
            if (supportsThinking) {
                // enable thinking config (e.g. thinkingConfig, sendThinking, returnThinking)
            }
        }
        .build()
}
```

The ModelCapabilitiesCache persists results to ~/.askimo/model-capabilities-cache.json, so the probe only runs once per model.
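The snippet above calls probeThinkingSupport(settings), which is not shown. One plausible implementation issues a minimal request with thinking enabled and treats a request error as "not supported" (a sketch; the helper name and the thinking-config builder call are assumptions, and the config must be replaced with your provider's actual option):

```kotlin
// Hypothetical probe — adapt the commented builder call to your provider.
private fun probeThinkingSupport(settings: YourProviderSettings): Boolean =
    try {
        val probe = YourProviderChatModel.builder()
            .apiKey(safeApiKey(settings.apiKey))
            .modelName(settings.defaultModel)
            // .thinkingConfig(...) // enable the provider's thinking option here
            .build()
        probe.chat("ping") // a trivial request; success means thinking was accepted
        true
    } catch (e: Exception) {
        log.debug("Thinking probe failed for ${settings.defaultModel}: ${e.message}")
        false
    }
```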
Optional: Embedding Support
Override supportsEmbedding() and createEmbeddingModel() if your provider offers embedding models:
```kotlin
override fun supportsEmbedding(): Boolean = true

override fun createEmbeddingModel(settings: YourProviderSettings): EmbeddingModel =
    YourProviderEmbeddingModel.builder()
        .apiKey(safeApiKey(settings.apiKey))
        .modelName(AppConfig.models[YOUR_PROVIDER].embeddingModel)
        .build()

override fun getEmbeddingTokenLimit(settings: YourProviderSettings): Int = 8191
```

Optional: Image Generation Support
Override createImageModel() if your provider supports image generation:
```kotlin
override fun createImageModel(settings: YourProviderSettings): ImageModel =
    YourProviderImageModel.builder()
        .apiKey(safeApiKey(settings.apiKey))
        .modelName(AppConfig.models[YOUR_PROVIDER].imageModel)
        .build()
```

OpenAI-Compatible Providers
If your provider exposes an OpenAI-compatible REST API, you do not need to create a new factory from scratch. Use OpenAiCompatibleModelFactory as a reference — it accepts a configurable baseUrl and routes all requests through LangChain4j’s OpenAI client pointed at your endpoint.
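As a rough sketch of that approach, using the langchain4j-open-ai module's builder (everything else mirrors step 4; see OpenAiCompatibleModelFactory for the full pattern):

```kotlin
import dev.langchain4j.model.openai.OpenAiStreamingChatModel

// Point LangChain4j's OpenAI client at your compatible endpoint.
override fun createStreamingModel(settings: YourProviderSettings): StreamingChatModel =
    OpenAiStreamingChatModel.builder()
        .baseUrl("https://api.yourprovider.com/v1") // your OpenAI-compatible endpoint
        .apiKey(safeApiKey(settings.apiKey))
        .modelName(settings.defaultModel)
        .build()
```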
Memory and Session Management
Memory and session persistence are fully managed by AiServiceBuilder and ChatClientImpl. Your factory does not need to:
- Construct TokenAwareSummarizingMemory
- Register session save/restore hooks
- Wire SessionMemoryRepository
The chatMemory parameter passed into create() is optional and provided by the caller when resuming an existing session. Pass it through to AiServiceBuilder.buildChatClient() unchanged.
Testing Your Implementation
After implementing your provider:
- Build and run the Askimo CLI
- Set your provider as the active provider:
askimo> :set-provider YOUR_PROVIDER
- Set required parameters:
askimo> :set-param api_key your-api-key
- List available models:
askimo> :models
- Chat with a specific model:
askimo> :set-param model your-model-name
askimo> What is the capital of Viet Nam?
Conclusion
By following these steps, you can integrate any chat model provider with Askimo. The modular architecture keeps provider-specific code minimal — factories create model objects, and AiServiceBuilder handles the rest.
Handle errors gracefully in availableModels() (return an empty list on failure) and provide clear getSetupHelpText() and getNoModelsHelpText() strings to guide users through configuration.
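For example, availableModels() can wrap its fetch in a try/catch (a sketch based on the step 4 factory; the catch-all is deliberately broad because the failure mode — bad key, network error — does not matter to the caller):

```kotlin
override fun availableModels(settings: YourProviderSettings): List<ModelDTO> {
    val apiKey = settings.apiKey.takeIf { it.isNotBlank() } ?: return emptyList()
    return try {
        fetchModels(
            apiKey = apiKey,
            url = "https://api.yourprovider.com/v1/models",
            providerName = YOUR_PROVIDER,
        ).map { ModelDTO.of(YOUR_PROVIDER, it) }
    } catch (e: Exception) {
        log.warn("Could not list models: ${e.message}") // users see getNoModelsHelpText() instead
        emptyList()
    }
}
```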