LLM Configuration
Configure AI model providers, BYOK, and role-based model selection.
Ctrl AI uses a role-based model selection system: each stage of the inference pipeline is assigned the model best suited to that task.
Default Model Roles
| Role | Default Provider | Default Model | Purpose |
|---|---|---|---|
| parse | Gemini | gemini-2.5-flash | Query parsing (cheap/fast) |
| generate | Anthropic | claude-sonnet-4-6 | Unit generation (accuracy) |
| evaluate | Gemini | gemini-2.5-flash | Mode B unit execution |
| prose | Gemini | gemini-2.5-flash | Response generation |
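The role table above can be sketched as a simple lookup. All names here (ModelRole, DEFAULT_MODELS, resolveDefault) are illustrative, not Ctrl AI's actual API:

```typescript
// Hypothetical mapping of pipeline roles to their default provider/model,
// mirroring the table above.
type ModelRole = "parse" | "generate" | "evaluate" | "prose";

interface ModelChoice {
  provider: string;
  model: string;
}

const DEFAULT_MODELS: Record<ModelRole, ModelChoice> = {
  parse:    { provider: "gemini",    model: "gemini-2.5-flash" },
  generate: { provider: "anthropic", model: "claude-sonnet-4-6" },
  evaluate: { provider: "gemini",    model: "gemini-2.5-flash" },
  prose:    { provider: "gemini",    model: "gemini-2.5-flash" },
};

// Each pipeline stage asks for its role's default unless overridden.
function resolveDefault(role: ModelRole): ModelChoice {
  return DEFAULT_MODELS[role];
}
```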
Provider Priority
The system resolves which model to use in this order:
- Org BYOK keys — if the org has configured their own API key for a provider
- Org default provider — if the org has selected a default provider in settings
- Environment variables — system-level API keys
- System defaults — hardcoded fallbacks
Configuring via UI
Organization admins can override models in Settings > AI Models:
- Select a default AI provider (Gemini, Anthropic, OpenAI, OpenRouter)
- Choose a specific model for that provider
- Optionally enter your own API key (BYOK)
Settings are stored per-organization and take effect immediately.
BYOK (Bring Your Own Key)
Organizations can provide their own API keys for any supported provider. This means:
- The org is billed directly by the LLM provider
- Ctrl AI never sees or stores the actual API requests/responses on its infrastructure
- Full control over model selection and usage limits
Self-Hosted: Ollama (Fully Local)
For air-gapped or fully local deployments:
LLM_PROVIDER=ollama
LLM_BASE_URL=http://host.docker.internal:11434/v1
LLM_MODEL=llama3

This routes all LLM calls through your local Ollama instance. No data leaves your network.
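A minimal sketch of loading these variables with the documented defaults. The function and type names are illustrative, not Ctrl AI's code:

```typescript
// Hypothetical loader for the Ollama env vars shown above,
// falling back to the documented defaults when unset.
interface LocalLlmConfig {
  provider: string;
  baseUrl: string;
  model: string;
}

function loadLocalLlmConfig(
  env: Record<string, string | undefined>,
): LocalLlmConfig {
  return {
    provider: env.LLM_PROVIDER ?? "ollama",
    baseUrl: env.LLM_BASE_URL ?? "http://host.docker.internal:11434/v1",
    model: env.LLM_MODEL ?? "llama3",
  };
}
```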
Self-Hosted: OpenAI-Compatible
For any OpenAI-compatible API (Azure OpenAI, Together, Groq, vLLM, etc.):
LLM_PROVIDER=openai
LLM_BASE_URL=https://your-endpoint.com/v1
LLM_API_KEY=your-key
LLM_MODEL=your-model

Model Provenance
Every inference query logs which provider and model were used for each role. This is visible in:
- The inference response (modelsUsed field)
- Inference audit logs
- The compliance dashboard
This enables full traceability for regulatory requirements (EU AI Act Article 11).
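A provenance record might look like the following sketch. The modelsUsed field name comes from the docs above; every other field and value is hypothetical:

```typescript
// Illustrative shape of a per-query provenance record.
interface ModelUsage {
  role: string;     // parse | generate | evaluate | prose
  provider: string;
  model: string;
}

interface ProvenanceRecord {
  queryId: string;
  timestamp: string; // ISO 8601
  modelsUsed: ModelUsage[];
}

// Example record: one entry per pipeline role that ran for the query.
const example: ProvenanceRecord = {
  queryId: "q-123",
  timestamp: "2025-06-01T12:00:00Z",
  modelsUsed: [
    { role: "parse",    provider: "gemini",    model: "gemini-2.5-flash" },
    { role: "generate", provider: "anthropic", model: "claude-sonnet-4-6" },
  ],
};
```

Persisting one such record per query is what makes the audit logs and compliance dashboard able to answer "which model produced this output?" after the fact.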