Configuration

LLM models

Configure a separate model for each role (agent, simulator, judge) using LiteLLM model strings in provider/model form:

from voicetest.models.test_case import RunOptions

options = RunOptions(
    agent_model="openai/gpt-4o-mini",
    simulator_model="gemini/gemini-1.5-flash",
    judge_model="anthropic/claude-3-haiku-20240307",
    max_turns=20,
)

Or use Ollama for local execution:

options = RunOptions(
    agent_model="ollama_chat/qwen2.5:0.5b",
    simulator_model="ollama_chat/qwen2.5:0.5b",
    judge_model="ollama_chat/qwen2.5:0.5b",
)

In the interactive shell, set models with the set command:

> set agent_model gemini/gemini-1.5-flash
> set simulator_model ollama_chat/qwen2.5:0.5b

For guidance on which model to use for each role, see the Models guide.

Vertex AI

For Vertex AI models that aren't available in the default us-central1 region, set the VERTEXAI_LOCATION environment variable:

export VERTEXAI_LOCATION=global  # needed for e.g. gemini-3.1-flash-lite-preview

See the LiteLLM Vertex AI docs for supported regions.
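
When voicetest is driven from Python rather than a shell, the same variable can be set in-process before any model call is made. A minimal sketch, assuming the variable is read from the process environment at call time (as the export line implies); `ensure_vertex_location` is a hypothetical helper, not part of voicetest or LiteLLM:

```python
import os


def ensure_vertex_location(location: str = "global") -> str:
    """Set VERTEXAI_LOCATION only if nothing has chosen a region yet.

    Returns the value that is in effect afterwards.
    """
    # setdefault leaves an existing value (e.g. one exported in the shell) untouched.
    return os.environ.setdefault("VERTEXAI_LOCATION", location)
```

Call it once at startup, before constructing RunOptions. Using `setdefault` rather than plain assignment means a region exported in the shell still takes priority.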

Run options

Option	Default	Description
`max_turns`	`50`	Maximum conversation turns
`turn_timeout_seconds`	`60.0`	Per-turn timeout in seconds, covering the simulated user turn plus the agent response
`no_cache`	`false`	Bypass LLM response cache
`audio_eval`	`false`	TTS/STT round-trip evaluation
`flow_judge`	`false`	Validate conversation flow
`streaming`	`false`	Stream tokens as LLM generates
`test_model_precedence`	`false`	Test-level model overrides global model
`pattern_engine`	`fnmatch`	Pattern matching engine: `fnmatch` or `re2`
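
The two `pattern_engine` values imply different pattern syntaxes: `fnmatch` is shell-style globbing, while `re2` is a regular-expression engine. The sketch below illustrates only that syntactic difference, using Python's standard `fnmatch` module and the standard `re` module as a stand-in for RE2. How voicetest anchors or applies patterns internally is not covered here, so `matches` is a hypothetical helper and whole-string matching is an assumption:

```python
import fnmatch
import re


def matches(text: str, pattern: str, engine: str = "fnmatch") -> bool:
    """Hypothetical helper: test `text` against `pattern` with the chosen engine."""
    if engine == "fnmatch":
        # Glob syntax: `*` matches any run of characters, `?` exactly one.
        # fnmatchcase is case-sensitive on every platform.
        return fnmatch.fnmatchcase(text, pattern)
    if engine == "re2":
        # Stand-in: stdlib `re` instead of the RE2 library.
        # Assumption: the pattern must match the whole string.
        return re.fullmatch(pattern, text) is not None
    raise ValueError(f"unknown pattern engine: {engine!r}")


text = "Your appointment is confirmed for 3pm"
glob_hit = matches(text, "*confirmed*", engine="fnmatch")
regex_hit = matches(text, r".*confirmed for \d+(am|pm)", engine="re2")
```

Glob patterns are easier to write for simple "contains this phrase" checks. Regular expressions are the better fit when the pattern needs alternation or character classes, as in the `\d+(am|pm)` part above.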

Settings file

Settings are stored in .voicetest/settings.toml:

[models]
agent = "groq/llama-3.1-8b-instant"
simulator = "groq/llama-3.1-8b-instant"
judge = "groq/llama-3.1-8b-instant"

[run]
max_turns = 20
audio_eval = false
streaming = false

[audio]
tts_url = "http://localhost:8002/v1"
stt_url = "http://localhost:8001/v1"

[cache]
cache_backend = "disk"

Section	Keys	Notes
`[models]`	`agent`, `simulator`, `judge`	LiteLLM strings; required for any non-local model
`[run]`	`max_turns`, `audio_eval`, `streaming`, etc.	Defaults for new runs; per-run overrides win
`[audio]`	`tts_url`, `stt_url`	Set when audio eval is enabled
`[cache]`	`cache_backend`, `s3_bucket`, `s3_prefix`, `s3_region`	See Features: LLM response cache

Run voicetest settings to print the active configuration, or voicetest settings --set <key>=<value> to update a value.

Environment variables

Variable	Purpose
`GROQ_API_KEY`	Groq API key; Groq is the default LLM provider for the demo agent and quickstart
`OPENAI_API_KEY`	OpenAI models
`ANTHROPIC_API_KEY`	Anthropic Claude models
`VERTEXAI_LOCATION`	Override the Vertex AI region (default `us-central1`)
`VOICETEST_DB_PATH`	Override DuckDB storage location (default `.voicetest/data.duckdb`)
`RETELL_API_KEY`	Retell platform integration
`VAPI_API_KEY`	VAPI platform integration
`BLAND_API_KEY`	Bland platform integration
`TELNYX_API_KEY`	Telnyx platform integration
`LIVEKIT_API_KEY` + `LIVEKIT_API_SECRET`	LiveKit platform integration

Using Claude Code as your LLM backend

If you have Claude Code installed, you can route LLM calls through your existing Claude subscription instead of configuring a separate API key. See Claude Code Integration.