TL;DR
Claude Code is a highly capable terminal-based AI coding agent, but it defaults to a single provider. Bifrost, an open-source AI gateway, removes that limitation by enabling routing across 1000+ models. Bifrost CLI lets you launch any coding agent through Bifrost with a single command, with no environment variables or manual configuration needed. It supports mid-session model switching, automatic failover, and expression-based routing. This guide covers installation, multi-model configuration, dynamic switching, cloud passthrough, and routing rules powered by CEL.
Why Multi-Model Routing Matters for Claude Code
Claude Code brings AI-assisted development directly into the terminal, handling file edits, command execution, and complex reasoning tasks. However, relying on a single provider in production introduces challenges such as rate limits, outages, unpredictable costs, and limited flexibility in matching models to specific tasks.
Bifrost addresses these limitations by acting as a unified AI gateway. It connects to 1000+ models through a single API layer. Routing Claude Code through Bifrost enables failover across providers, flexible model selection, mid-session switching, and governance controls, all without requiring changes to Claude Code itself.
Setting Up Bifrost CLI With Claude Code
Getting started is quick. Install and run Bifrost locally:
```bash
npx -y @maximhq/bifrost
```
Next, launch the Bifrost CLI in a separate terminal:
```bash
npx -y @maximhq/bifrost-cli
```
Next, configure Claude Code to use the Bifrost Anthropic endpoint:
```bash
export ANTHROPIC_BASE_URL=http://localhost:8080/anthropic
```
If you are using API key authentication, also set:
```bash
export ANTHROPIC_API_KEY=your-api-key
```
For Claude Pro or Max users, authentication happens automatically through browser OAuth. After setting the base URL, running claude will open a login window and route all traffic through Bifrost.
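For the manual (non-CLI) path, the steps above combine into the following sequence. This is a sketch: the port assumes Bifrost's default of 8080 used throughout this guide, and the API key line is omitted for Pro/Max users who authenticate via browser OAuth.

```shell
# Terminal 1: start the Bifrost gateway (listens on port 8080 by default)
npx -y @maximhq/bifrost

# Terminal 2: point Claude Code at the gateway, then launch it
export ANTHROPIC_BASE_URL=http://localhost:8080/anthropic
export ANTHROPIC_API_KEY=your-api-key   # skip this line if using Pro/Max OAuth
claude
```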
For Teams and Enterprise users, the process remains the same. Team Premium defaults to Opus, while Team Standard uses Sonnet.
Overriding Model Tiers for Multi-Provider Routing
Claude Code organizes models into three tiers: Sonnet for general use, Opus for advanced reasoning, and Haiku for lightweight tasks. With Bifrost CLI, you can select any model from any provider when launching — using the provider/model-name format — and the CLI maps it to the appropriate tier automatically.
If you prefer manual configuration, you can still set environment variables directly:
```bash
export ANTHROPIC_DEFAULT_SONNET_MODEL="openai/gpt-5"
export ANTHROPIC_DEFAULT_OPUS_MODEL="anthropic/claude-opus-4-5-20251101"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="groq/llama-3.3-70b-versatile"
```
Bifrost automatically translates Anthropic API requests into the appropriate format for other providers. No SDK changes are required. The only requirement is that the selected model must support tool use, since Claude Code depends on tool calling for key operations like file handling and command execution.
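To make the translation concrete, here is an illustrative sketch of what Claude Code's requests look like on the wire. The payload follows the Anthropic Messages API shape, which Bifrost accepts for any provider; the tool definition and model names below are examples of our own, not Bifrost internals. Only the model field changes when you switch providers.

```python
# Sketch: one Anthropic-format payload can target any provider through
# Bifrost by changing the "model" field to the provider/model-name format.
# A tool definition is included because Claude Code depends on tool
# calling for file operations and command execution.

def build_messages_request(model: str, prompt: str) -> dict:
    """Build an Anthropic Messages API payload with a sample tool."""
    return {
        "model": model,  # e.g. "openai/gpt-5" or "groq/llama-3.3-70b-versatile"
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [
            {
                "name": "run_command",
                "description": "Execute a shell command",
                "input_schema": {
                    "type": "object",
                    "properties": {"command": {"type": "string"}},
                    "required": ["command"],
                },
            }
        ],
    }

# The identical payload structure works across providers.
for model in ("anthropic/claude-sonnet-4-5", "openai/gpt-5"):
    req = build_messages_request(model, "List files in the current directory")
    print(model, "->", req["tools"][0]["name"])
```

The point of the sketch is that nothing except the model string is provider-specific; Bifrost handles the format translation behind the `/anthropic` endpoint.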
Switching Models Mid-Session
You can switch models during an active session without restarting Claude Code. From the Bifrost CLI tab bar (Ctrl+B), open a new tab and select a different model at the summary screen. Alternatively, use the /model command inside Claude Code to dynamically change providers:
```bash
/model vertex/claude-haiku-4-5
/model azure/claude-sonnet-4-5
/model openai/gpt-5
/model mistral/mistral-large-latest
```
Running /model without arguments shows the current model. Switching happens instantly, and your conversation context is preserved. This makes it easy to start with a fast, low-cost model for simpler tasks and move to a more powerful model when deeper reasoning is required.
Cloud Provider Passthrough
For teams using AWS, GCP, or Azure, Bifrost simplifies authentication and routing. When using Bifrost CLI, select your cloud-hosted model from the model list and the CLI configures the correct provider path automatically.
For manual setup:
Amazon Bedrock:
```bash
export CLAUDE_CODE_USE_BEDROCK=1
export ANTHROPIC_BEDROCK_BASE_URL=http://localhost:8080/bedrock
export CLAUDE_CODE_SKIP_BEDROCK_AUTH=1
```
Google Vertex AI:
```bash
export CLAUDE_CODE_USE_VERTEX=1
export ANTHROPIC_VERTEX_BASE_URL=http://localhost:8080/genai
export CLAUDE_CODE_SKIP_VERTEX_AUTH=1
```
Azure does not provide native passthrough for Claude Code, but you can still route requests through the Bifrost Anthropic endpoint and let Bifrost handle Azure-based model routing. When working with cloud providers, always pin specific model versions using ANTHROPIC_DEFAULT_*_MODEL to avoid issues with aliases.
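The per-cloud exports above can be wrapped in a small helper so a session can be switched between passthroughs with one call. The function name is our own; the variables and endpoint paths are exactly the ones documented above.

```shell
# Hypothetical helper (our naming) applying the passthrough variables
# from this section for a given cloud provider.
use_cloud() {
  case "$1" in
    bedrock)
      export CLAUDE_CODE_USE_BEDROCK=1
      export ANTHROPIC_BEDROCK_BASE_URL=http://localhost:8080/bedrock
      export CLAUDE_CODE_SKIP_BEDROCK_AUTH=1
      ;;
    vertex)
      export CLAUDE_CODE_USE_VERTEX=1
      export ANTHROPIC_VERTEX_BASE_URL=http://localhost:8080/genai
      export CLAUDE_CODE_SKIP_VERTEX_AUTH=1
      ;;
    *)
      echo "usage: use_cloud bedrock|vertex" >&2
      return 1
      ;;
  esac
}
```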
Dynamic Routing With CEL Expressions
For advanced routing scenarios, Bifrost supports expression-based routing rules using Common Expression Language (CEL). These rules are evaluated at runtime before provider selection, allowing precise control over request routing.
For example, you can redirect traffic to a lower-cost provider when budget usage exceeds 85 percent:
```json
{
  "name": "Budget Overflow Route",
  "cel_expression": "budget_used > 85",
  "targets": [
    { "provider": "groq", "model": "llama-2-70b", "weight": 1 }
  ],
  "scope": "global",
  "priority": 5
}
```
You can also split traffic across providers for testing:
```json
{
  "targets": [
    { "provider": "openai", "model": "gpt-4o", "weight": 0.7 },
    { "provider": "groq", "model": "llama-3.1-70b", "weight": 0.3 }
  ]
}
```
Rules follow a hierarchy of scopes, from Virtual Key to Team to Customer to Global, and are evaluated in order of priority. The first matching rule is applied. If no rule matches, the request proceeds with its original provider and model. CEL expressions can reference headers, model names, team identifiers, budget metrics, and token usage.
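The evaluation semantics can be simulated in a few lines. This is an illustrative model of the behavior described above, not Bifrost's internal API: rules are ordered by scope specificity and priority, the first matching rule wins, and an unmatched request keeps its original provider and model. A Python lambda stands in for CEL evaluation.

```python
# Scope specificity: Virtual Key is most specific, Global least.
SCOPE_ORDER = {"virtual_key": 0, "team": 1, "customer": 2, "global": 3}

def route(request: dict, rules: list) -> dict:
    """Apply the first matching rule, ordered by scope then priority."""
    ordered = sorted(rules, key=lambda r: (SCOPE_ORDER[r["scope"]], r["priority"]))
    for rule in ordered:
        if rule["matches"](request):      # stand-in for CEL evaluation
            return rule["targets"][0]     # first matching rule is applied
    # No rule matched: request proceeds with its original provider/model.
    return {"provider": request["provider"], "model": request["model"]}

# The budget-overflow rule from the JSON example above.
rules = [
    {
        "scope": "global",
        "priority": 5,
        "matches": lambda req: req["budget_used"] > 85,
        "targets": [{"provider": "groq", "model": "llama-2-70b"}],
    },
]

over = {"provider": "anthropic", "model": "claude-sonnet-4-5", "budget_used": 90}
under = {"provider": "anthropic", "model": "claude-sonnet-4-5", "budget_used": 40}
print(route(over, rules))   # rerouted to groq
print(route(under, rules))  # keeps the original provider and model
```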
Observability Out of the Box
Bifrost logs every request by default. You can view Claude Code interactions at http://localhost:8080/logs, with filters for provider, model, and content.
For production use, Bifrost includes Prometheus metrics, OTLP tracing compatible with tools like Grafana and Honeycomb, and detailed request logging. When combined with Maxim’s observability platform, teams gain full visibility into agent behavior across providers, along with the ability to run automated evaluations on real-world traces.
Key Considerations
There are a few important points to keep in mind when using Claude Code with Bifrost CLI:
Tool use support is essential. Claude Code depends heavily on tool calling. Models that lack proper support will fail on file operations, terminal commands, and other core tasks.
Some Claude-specific features are limited. Capabilities such as extended thinking, web search, computer use, and citations are not available with non-Anthropic models. Core features like chat, streaming, and tool use generally remain supported.
Streaming behavior varies across providers. According to the Bifrost documentation, some providers such as OpenRouter may not stream function call arguments correctly, which can result in empty tool call inputs. In such cases, switching providers within your configuration is recommended.
Wrapping Up
Using Claude Code with Bifrost CLI turns a single-provider development tool into a flexible, multi-model system with built-in failover, governance, and observability. Setup is simple, model switching is immediate, and routing rules can scale from basic overrides to complex traffic management strategies.
For teams building AI-powered systems, combining Bifrost with Maxim’s evaluation and observability platform provides a complete workflow for deploying reliable applications. You can intelligently route requests, monitor performance, and continuously evaluate outcomes.
To begin, explore Bifrost on GitHub or follow the Claude Code integration guide.
