LiteLLM proxy connects Alex Sidebar to Amazon Bedrock, Google Vertex AI, and other enterprise AI providers. Teams can use their existing cloud infrastructure without changing their security setup.
For iOS Developers: If your company already pays for AWS Bedrock or Google Cloud AI, this guide shows how to use those models in Xcode with Alex Sidebar instead of buying separate API keys.
What is LiteLLM?
LiteLLM is an open-source proxy that translates between the OpenAI API format and 100+ different AI providers. This lets Alex Sidebar work with enterprise AI services that don't natively support the OpenAI API format.
LiteLLM is what Alex Sidebar uses internally for model connections, making it a well-tested solution for enterprise deployments. Current stable version: v1.73.6-stable (June 2025)
Why Use LiteLLM?
Use Your Company's AI: If your company uses AWS Bedrock or Google Cloud AI, LiteLLM lets you access those models through Alex Sidebar.
Data Never Leaves Your Infrastructure: Your code stays within your company's cloud. No data goes to Alex Sidebar servers.
Track Costs by Project: See exactly how much each project costs. Set budgets and get alerts.
One Interface for All Models: Switch between Claude 4 on Bedrock, Gemini 2.5 on Vertex, or GPT-4 on Azure without changing code.
Quick Start
Install LiteLLM
Choose your deployment method:
Option 1: pip install (simplest)
pip install 'litellm[proxy]'
litellm --model bedrock/anthropic.claude-sonnet-4-20250514-v1:0 --port 4000
Option 2: Docker (recommended for production)
docker run -p 4000:4000 ghcr.io/berriai/litellm:v1.73.6-stable
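To run Docker with your own config file (created in the next step), mount it into the container and pass --config; a sketch, assuming the default image entrypoint:
# Mount a local config.yaml and start the proxy with it
docker run -p 4000:4000 \
  -v $(pwd)/config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:v1.73.6-stable \
  --config /app/config.yaml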
Configure Your Providers
Create a config.yaml file in your LiteLLM directory:
model_list:
  # Amazon Bedrock - Latest Claude 4 Models
  - model_name: "claude-4-sonnet"
    litellm_params:
      model: "bedrock/anthropic.claude-sonnet-4-20250514-v1:0"
      aws_region_name: "us-east-1"
  # Google Vertex AI - Latest Gemini 2.5 Models
  - model_name: "gemini-2.5-pro"
    litellm_params:
      model: "vertex_ai/gemini-2.5-pro"
      vertex_project: "your-gcp-project"
      vertex_location: "us-central1"
  # OpenAI O-Series with Reasoning
  - model_name: "o4-mini"
    litellm_params:
      model: "o4-mini-2025-04-16"
      api_key: "your-openai-key"
  - model_name: "o3-pro"
    litellm_params:
      model: "o3-pro-2025-06-10"
      api_key: "your-openai-key"
Then start the proxy with your config:
litellm --config config.yaml --port 4000
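Before pointing Alex Sidebar at the proxy, you can confirm it answers OpenAI-format requests. The model value is the model_name alias from your config.yaml; the Authorization header is only needed if you configured a key (the one below is a placeholder):
# Send a test chat completion through the proxy
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-litellm-key" \
  -d '{"model": "claude-4-sonnet", "messages": [{"role": "user", "content": "Hello"}]}'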
Connect Alex Sidebar
In Alex Sidebar, add a custom model pointing to your LiteLLM proxy:
Open Settings → Models → Custom Models
Click “Add New Model”
Configure:
Model ID: Your model name from config.yaml (e.g., claude-4-sonnet)
Base URL: Your LiteLLM URL + /v1 (e.g., https://litellm.company.com/v1)
API Key: Your LiteLLM proxy key (if configured)
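If you are unsure which Model IDs the proxy exposes, list them via the OpenAI-compatible models endpoint (URL and key below are placeholders):
# List the model_name aliases the proxy is serving
curl https://litellm.company.com/v1/models \
  -H "Authorization: Bearer sk-your-litellm-key"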
Provider-Specific Setup
Amazon Bedrock
Ensure your AWS credentials are configured on the LiteLLM server
Enable the models you need in the AWS Bedrock console
Add to your LiteLLM config:
model_list:
  # Latest Claude 4 Models
  - model_name: "claude-4-opus"
    litellm_params:
      model: "bedrock/anthropic.claude-opus-4-20250514-v1:0"
      aws_region_name: "us-east-1"
  - model_name: "claude-4-sonnet"
    litellm_params:
      model: "bedrock/anthropic.claude-sonnet-4-20250514-v1:0"
      aws_region_name: "us-east-1"
  # Reasoning and Thinking Support
  - model_name: "claude-4-sonnet-reasoning"
    litellm_params:
      model: "bedrock/anthropic.claude-sonnet-4-20250514-v1:0"
      aws_region_name: "us-east-1"
      thinking: true
  # Latest Llama 4 Models
  - model_name: "llama4-70b"
    litellm_params:
      model: "bedrock/meta.llama4-70b-instruct-v1:0"
      aws_region_name: "us-east-1"
  # DeepSeek R1 Models
  - model_name: "deepseek-r1"
    litellm_params:
      model: "bedrock/deepseek.deepseek-r1-distill-llama-70b"
      aws_region_name: "us-east-1"
LiteLLM supports multiple AWS authentication methods:
IAM roles (recommended for EC2/ECS)
Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
AWS profiles
Temporary credentials with STS
Workload Identity Federation (for cross-cloud deployments)
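For a quick local setup with environment variables (IAM roles are the better choice on EC2/ECS; the key values and region here are placeholders, and AWS_REGION_NAME is the variable LiteLLM reads):
# Export placeholder AWS credentials, then start the proxy
export AWS_ACCESS_KEY_ID="your-access-key-id"
export AWS_SECRET_ACCESS_KEY="your-secret-access-key"
export AWS_REGION_NAME="us-east-1"
litellm --config config.yaml --port 4000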
Google Vertex AI
Enable the Vertex AI API in your GCP project
Set up authentication (service account recommended)
Add to your LiteLLM config:
model_list:
  # Latest Gemini 2.5 Models
  - model_name: "gemini-2.5-pro"
    litellm_params:
      model: "vertex_ai/gemini-2.5-pro"
      vertex_project: "your-project-id"
      vertex_location: "us-central1"
  - model_name: "gemini-2.5-flash"
    litellm_params:
      model: "vertex_ai/gemini-2.5-flash"
      vertex_project: "your-project-id"
      vertex_location: "us-central1"
  # Claude 4 on Vertex AI
  - model_name: "claude-4-vertex"
    litellm_params:
      model: "vertex_ai/claude-opus-4@20250514"
      vertex_project: "your-project-id"
      vertex_location: "us-central1"
  # Multimodal Image Generation
  - model_name: "imagen-4"
    litellm_params:
      model: "vertex_ai/imagen-4"
      vertex_project: "your-project-id"
      vertex_location: "us-central1"
Set up authentication using one of these methods:
Service account JSON file: export GOOGLE_APPLICATION_CREDENTIALS=path/to/key.json
Workload Identity (for GKE)
Application Default Credentials (ADC)
Authorized user credentials
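For the service-account route, a minimal sketch (the account name and project are placeholders; the account needs the Vertex AI User role):
# Create a key for an existing service account
gcloud iam service-accounts keys create vertex-key.json \
  --iam-account=litellm@your-gcp-project.iam.gserviceaccount.com
# Point LiteLLM at the key and start the proxy
export GOOGLE_APPLICATION_CREDENTIALS=./vertex-key.json
litellm --config config.yaml --port 4000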
Azure OpenAI
Configure Azure OpenAI with the latest models:
model_list:
  # Latest O-Series Models
  - model_name: "o4-mini"
    litellm_params:
      model: "azure/o4-mini-2025-04-16"
      api_base: "https://your-resource.openai.azure.com"
      api_key: "your-azure-key"
      api_version: "2025-01-01-preview"
  - model_name: "o3-pro"
    litellm_params:
      model: "azure/o3-pro-2025-06-10"
      api_base: "https://your-resource.openai.azure.com"
      api_key: "your-azure-key"
      api_version: "2025-01-01-preview"
  # GPT-4o with Audio Preview
  - model_name: "gpt-4o-audio"
    litellm_params:
      model: "azure/gpt-4o-audio-preview-2025-06-03"
      api_base: "https://your-resource.openai.azure.com"
      api_key: "your-azure-key"
      api_version: "2025-01-01-preview"
Azure supports multiple authentication methods:
API keys
Azure AD tokens
Managed Identity
Certificate-based authentication
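Rather than hard-coding secrets in config.yaml, LiteLLM can read any value from the environment with the os.environ/ prefix. A minimal sketch, assuming AZURE_API_KEY is exported on the proxy host:
model_list:
  - model_name: "o4-mini"
    litellm_params:
      model: "azure/o4-mini-2025-04-16"
      api_base: "https://your-resource.openai.azure.com"
      api_key: "os.environ/AZURE_API_KEY" # read from the environment at startup
      api_version: "2025-01-01-preview"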
Advanced Features
Reasoning and Thinking Capabilities
Enable advanced reasoning for supported models:
model_list:
  - model_name: "claude-4-reasoning"
    litellm_params:
      model: "anthropic/claude-sonnet-4-20250514"
      thinking: true
  - model_name: "o3-pro-reasoning"
    litellm_params:
      model: "o3-pro-2025-06-10"
      reasoning_effort: "high"
  - model_name: "o4-mini-reasoning"
    litellm_params:
      model: "o4-mini-2025-04-16"
      reasoning_effort: "medium"
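Reasoning parameters can also be supplied per request rather than fixed in the config; a sketch that asks the o4-mini-reasoning alias above for high effort on a single call (URL and key are placeholders):
# Override reasoning_effort for one request
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-litellm-key" \
  -d '{
    "model": "o4-mini-reasoning",
    "reasoning_effort": "high",
    "messages": [{"role": "user", "content": "Outline the migration steps"}]
  }'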
Multimodal Support
Configure models for text, image, audio, and video:
model_list:
  # Vision + Audio Models
  - model_name: "gpt-4o-multimodal"
    litellm_params:
      model: "gpt-4o"
      supports_vision: true
      supports_audio: true
  # Gemini with Enhanced Multimodal
  - model_name: "gemini-2.5-multimodal"
    litellm_params:
      model: "vertex_ai/gemini-2.5-pro"
      supports_vision: true
      supports_pdf_input: true
      supports_audio: true
MCP Gateway Integration
Enable Model Context Protocol for enhanced tool use:
general_settings:
  enable_mcp_gateway: true
  mcp_servers:
    - server_name: "filesystem"
      server_command: ["uvx", "mcp-server-filesystem", "/path/to/allowed/files"]
    - server_name: "jira"
      server_command: ["node", "/path/to/jira-mcp-server"]
      auth_type: "api_key"
      auth_value: "your-jira-api-key"
Team Configuration
For team accounts, you can override all Alex Sidebar model endpoints:
Go to Alex Sidebar Admin Portal
Navigate to Models tab
Add your LiteLLM proxy URL as Base URL for each model type
All team members automatically use your proxy
All AI requests from your team go through your infrastructure. You control the data and costs.
See the Team Configuration Guide for detailed instructions on managing team models.
Advanced Configuration
Load Balancing with Fallbacks
Distribute requests across multiple model deployments with intelligent routing:
model_list:
  - model_name: "claude-4-primary"
    litellm_params:
      model: "bedrock/anthropic.claude-sonnet-4-20250514-v1:0"
      aws_region_name: "us-east-1"
  - model_name: "claude-4-fallback"
    litellm_params:
      model: "anthropic/claude-sonnet-4-20250514"
      api_key: "fallback-key"
router_settings:
  routing_strategy: "least-busy" # or "round-robin", "weighted-round-robin"
  model_group_alias: "claude-4"
  fallbacks: [{"claude-4-primary": ["claude-4-fallback"]}]
  cooldown_time: 60 # seconds before retrying a failed deployment
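With this in place, clients keep calling the same endpoint; if the primary deployment fails or is cooling down, the router retries the fallback per the fallbacks mapping. A quick sanity check (endpoint and key are placeholders):
# Requests to claude-4-primary fail over to claude-4-fallback automatically
curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-litellm-key" \
  -d '{"model": "claude-4-primary", "messages": [{"role": "user", "content": "ping"}]}'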
Cost Tracking and Budget Management
Enable comprehensive cost tracking:
general_settings:
  master_key: "your-secret-key"
  database_url: "postgresql://user:pass@localhost:5432/litellm"
# Budget controls
litellm_settings:
  max_budget: 1000 # Monthly budget in USD
  budget_duration: "monthly" # daily, weekly, monthly
  success_callback: ["langfuse", "prometheus"]
  track_cost_callback: true
# User-level budgets
user_api_key_config:
  user1:
    budget_duration: "monthly"
    max_budget: 100
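Per-developer keys with individual budgets can be minted from LiteLLM's key-management API. A sketch using the master key configured above ("30d" is LiteLLM's duration shorthand; the model list restricts what the key can call):
# Generate a scoped key with a $100 budget that resets every 30 days
curl http://localhost:4000/key/generate \
  -H "Authorization: Bearer your-secret-key" \
  -H "Content-Type: application/json" \
  -d '{"max_budget": 100, "budget_duration": "30d", "models": ["claude-4-sonnet"]}'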
Security and Rate Limiting
Secure your LiteLLM deployment with advanced controls:
general_settings:
  master_key: "sk-your-secret-key"
  # Enhanced security
  allowed_ips: ["10.0.0.0/8", "172.16.0.0/12"]
  disable_spend_logs: false
  guardrails: ["presidio_pii", "bedrock_guardrails"]
  # Advanced rate limiting
  max_parallel_requests: 1000
  max_request_per_minute: 10000
  rate_limiting_strategy: "sliding-window" # new, more accurate rate limiting
# SCIM integration for enterprise SSO
scim_settings:
  enabled: true
  scim_base_url: "https://your-litellm.com/scim/v2"
Vector Store Integration
Connect to knowledge bases and vector stores:
vector_stores:
  - store_name: "company_docs"
    store_type: "bedrock_knowledge_base"
    knowledge_base_id: "your-kb-id"
    aws_region: "us-east-1"
  - store_name: "technical_docs"
    store_type: "pinecone"
    api_key: "your-pinecone-key"
    environment: "your-environment"
# Auto-activate for specific models
model_list:
  - model_name: "claude-4-with-kb"
    litellm_params:
      model: "bedrock/anthropic.claude-sonnet-4-20250514-v1:0"
      vector_store: "company_docs"
Monitoring & Observability
LiteLLM v1.73.6 provides enhanced monitoring capabilities:
2x Higher RPS: Enhanced aiohttp transport for improved performance
50ms Median Latency: Optimized for high-throughput applications
Multi-instance Rate Limiting: Accurate rate limiting across deployments
Dashboard Features
general_settings:
  database_url: "postgresql://user:pass@localhost:5432/litellm"
  ui_access_mode: "admin_only" # or "all"
# Enhanced logging
litellm_settings:
  success_callback: ["langfuse", "prometheus", "datadog"]
  failure_callback: ["slack", "pagerduty"]
  # Session tracking
  session_config:
    enable_session_logs: true
    session_retention_days: 30
Real-time Monitoring
# Prometheus metrics
prometheus_settings:
  enabled: true
  track_end_users: false # end-user tracking is opt-in to prevent large metric sets
# Health checks
health_check:
  enabled: true
  check_interval: 300 # seconds
  models_to_check: ["claude-4-sonnet", "gemini-2.5-pro", "o4-mini", "o3-pro"]
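You can also spot-check model health from the command line via the proxy's /health endpoint (shown with a placeholder master key, which admin routes require):
# Report per-model health as seen by the proxy
curl http://localhost:4000/health \
  -H "Authorization: Bearer sk-your-secret-key"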
Troubleshooting
Cannot connect to the proxy
Verify LiteLLM is running and accessible
Check firewall rules and security groups
Ensure you're using the correct URL format with the /v1 suffix
For Docker: check port mapping and container status
Authentication errors
Bedrock: Verify AWS credentials, IAM permissions, and model access
Vertex: Check service account permissions and project settings
Azure: Ensure API keys and resource endpoints are correct
Verify the master key matches if configured
Model not found or deprecated
Check model name matches exactly in config.yaml
Verify the model is enabled in your cloud provider console
Update to latest model versions (e.g., claude-4 instead of claude-3)
Check region/location settings and model availability
High latency or rate limiting
Enable aiohttp transport: USE_AIOHTTP_TRANSPORT=True
Implement load balancing across multiple deployments
Adjust max_parallel_requests and rate limiting settings
Consider regional deployment distribution
Cost tracking not working
Ensure the database connection is properly configured
Check that track_cost_callback: true is set
Verify model pricing information is up to date
Review spend log retention settings
Common Use Cases for iOS Teams
Scenario 1: Company Uses AWS with Latest Models
Your company has AWS Bedrock with Claude 4 models. Instead of buying Anthropic API keys:
Deploy LiteLLM v1.73.6-stable on an EC2 instance
Configure it to use your Bedrock Claude 4 models with reasoning capabilities
Developers connect Alex Sidebar to your LiteLLM endpoint
All costs go to your AWS bill with detailed tracking
Scenario 2: Multi-Cloud Model Testing
Test latest models across providers without changing code:
model_list:
  - model_name: "best-reasoning"
    litellm_params:
      model: "o3-pro-2025-06-10" # OpenAI's latest reasoning model
      reasoning_effort: "high"
  - model_name: "best-multimodal"
    litellm_params:
      model: "vertex_ai/gemini-2.5-pro" # Google's latest multimodal model
  - model_name: "best-coding"
    litellm_params:
      model: "bedrock/anthropic.claude-sonnet-4-20250514-v1:0" # Claude 4 for coding
Scenario 3: Development vs Production with New Models
# Dev environment - use efficient models
- model_name: "dev-model"
  litellm_params:
    model: "bedrock/anthropic.claude-3-haiku-20240307-v1:0"
    max_tokens: 1000
# Staging - test new capabilities
- model_name: "staging-model"
  litellm_params:
    model: "vertex_ai/gemini-2.5-flash"
# Production - use the most capable models
- model_name: "prod-model"
  litellm_params:
    model: "bedrock/anthropic.claude-opus-4-20250514-v1:0"
    thinking: true
Scenario 4: Enterprise Security and Compliance
general_settings:
  master_key: "sk-your-secure-key"
  guardrails: ["presidio_pii", "bedrock_guardrails"]
# PII masking and content filtering
guardrail_settings:
  presidio:
    mask_entities: ["PERSON", "EMAIL", "PHONE_NUMBER"]
    block_entities: ["MEDICAL_LICENSE"]
  bedrock_guardrails:
    guardrail_id: "your-guardrail-id"
    guardrail_version: "1"
Enterprise Features (LiteLLM v1.73.6+)
SCIM Integration
Automatic user provisioning from your identity provider:
Okta, Azure AD, OneLogin support
Automatic team creation and user assignment
Deprovisioning when users are removed
Advanced Analytics
Team and tag-based usage tracking
Daily spend analysis by model and user
Session grouping and analysis
Audit logs for compliance
Enhanced Security
Vector store permissions by team/user
MCP server access controls
IP allowlisting and rate limiting
End-to-end encryption options
Next Steps
LiteLLM v1.73.6-stable gives you control over your AI infrastructure with the latest models and enterprise-grade features, working seamlessly with all Alex Sidebar capabilities.