One API. Every model.
Arbitrage and optimize your inference
Quant-trading-grade optimization to make your inference cheaper, faster, and more reliable across models and providers.
Supported model providers
More than a gateway
Flexible and optimal multi-provider inference, handled end to end.
Unified API
Access 300+ models through one integration. Preserve provider-specific features with full fidelity to each provider's API.
Arbitrage
Arbitrage the same model across providers. Route requests to the best provider based on real-time price and performance.
Routing Strategies
Use built-in defaults or define your own routing strategy. Optimize for cost, latency, reliability, or set your own objective.
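To make the idea concrete, here is a minimal sketch of objective-based provider selection. It is not Auriko's implementation: `ProviderQuote`, `OBJECTIVES`, `pick_provider`, and the sample numbers are all hypothetical, and only the `optimize` and `max_ttft_ms` names echo the routing options in the quickstart snippet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderQuote:
    """Hypothetical snapshot of one provider serving a given model."""
    provider: str
    usd_per_million_tokens: float
    ttft_ms: float        # time to first token
    success_rate: float   # fraction of recent requests that succeeded


# Built-in objectives map a quote to a score; lower is better.
OBJECTIVES = {
    "cost": lambda q: q.usd_per_million_tokens,
    "latency": lambda q: q.ttft_ms,
    "reliability": lambda q: -q.success_rate,
}


def pick_provider(quotes, optimize="cost", max_ttft_ms=None):
    """Return the best quote for the objective, honoring an optional latency cap.

    `optimize` is either a built-in objective name or a custom scoring function.
    """
    score = OBJECTIVES[optimize] if isinstance(optimize, str) else optimize
    eligible = [q for q in quotes if max_ttft_ms is None or q.ttft_ms <= max_ttft_ms]
    if not eligible:
        raise ValueError("no provider satisfies the constraints")
    return min(eligible, key=score)


# Made-up quotes for illustration only.
quotes = [
    ProviderQuote("provider-a", 2.50, 180.0, 0.999),
    ProviderQuote("provider-b", 1.90, 320.0, 0.995),
    ProviderQuote("provider-c", 2.10, 150.0, 0.990),
]

cheapest_fast = pick_provider(quotes, optimize="cost", max_ttft_ms=200)
print(cheapest_fast.provider)  # provider-c
```

Passing a function instead of a name, for example `optimize=lambda q: q.usd_per_million_tokens * q.ttft_ms`, is one way a custom objective could be expressed in a sketch like this.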
Predictive Signals
Optimize inference with real-time predictive signals on provider performance, health, and your usage patterns.
Edge Network
Route through a globally distributed edge network with state-of-the-art latency optimization.
Automatic Failover
Deliver continuous uptime. Back every request with redundancy.
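As a rough illustration of what failover means at the request level, the sketch below tries providers in order and returns the first successful response. The names `ProviderError` and `call_with_failover` and the two stand-in providers are assumptions made for this example, not part of any documented Auriko interface.

```python
class ProviderError(Exception):
    """Hypothetical error raised when a provider call fails."""


def call_with_failover(providers, request):
    """Try each (name, call) pair in order; return (name, result) for the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(request)
        except ProviderError as exc:
            # Remember why this provider failed, then move on to the next one.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def flaky(request):
    """Stand-in for a provider that is currently down."""
    raise ProviderError("503 service unavailable")


def healthy(request):
    """Stand-in for a provider that answers normally."""
    return f"echo: {request}"


served_by, result = call_with_failover(
    [("primary", flaky), ("backup", healthy)], "Hello!"
)
print(served_by, result)  # backup echo: Hello!
```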
Key Orchestration
Bring your own keys (BYOK), use platform keys, or both. Maximize key utilization with Auriko's orchestration engine.
Rate Limits
Run inference with capacity awareness across providers and keys. Access Auriko's global capacity reserve for on-demand capacity.
Budget Controls
Set spending limits and alerts at workspace or API key level.
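One way to picture limits and alerts at two levels is the sketch below, where a request is charged against both a workspace budget and an API key budget, and is rejected if either limit would be exceeded. The `Budget` class, `charge_request` function, the 80% alert threshold, and the dollar amounts are hypothetical choices for illustration.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical spending budget for a workspace or a single API key."""
    name: str
    limit_usd: float
    alert_fraction: float = 0.8  # assumed alert threshold, as a share of the limit
    spent_usd: float = 0.0


def charge_request(cost_usd, *budgets):
    """Charge every budget, or none of them if any limit would be exceeded.

    Returns the names of budgets that have reached their alert threshold.
    """
    # Check all limits first so a rejected request leaves every budget untouched.
    for b in budgets:
        if b.spent_usd + cost_usd > b.limit_usd:
            raise RuntimeError(f"{b.name} spending limit exceeded")
    alerts = []
    for b in budgets:
        b.spent_usd += cost_usd
        if b.spent_usd >= b.alert_fraction * b.limit_usd:
            alerts.append(b.name)
    return alerts


workspace = Budget("workspace", limit_usd=100.0)
api_key = Budget("api-key", limit_usd=10.0)

print(charge_request(5.0, workspace, api_key))  # []
print(charge_request(4.0, workspace, api_key))  # ['api-key']
```

A further `charge_request(2.0, workspace, api_key)` would raise, because the API key budget would go past its 10.0 limit, and the workspace total would stay at 9.0.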
Works with agent frameworks
And growing
Start optimizing in seconds
Change a few lines in your code. Then optimization kicks in.
from auriko import Client

client = Client()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    routing={
        "optimize": "cost",
        "max_ttft_ms": 200,
        "data_policy": "zdr",
    }
)

Works with the OpenAI-compatible API.
