💥 Absolute
Ultrafast Smart LLM Proxy
Route requests between fast and slow models based on complexity. Simple interactions like "Hello" or "What's 2+2?" go to quick models, while complex reasoning, analysis, and domain-specific questions go to more capable ones. Zero added latency. A drop-in replacement for your OpenAI client.
curl https://your-proxy.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "messages": [
      {"role": "user", "content": "What is the meaning of life?"}
    ],
    "fastModel": "gpt-4o",
    "slowModel": "o1-mini",
    "stream": true
  }'

fastModel names the quick model used for simple queries; slowModel names the more capable model reserved for complex ones.
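Since the proxy speaks the standard OpenAI chat completions API, you can also point an existing SDK client at it. Here is a minimal TypeScript sketch, assuming a deployment at https://your-proxy.com; fastModel and slowModel are the proxy's own routing fields rather than standard OpenAI parameters, so the request body is cast past the SDK's types:

import OpenAI from "openai";

// Point the standard OpenAI SDK at the proxy instead of api.openai.com.
const client = new OpenAI({
  baseURL: "https://your-proxy.com/v1",
  apiKey: process.env.OPENAI_API_KEY,
});

// fastModel/slowModel are proxy-specific fields, so we bypass the SDK's
// parameter types with a cast; the SDK still sends them in the request body.
const stream = await client.chat.completions.create({
  messages: [{ role: "user", content: "What is the meaning of life?" }],
  fastModel: "gpt-4o",
  slowModel: "o1-mini",
  stream: true,
} as any);

// Print streamed tokens as they arrive, whichever model was chosen.
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}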
1. OpenAI API compatible
2. Mixed heuristics for deciding which model to use (see the sketch after this list)
3. Supports any OpenAI-compatible model
4. Open source and private
5. Very fast (hosted on Cloudflare Workers)
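For intuition about item 2, here is an illustrative sketch of how a mixed-heuristic router could score a prompt. This is a hypothetical example, not Absolute's actual logic; the function name and the signals it combines are invented for illustration:

// Hypothetical complexity scorer; NOT the proxy's real implementation.
function pickModel(prompt: string, fastModel: string, slowModel: string): string {
  let score = 0;
  if (prompt.length > 400) score += 1;                        // long prompts lean complex
  if (/\b(prove|analyze|derive|compare|explain why)\b/i.test(prompt)) score += 2; // reasoning verbs
  if ((prompt.match(/\?/g) ?? []).length > 1) score += 1;     // multi-part questions
  if (/```|\bfunction\b|\bclass\b/i.test(prompt)) score += 2; // code-heavy content
  return score >= 2 ? slowModel : fastModel;                  // threshold routes to the slow model
}

// Example: pickModel("What's 2+2?", "gpt-4o", "o1-mini") returns "gpt-4o".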