Hello Hacker News,
Stop relying on benchmarks and test LLMs directly in production.
Try it here: https://admin.litellm.ai/
LiteLLM lets you call any LLM as a drop-in replacement for gpt-3.5-turbo.
We're launching `completion_with_split_tests` to easily A/B test all LLMs.
Example usage - 1 function:
completion_with_split_tests(
    models={
        "claude-2": 0.4,
        "gpt-3.5-turbo": 0.6
    },
    messages=messages,
    temperature=temperature
)
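The traffic split works like a weighted random draw over the models dict. Here is a minimal, self-contained sketch of that mechanism (plain Python, not LiteLLM's internals; `pick_model` is a hypothetical name for illustration):

```python
import random

def pick_model(models, seed=None):
    # Weighted random choice over a traffic-split dict,
    # e.g. {"claude-2": 0.4, "gpt-3.5-turbo": 0.6}.
    rng = random.Random(seed)
    names = list(models)
    weights = [models[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

split = {"claude-2": 0.4, "gpt-3.5-turbo": 0.6}
counts = {name: 0 for name in split}
for i in range(10000):
    counts[pick_model(split, seed=i)] += 1
# Over many calls, traffic lands roughly 40/60 across the two models.
```

Each individual request still hits exactly one model; only the aggregate traffic follows the configured split.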
For each completion call we allow you to:
- Control/modify LLM configs (prompt, temperature, max_tokens, etc.) without needing to edit code
- Easily swap in/out 100+ LLMs without redeploying code
- View Input/Outputs for each LLM on our UI
- Retry requests with an alternate LLM
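The retry-with-an-alternate-LLM idea above can be sketched as a simple fallback loop. This is a generic illustration, not LiteLLM's actual implementation; `completion_with_fallback` and `fake_call` are hypothetical names, with a stub standing in for a real completion call:

```python
def completion_with_fallback(call_fn, models, prompt):
    # Try each model in order; on failure, retry with the next one.
    # `call_fn(model, prompt)` stands in for a real completion call.
    errors = {}
    for model in models:
        try:
            return model, call_fn(model, prompt)
        except Exception as exc:
            errors[model] = exc
    raise RuntimeError(f"All models failed: {errors}")

# Simulated backend where the first model errors out.
def fake_call(model, prompt):
    if model == "claude-2":
        raise TimeoutError("upstream timeout")
    return f"{model} says hi"

model, text = completion_with_fallback(
    fake_call, ["claude-2", "gpt-3.5-turbo"], "hello"
)
# The failed claude-2 call falls through to gpt-3.5-turbo.
```

Because every provider is called through one uniform interface, the fallback list can mix models from different providers without extra glue code.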
Happy completion()!