Discussed on DEV

We run LiteLLM as our AI gateway. 100+ providers, one OpenAI-compatible API. It works, it scales, we like it. But after a year of pushing traffic through the Python proxy, one thing kept bugging us: memory. Under concurrent load, the Python proxy peaks around 359 MB. Multiply that across pods, regions, retries. OOM kills at the worst possible time. You know the feeling.

LiteLLM just announced they're migrating the entire hot path to Rust. Not a rewrite. Not a v2. Same config.yaml, same databas...
