Show HN: Cascade – A bare-metal C++ proxy that cuts LLM API bills by 70%
A bare-metal C++ AI proxy that predicts prompt complexity in 4.59 milliseconds and dynamically routes traffic to the most cost-effective LLM.
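
To make the routing idea concrete, here is a minimal C++17 sketch of scoring a prompt and picking a model tier from that score. Everything in it is an assumption for illustration: the `Tier` names, the keyword list, the thresholds, and the `complexity_score` heuristic are hypothetical and are not taken from Cascade, whose actual predictor is not described here.

```cpp
#include <string_view>

// Hypothetical model tiers; placeholders, not Cascade's real configuration.
enum class Tier { Small, Medium, Large };

// Toy complexity score (assumption): prompt length plus a fixed bonus for
// each marker phrase that hints at a harder task. A real predictor would
// likely be a trained model; this heuristic only stands in for it.
inline double complexity_score(std::string_view prompt) {
    static constexpr std::string_view hard_markers[] = {
        "prove", "refactor", "step by step", "analyze"};

    double score = static_cast<double>(prompt.size()) / 100.0;
    for (std::string_view marker : hard_markers) {
        if (prompt.find(marker) != std::string_view::npos) {
            score += 2.0;
        }
    }
    return score;
}

// Map the score to the cheapest tier assumed able to handle the prompt.
// The thresholds 1.0 and 3.0 are arbitrary illustrative values.
inline Tier route(std::string_view prompt) {
    const double score = complexity_score(prompt);
    if (score < 1.0) return Tier::Small;
    if (score < 3.0) return Tier::Medium;
    return Tier::Large;
}
```

Under these assumed thresholds, a short factual question goes to `Tier::Small`, while a prompt containing several marker phrases goes to `Tier::Large`. The savings from this kind of routing depend entirely on how well the score tracks the quality each prompt really needs.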