Sors: a Rust proxy that reorders prompts to maximize vLLM prefix cache hits
Minimal proxy that reorders LLM prompts to maximize prefix cache hits - flouthoc/sors
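To make the idea concrete, here is a minimal sketch of why ordering matters for prefix reuse. It is not taken from the sors source. It assumes the simplest possible strategy, a lexicographic sort, which places prompts that share a leading string next to each other. The function names `common_prefix_len`, `reorder_for_prefix_reuse`, and `adjacent_prefix_overlap` are hypothetical, and the overlap is counted in bytes, whereas a real prefix cache such as vLLM's works on tokens.

```rust
/// Length in bytes of the longest common prefix of two strings.
fn common_prefix_len(a: &str, b: &str) -> usize {
    a.bytes().zip(b.bytes()).take_while(|(x, y)| x == y).count()
}

/// Hypothetical helper (not from sors): sort prompts lexicographically so
/// that prompts sharing a prefix become neighbours in the batch.
fn reorder_for_prefix_reuse(mut prompts: Vec<String>) -> Vec<String> {
    prompts.sort();
    prompts
}

/// Sum of shared-prefix lengths between consecutive prompts.
/// Used here only as a rough stand-in for how much cached prefix
/// each request could reuse from the one before it.
fn adjacent_prefix_overlap(prompts: &[String]) -> usize {
    prompts
        .windows(2)
        .map(|pair| common_prefix_len(&pair[0], &pair[1]))
        .sum()
}
```

Calling `reorder_for_prefix_reuse` on a batch and comparing `adjacent_prefix_overlap` before and after shows the effect: prompts that begin with the same system text end up adjacent, so consecutive requests share more of their prefix. Sorting is only one possible heuristic; the actual project may use a different one.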
Read the original article