Modal Auto Endpoints: Optimized inference you own
LLM inference at SotA speeds and Modal quality, now available to everyone.
Read the original article