jd-opensource/xllm: A high-performance inference engine for LLMs, optimized for diverse AI accelerators.