MiniMax teases upcoming M3 model with new sparse attention mechanism and 15.6X response speed boost
Among the many Chinese AI companies and laboratories vying for market share and attention (no pun intended) in the global marketplace, MiniMax, maker of the popular M2 series of language models, is again raising the eyebrows of AI power users and developers around the world by teasing its upcoming M3 model with a new sparse attention mechanism, which it says yields up to 15.6 times faster decoding (or LLM response) speed by adopting a custom sub-quadratic framework. In so doing, MiniMax has designed M3 to make ultra-long-context ...