RT by @awnihannun: anemll-flash-mlx repo is up!
anemll-flash-mlx repo is up! Simple toolkit to speed up Flash-MoE experiments on MLX. Let MLX do what it does best - dense inference in memory. We only optimize the MoE part: stable slot-bank + SSD streaming, clean hit/miss separation, no per-token expert materialization. Hackable, focused, and easy to extend to other models (Qwen 3.5 MoEs work great). → github.com/Anemll/anemll-fla… #FlashMoE
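For readers unfamiliar with the idea, here is a minimal sketch of what a fixed slot bank with hit/miss separation could look like. It is not taken from the repo: `SlotBank`, `load_expert`, and the LRU eviction policy are all assumptions made for illustration, and plain Python objects stand in for MLX arrays and SSD reads.

```python
from collections import OrderedDict
from typing import Callable, Dict, List, Tuple


class SlotBank:
    """Fixed pool of expert slots with LRU replacement (illustrative only)."""

    def __init__(self, num_slots: int, load_expert: Callable[[int], object]):
        self.num_slots = num_slots
        # Hypothetical loader: stands in for streaming one expert's weights from SSD.
        self.load_expert = load_expert
        # Slots are allocated once and reused; the list is never resized.
        self.slots: List[object] = [None] * num_slots
        # expert_id -> slot index, ordered from least to most recently used.
        self.resident: "OrderedDict[int, int]" = OrderedDict()

    def lookup(self, expert_ids: List[int]) -> Tuple[List[int], List[int]]:
        """Split the requested experts into hits (resident) and misses."""
        unique = list(dict.fromkeys(expert_ids))  # dedupe, keep first-seen order
        hits = [e for e in unique if e in self.resident]
        misses = [e for e in unique if e not in self.resident]
        return hits, misses

    def ensure(self, expert_ids: List[int]) -> Dict[int, int]:
        """Make every requested expert resident; return expert_id -> slot index."""
        hits, misses = self.lookup(expert_ids)
        if len(hits) + len(misses) > self.num_slots:
            raise ValueError("more distinct experts requested than slots")
        for e in hits:
            self.resident.move_to_end(e)  # protect current hits from eviction
        for e in misses:
            if len(self.resident) < self.num_slots:
                slot = len(self.resident)  # next never-used slot
            else:
                _, slot = self.resident.popitem(last=False)  # evict the LRU expert
            self.slots[slot] = self.load_expert(e)  # only misses touch the loader
            self.resident[e] = slot
        return {e: self.resident[e] for e in hits + misses}
```

In this sketch the caller gets back slot indices and would gather from the shared bank, rather than building a separate copy of expert weights for each token; only the miss list ever reaches the loader.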