RT by @awnihannun:
ExecuTorch now has an MLX delegate that runs PyTorch models on Apple Silicon GPUs. It supports LLMs, speech-to-text, and MoE models with quantization via TorchAO. Export with torch.export, run on Metal. Read our latest blog: pytorch.org/blog/running-pyt…