NeMo out, GGUF in: how parakeet.cpp ports NVIDIA ASR to C++ (opens in new tab)
NVIDIA's Parakeet speech models used to mean a Python stack: NeMo, PyTorch, and a GPU you kept warm. A new C++ port collapses that to one binary and one file. From NeMo to GGUF: What the Port Covers parakeet.cpp is a C++17 inference port that runs NVIDIA's Parakeet automatic speech recognition (ASR) models on the ggml tensor library — the same engine behind whisper.cpp and llama.cpp — with no Python, no NeMo, and no ONNX at inference time. The project is maintained by Ettore Di Giacinto (@mud...
Read the original article