Building a Voice AI Platform with 28 Modules in Python
What I Built
Omni-VRAM is an open-source voice AI platform with 28 modules.
GitHub:

Features
Speech Recognition: Whisper with 5 backends (faster-whisper, whisper.cpp, ONNX, TensorRT, OpenAI API)
Real-time Streaming: <200 ms latency
Speaker Diarization: who spoke when
Emotion Recognition: 6 emotions
TTS Synthesis: Edge-TTS + pyttsx3
Chinese Processing: punctuation, tokenization, dialects
Meeting Assistant: automatic summarization with LLM
APIs: REST, WebSocket, gRPC
Docker: GPU and CPU support

Tech ...