Speak, Edit, Repeat: High-Fidelity Voice Editing and Zero-Shot TTS with Cross-Attentive Mamba
arxiv.org·3d
🎚️Voice AI Systems
Show HN: I built a video-to-text tool – 10 min free daily, no signup
harku.io·8h
🎚️Audio Codecs
🧠 Real-Time Smart Speech Assistant with Python, Whisper & LLMs
dev.to·1h
🎙️Whisper
Show HN: AI Voice AudioBook – Convert ebooks to audio with your cloned voice
zan.chat·9h
🎚️Voice AI Systems
Show HN: Nanowakeword – Automates custom wake word model training
github.com·10h
🎙️Whisper
Towards a Typology of Strange LLM Chains-of-Thought
lesswrong.com·1d
💻Local LLMs
AI receptionist that answers real phone calls
news.ycombinator.com·4h
🧠AI
Making Machines Sound Sarcastic: LLM-Enhanced and Retrieval-Guided Sarcastic Speech Synthesis
arxiv.org·1d
🎤Voice Interfaces
Neural Networks from Scratch in Python: Simpler Than You Think
hamza.se·1h
🧠Neuromorphic Hardware
The key to conversational speech recognition
datasciencecentral.com·1d
🎤Voice Interfaces
Own your AI: Learn how to fine-tune Gemma 3 270M and run it on-device
developers.googleblog.com·1d
💻Local LLMs
Detecting and Mitigating Insertion Hallucination in Video-to-Audio Generation
arxiv.org·18h
🎤Voice Interfaces
I built a translator for spatial thinking (because I can't interview in Python)
graemefawcett.ca·3h
vibe-coding
MuFFIN: Multifaceted Pronunciation Feedback Model with Interactive Hierarchical Neural Modeling
arxiv.org·3d
🎙️Whisper
From RNNs to ChatGPT: The Paper That Changed How AI Thinks 🤖
dev.to·5h
🏗️AI Infrastructure
Prompt Engineering Templates That Work: 7 Copy-Paste Recipes for LLMs
kdnuggets.com·1d
🧩Low-code
How Google Translate & ChatGPT Work: The Transformer, Unboxed
dev.to·1d
🎙️Whisper
Harmonizing AI Voices: Bridging the Gap in Intelligent Communication
dev.to·2d
🎤Voice Interfaces
Everyday AI Agents
oreilly.com·10h
🤖AI agents