Speech AI

Dots.tts: 2B-parameter continuous, end-to-end autoregressive TTS system

 🔮Multimodal AI

sgl-project/sglang-omni: SGLang Omni: High-Performance Multi-Stage Pipeline Framework for Omni Models

 🔮Multimodal AI  Content type: Code
github.com·

FlashTTS: Fast Streaming TTS with MTP Acceleration and X-pred Mean Flow Distillation

 🔩ML Compilers  Content type: Academic
arxiv.org·

OpenCode Plugin by Aito's Intelligence

 🤖AI Engineering

Benchmarking dots.tts on Strix Halo

 🎮GPU Programming
sleepingrobots.com·

Treble Technologies and Hugging Face Address Benchmark of Automatic Speech Recognition Models

 🔮Multimodal AI
audioxpress.com·

Evaluate Clinical ASR Models Faster with Agent Skills and NVIDIA Nemotron Speech

 🔮Multimodal AI  Content type: News  Content type: Blog

Can Voice Agents Handle Bilingual Customers? Benchmarking Frontier ASR on Code-Switched Speech

 🧠LLM Research  Content type: Blog
huggingface.co·

What TTS Throws Away

 🔮Multimodal AI
amaldavid.com··Hacker News

Balabolka Portable 2.15.0.917 (text-to-speech on demand) Released

 🖥️OS Development
portableapps.com·

AI Deepfakes and Creator Economy Fraud: Detection & Protection Guide 2026

 👁️Computer Vision  Content type: Blog
sumsub.com··r/artificial

Build a local voice agent with Red Hat OpenShift AI

 🤖AI Engineering
developers.redhat.com·

The 4-layer voice-agent latency stack, traced with OTel spans

 🤖AI Engineering  Content type: Blog
medium.com·

DW News : DW : June 10, 2026 1:00pm-1:03pm CEST

 🖥️OS Development
archive.org·

Show HN: ListenDock now supports free TTS and bring-your-own API keys

 🦀Rust

AI Detection for Podcasts and Audio: Transcript Analysis and Verification 2026

 🔮Multimodal AI  Content type: Blog
hub.paper-checker.com·

Speaker Group Encoding in Self-supervised Speech Recognition Models

 🧠LLM Research  Content type: Academic
arxiv.org·

Palabra.ai Review 2026: Real-Time Speech Translation, Tested Carefully

 🔮Multimodal AI  Content type: Blog
medium.com·

Gemini 3.5 Live Translate rolling out to Google Meet & Translate with new ‘listening mode’

 🧠LLM Research  Content type: News
9to5google.com·

You don't need Copilot for code completion, try this instead

 🔮Multimodal AI
