Scour
🎙️ Speech AI
speech to speech, TTS, ASR, voice synthesis, Whisper
Scoured 403 posts in 10.6 ms
Dots.tts: 2B-parameter continuous, end-to-end autoregressive TTS system
🔮 Multimodal AI · rednote-hilab.github.io · 4 days ago · Hacker News
sgl-project/sglang-omni: SGLang Omni: High-Performance Multi-Stage Pipeline Framework for Omni Models
🔮 Multimodal AI · Content type: Code · github.com · 17 hours ago
FlashTTS: Fast Streaming TTS with MTP Acceleration and X-pred Mean Flow Distillation
🔩 ML Compilers · Content type: Academic · arxiv.org · 1 day ago
OpenCode Plugin by Aito's Intelligence
🤖 AI Engineering · interrupt.camaramagic.com · 7 hours ago · r/selfhosted
Benchmarking dots.tts on Strix Halo
🎮 GPU Programming · sleepingrobots.com · 2 days ago
Treble Technologies and Hugging Face Address Benchmark of Automatic Speech Recognition Models
🔮 Multimodal AI · audioxpress.com · 5 days ago
Evaluate Clinical ASR Models Faster with Agent Skills and NVIDIA Nemotron Speech
🔮 Multimodal AI · Content type: News, Blog · developer.nvidia.com · 1 day ago
Can Voice Agents Handle Bilingual Customers? Benchmarking Frontier ASR on Code-Switched Speech
🧠 LLM Research · Content type: Blog · huggingface.co · 23 hours ago
What TTS Throws Away
🔮 Multimodal AI · amaldavid.com · 4 days ago · Hacker News
Balabolka Portable 2.15.0.917 (text-to-speech on demand) Released
🖥️ OS Development · portableapps.com · 2 days ago
AI Deepfakes and Creator Economy Fraud: Detection & Protection Guide 2026
👁️ Computer Vision · Content type: Blog · sumsub.com · 6 hours ago · r/artificial
Build a local voice agent with Red Hat OpenShift AI
🤖 AI Engineering · developers.redhat.com · 2 days ago
The 4-layer voice-agent latency stack, traced with OTel spans
🤖 AI Engineering · Content type: Blog · medium.com · 6 days ago
DW News : DW : June 10, 2026 1:00pm-1:03pm CEST
🖥️ OS Development · archive.org · 8 hours ago
Show HN: ListenDock now supports free TTS and bring-your-own API keys
🦀 Rust · listendock.com · 2 days ago · Hacker News, r/SideProject
AI Detection for Podcasts and Audio: Transcript Analysis and Verification 2026
🔮 Multimodal AI · Content type: Blog · hub.paper-checker.com · 6 days ago
Speaker Group Encoding in Self-supervised Speech Recognition Models
🧠 LLM Research · Content type: Academic · arxiv.org · 15 hours ago
Palabra.ai Review 2026: Real-Time Speech Translation, Tested Carefully
🔮 Multimodal AI · Content type: Blog · medium.com · 5 days ago
Gemini 3.5 Live Translate rolling out to Google Meet & Translate with new ‘listening mode’
🧠 LLM Research · Content type: News · 9to5google.com · 1 day ago
You don't need Copilot for code completion, try this instead
🔮 Multimodal AI · mistral.ai · 2 days ago · r/GithubCopilot