🗣️ CMU Pronouncing

Phonetic Dictionaries, Speech Synthesis, Linguistic Resources, Audio Processing

Zero-shot Context Biasing with Trie-based Decoding using Synthetic Multi-Pronunciation
arxiv.org · 6h
🎙️ Whisper
VibeVoice (1.5B) - TTS model by Microsoft
huggingface.co · 17h · Discuss: Hacker News, r/LocalLLaMA
🎙️ Whisper
Google NotebookLM goes global with multilingual AI video summaries of your notes
techradar.com · 7h
🏛️ Digital humanities
MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols
arxiv.org · 6h
🎙️ Whisper
TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling
arxiv.org · 6h
🎙️ Whisper
Claude Code's 19 cent Parser
blogger.com · 21h
🔧 Binary Parsers
A Universal Rhythm Guides How We Speak: Global Analysis Reveals 1.6-second Units
science.slashdot.org · 1d
🎵 Music Universality
Speech-Based Depressive Mood Detection in the Presence of Multiple Sclerosis: A Cross-Corpus and Cross-Lingual Study
arxiv.org · 6h
🎙️ Whisper
LingVarBench: Benchmarking LLM for Automated Named Entity Recognition in Structured Synthetic Spoken Transcriptions
arxiv.org · 1d
⚙️ Compression Benchmarking
Systematic LLM Prompt Engineering Using DSPy Optimization
towardsdatascience.com · 17h
⚡ Proof Automation
EyeMulator: Improving Code Language Models by Mimicking Human Visual Attention
arxiv.org · 6h
📊 Feed Optimization
Using Gemini prompts for Suno's Cover/Remix helps unblock creative projects
backpocketmusic.com · 10h · Discuss: Hacker News
🎧 Learned Audio
DocHop-QA: Towards Multi-Hop Reasoning over Multimodal Document Collections
arxiv.org · 1d
📇 Dublin Core
Constrained Prompt Enhancement for Improving Zero-Shot Generalization of Vision-Language Models
arxiv.org · 6h
🧠 Neural Codecs
Dissonance: A journey through musical possibility space
aatishb.com · 2d · Discuss: Hacker News
🌈 Spectral Audio
Next-gen voice, video, and chat messaging using your domain name not your number
thunderbolt.com · 2d · Discuss: Hacker News
🔌 Operating system internals
On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization
arxiv.org · 6h
🧮 Kolmogorov Bounds
Mini-Omni-Reasoner: Token-Level Thinking-in-Speaking in Large Speech Models
arxiv.org · 1d
🎙️ Whisper
Nvidia Release Massive AI-Ready Open European Language Dataset and Tools
hardware.slashdot.org · 2d
🎙️ Whisper
Show HN: Voice Typing from Your Terminal
github.com · 1d · Discuss: Hacker News
🎙️ Whisper