Phonetic Dictionaries, Speech Synthesis, Linguistic Resources, Audio Processing
Surgery-R1: Advancing Surgical-VQLA with Reasoning Multimodal Large Language Model via Reinforcement Learning
arxiv.org · 17h
MLLP-VRAIN UPV system for the IWSLT 2025 Simultaneous Speech Translation Translation task
arxiv.org · 1d
Dialogic Pedagogy for Large Language Models: Aligning Conversational AI with Proven Theories of Learning
arxiv.org · 17h
Multilingual innovation in LLMs: How open models help unlock global communication
developers.googleblog.com · 2d
ElevenLabs releases a standalone voice generation app
techcrunch.com · 1d
Context Biasing for Pronunciations-Orthography Mismatch in Automatic Speech Recognition
arxiv.org · 1d
SUTRA: Decoupling Concept & Language for Multilingual LLM Excellence
hackernoon.com · 5h
General Methods Make Great Domain-specific Foundation Models: A Case-study on Fetal Ultrasound
arxiv.org · 17h
Breaking the Transcription Bottleneck: Fine-tuning ASR Models for Extremely Low-Resource Fieldwork Languages
arxiv.org · 1d