Swift Sings: MDX-Net Vocal Splits and RVC Voice Conversion On-Device with ONNX/CoreML
web.navan.dev·23h
🎧Learned Audio
AI and the End of Accents
wired.com·8h
🏛Digital humanities
Text to Speech Sam
🎙️Whisper
Brain-tuning Improves Generalizability and Efficiency of Brain Alignment in Speech Models
arxiv.org·14h
🧠Intelligence Compression
Taming Text-to-Sounding Video Generation via Advanced Modality Condition and Interaction
🧠Neural Compression
Medical Speech AI Platform: Corti Gears Up for Psychiatry and More
heise.de·4h
🎵Audio ML
The Art and Discipline of Prompt Engineering
cacm.acm.org·17m
⚡Proof Automation
10-26-building-the-rope-operation-for-tensorrent-hardware at Clehaxze
clehaxze.tw·9h
⚡SIMD Vectorization
Production-Grade Machine Learning Through MLOps
blog.devops.dev·5h
⚡Incremental Computation
Variational autoencoders stabilise TCN performance when classifying weakly labelled bioacoustics data: an interdisciplinary approach
arxiv.org·14h
🧠Learned Codecs
Foley Control: Aligning a Frozen Latent Text-to-Audio Model to Video
arxiv.org·14h
🎧Vorbis Encoding
Can large audio language models understand child stuttering speech? speech summarization, and source separation
arxiv.org·14h
🎙️Whisper
Building a TikTok Hook Generator Prompt That Actually Works
hackernoon.com·13h
🔃Feed Algorithms
VOGUE: A Multimodal Dataset for Conversational Recommendation in Fashion
arxiv.org·14h
🏛Digital humanities
Five LLM Tricks for Data Pipelines
🔗Constraint Handling
Elementary, My Dear Watson: Non-Invasive Neural Keyword Spotting in the LibriBrain Dataset
arxiv.org·14h
🎵Audio ML