Swift Sings: MDX-Net Vocal Splits and RVC Voice Conversion On-Device with ONNX/CoreML
web.navan.dev·16h
🎧Learned Audio
AI and the End of Accents
wired.com·1h
🏛Digital humanities
Text to Speech Sam
texttospeechrobot.com·18h·Discuss: Hacker News
🎙️Whisper
Brain-tuning Improves Generalizability and Efficiency of Brain Alignment in Speech Models
arxiv.org·7h
🧠Intelligence Compression
Taming Text-to-Sounding Video Generation via Advanced Modality Condition and Interaction
dev.to·8h·Discuss: DEV
🧠Neural Compression
Clojure Runs ONNX AI Models Now - Join the AI fun!
dragan.rocks·19h·Discuss: Hacker News
🌳Context free grammars
10-26-building-the-rope-operation-for-tensorrent-hardware at Clehaxze
clehaxze.tw·2h
SIMD Vectorization
Foley Control: Aligning a Frozen Latent Text-to-Audio Model to Video
arxiv.org·7h
🎧Vorbis Encoding
Can large audio language models understand child stuttering speech? speech summarization, and source separation
arxiv.org·7h
🎙️Whisper
Building a TikTok Hook Generator Prompt That Actually Works
hackernoon.com·6h
🔃Feed Algorithms
VOGUE: A Multimodal Dataset for Conversational Recommendation in Fashion
arxiv.org·7h
🏛Digital humanities
[P] I'm unable to do a single project without using AI and it's killing my confidence
reddit.com·1d·Discuss: r/artificial
Proof Automation
Elementary, My Dear Watson: Non-Invasive Neural Keyword Spotting in the LibriBrain Dataset
arxiv.org·7h
🎵Audio ML
HiFi-HARP: A High-Fidelity 7th-Order Ambisonic Room Impulse Response Dataset
arxiv.org·7h
👂Psychoacoustic Coding
Show HN: Free Online Video Caption – Burn in your own subtitles in browser
videocaption.app·22h·Discuss: Hacker News
🧠Learned Codecs
Brands can stay visible in the age of AI search
wearetalker.com·22h·Discuss: Hacker News
🎙️Whisper