🐿️ Scour
🎧 Learned Audio

Neural Codecs, Perceptual Models, Rate-Distortion, AI Compression

AADNet: An End-to-End Deep Learning Model for Auditory Attention Decoding
arxiv.org·20h
🎵Audio ML
Writing an LLM from scratch, part 16 -- layer normalisation
gilesthomas.com·5h·
Discuss: Hacker News
📊Quantization
Toward Efficient Speech Emotion Recognition via Spectral Learning and Attention
arxiv.org·20h
🎵Audio ML
A Conversation with Val Bercovici about Disaggregated Prefill / Decode
fabricatedknowledge.com·1d
📼Tape Combinators
Our dev team tried replacing typing with talking and it's working
deepgram.com·2h·
Discuss: Hacker News
🎙️Whisper
Python Audio Processing with Pedalboard
lwn.net·4d·
Discuss: Hacker News
💿FLAC Archaeology
What Really Is Machine Learning? (And Why Your Phone Knows You Better Than Your Best Friend)
dev.to·9h·
Discuss: DEV
🧠Machine Learning
Complexities of Media Streaming
aschey.tech·1d·
Discuss: Lobsters
🎵Audio Streaming
LAPS-Diff: A Diffusion-Based Framework for Singing Voice Synthesis With Language Aware Prosody-Style Guided Learning
arxiv.org·20h
🎙️Whisper
Activation Steering for Chain-of-Thought Compression
arxiv.org·20h
⧗Information Bottleneck
RECA-PD: A Robust Explainable Cross-Attention Method for Speech-based Parkinson's Disease Classification
arxiv.org·20h
🎵Audio ML
DIY Ear training with Python and Music21, part 1
naomiceder.tech·2d
🎼Computational Musicology
Using a Framework Desktop for local AI
frame.work·1d
💻Local LLMs
The Five-Second Fingerprint: Inside Shazam’s Instant Song ID
towardsdatascience.com·22h
🎵Audio Fingerprinting
MGAA: Multi-Granular Adaptive Allocation for Low-Rank Compression of LLMs
arxiv.org·20h
🧠Machine Learning
New Machine Vision Is More Energy Efficient—and More Human
spectrum.ieee.org·11h·
Discuss: r/technews
🧠Machine Learning
Towards Human-in-the-Loop Onset Detection: A Transfer Learning Approach for Maracatu
arxiv.org·20h
🎵Music Universality
EXPOTION: Facial Expression and Motion Control for Multimodal Music Generation
arxiv.org·20h
🎼Computational Musicology
Radial Attention: O(n log n) Attention for Long Video Generation with 2-4× Speedup
hanlab.mit.edu·1d·
Discuss: Hacker News
🌊Streaming Algorithms
Large Language Models: A Self-Study Roadmap
kdnuggets.com·1d
💻Local LLMs