SoundReactor: Frame-level Online Video-to-Audio Generation
arxiv.org·21h
💿FLAC Archaeology
AI Under the Hood Part I: Understanding the Machine
kennethwolters.com·13h
📼Cassette Combinators
Adaptive Diffusive Quantization for Enhanced Image Reconstruction Fidelity
dev.to·1d
🖼️JPEG XL
[P] Building a Music Search Engine + Foundational Model on 100M+ Latent Audio Embeddings
reddit.com·11h
🎵Audio ML
An Anechoic Chamber at Nokia Bell Labs Reveals the Hidden Sounds of Your Body
scientificamerican.com·15h
📡Frequency Archaeology
ThinkSound AI
thinksoundai.com·20h
💿FLAC Archaeology
Unlocking Symbol-Level Precoding Efficiency Through Tensor Equivariant Neural Network
arxiv.org·21h
🧠Neural Codecs
Hume AI Octave 2: new text-to-speech model, 11+ languages
hume.ai·6h
🎙️Whisper
Benchmark: Spark vs. Ray Data vs. Daft on Multimodal Workloads
daft.ai·7h
🌊Stream Processing
Eliminating the Precision–Latency Trade-Off in Large-Scale RAG
thenewstack.io·9h
🎯Retrieval Systems
Whispers of A.I.'s Modular Future (2023)
newyorker.com·2d
🎙️Whisper
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation
arxiv.org·21h
LZ4 Streaming
Building Blocks of Awareness: A Modular Approach to Artificial Minds by Arvind Sundararajan
future.forem.com·13h
🧠Intelligence Compression
Linear Algebra for AI: A Beginner-Friendly Guide with Real-World Examples
dev.to·14h
📐Linear Algebra
Sora 2: AI Video Generation with Realistic Sound
2-sora.com·20h
🧠Learned Codecs
SLAP: Learning Speaker and Health-Related Representations from Natural Language Supervision
arxiv.org·21h
🎵Audio ML
VideoNSA: Native Sparse Attention Scales Video Understanding
arxiv.org·21h
🧠Learned Codecs