DeepASA: An Object-Oriented One-for-All Network for Auditory Scene Analysis
arxiv.org·3h
🎧Learned Audio
As Good as a Coin Toss: Human Detection of AI-Generated Content
cacm.acm.org·13h·
Discuss: Hacker News
📊Learned Metrics
BeepBank-500: A Synthetic Earcon Mini-Corpus for UI Sound Research and Psychoacoustics Research
arxiv.org·3h
🌈Spectral Audio
DISPATCH: Distilling Selective Patches for Speech Enhancement
arxiv.org·1d
👂Psychoacoustic Coding
Attentive AV-FusionNet: Audio-Visual Quality Prediction with Hybrid Attention
arxiv.org·3h
🧠Learned Codecs
Automotive Sound Quality for EVs: Psychoacoustic Metrics with Reproducible AI/ML Baselines
arxiv.org·3h
👂Psychoacoustics
Adaptive Spike-Timing Dependent Plasticity Driven By Disordered Neural Networks for Enhanced Temporal Pattern Recognition
dev.to·7h·
Discuss: DEV
🔲Cellular Automata
Show HN: Python Audio Transcription: Convert Speech to Text Locally
pavlinbg.com·13h·
Discuss: Hacker News
🎙️Whisper
SVeritas: Benchmark for Robust Speaker Verification under Diverse Conditions
arxiv.org·3h
🎵Acoustic Fingerprinting
Reverse Attention for Lightweight Speech Enhancement on Edge Devices
arxiv.org·3h
🎧Learned Audio
Reference-aware SFM layers for intrusive intelligibility prediction
arxiv.org·3h
👂Psychoacoustic Coding
VAInpaint: Zero-Shot Video-Audio inpainting framework with LLMs-driven Module
arxiv.org·3h
👂Psychoacoustic Coding
AuditoryBench++: Can Language Models Understand Auditory Knowledge without Hearing?
arxiv.org·3h
👂Psychoacoustic Coding
HyPlaneHead: Rethinking Tri-plane-like Representations in Full-Head Image Synthesis
arxiv.org·3h
🌀Hyperbolic Geometry
Audio Super-Resolution with Latent Bridge Models
arxiv.org·3h
🎧Learned Audio
AISTAT lab system for DCASE2025 Task6: Language-based audio retrieval
arxiv.org·3h
🎵Audio ML
Does Audio Matter for Modern Video-LLMs and Their Benchmarks?
arxiv.org·3h
🧠Learned Codecs
An Octave-based Multi-Resolution CQT Architecture for Diffusion-based Audio Generation
arxiv.org·3h
🎧Learned Audio
Audio Contrastive-based Fine-tuning: Decoupling Representation Learning and Classification
arxiv.org·3h
👁️Perceptual Coding
SingLEM: Single-Channel Large EEG Model
arxiv.org·3h
🎵Audio ML