Human or Machine? Low-Latency Audio Detection of Humans at Scale
nooks.ai · 4h

Most outbound calls never reach a human. Sales teams in the U.S. collectively waste hundreds of millions of hours each year on rings, voicemails, and phone menus. If you could automatically filter out those machine responses, you could recover that lost time. However, distinguishing a human from a machine in real time is a deceptively hard problem: you have only the first hundred milliseconds of speech to decide accurately whether to connect the call before the latency becomes noticeable.

We built a system that makes this decision millions of times a day, outperforming humans in both speed and accuracy. Getting there required careful data bootstrapping, a low-latency architecture, and a tunable decision engine we can position anywhere along the precision–recall–latency surface. This post wa…
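
To make the idea of a tunable decision engine concrete, here is a minimal sketch of one way such a trade-off could be exposed. It is not the system described in this post: the `DecisionPolicy` fields, the `decide` function, and all threshold values are hypothetical, and it assumes an upstream classifier that emits a running probability that the speaker is human as audio arrives.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class DecisionPolicy:
    """Hypothetical knobs; none of these names or values come from the post."""
    human_threshold: float = 0.9     # connect early once P(human) is this high
    machine_threshold: float = 0.1   # drop early once P(human) is this low
    max_latency_ms: int = 300        # hard deadline for making a decision
    fallback_threshold: float = 0.5  # cut-off applied when the deadline hits


def decide(scores: Iterable[Tuple[int, float]],
           policy: DecisionPolicy) -> Tuple[str, int]:
    """Consume (elapsed_ms, p_human) pairs and return (label, decision_ms).

    Stops as soon as the score is confident in either direction, or when
    the latency budget is spent, whichever comes first.
    """
    last_ms, last_p = 0, 0.0  # an empty stream falls through to "machine"
    for elapsed_ms, p_human in scores:
        last_ms, last_p = elapsed_ms, p_human
        if p_human >= policy.human_threshold:
            return "human", elapsed_ms
        if p_human <= policy.machine_threshold:
            return "machine", elapsed_ms
        if elapsed_ms >= policy.max_latency_ms:
            break
    # Deadline reached (or stream ended) without a confident score.
    label = "human" if last_p >= policy.fallback_threshold else "machine"
    return label, last_ms


# Assumed classifier output: scores drift upward but never become confident.
stream = [(100, 0.6), (200, 0.7), (300, 0.8)]
strict = decide(stream, DecisionPolicy())                     # waits for the deadline
eager = decide(stream, DecisionPolicy(human_threshold=0.6))   # decides on the first frame
```

In this sketch, lowering `human_threshold` buys latency at the cost of precision, while shrinking `max_latency_ms` forces more decisions through the less reliable fallback cut-off. Moving those knobs is one simple way to pick a point on a precision, recall, and latency trade-off.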
