Overview of preprocessing steps, feature extraction, and modeling process. WhatsApp audios were preprocessed, acoustic features extracted, and models trained with a standardized pipeline. Seven ML algorithms were evaluated on an independent test set. Credit: Otani et al., 2026, PLOS Mental Health, CC-BY 4.0 (creativecommons.org/licenses/by/4.0/)
A machine learning model achieved over 91% accuracy in identifying major depressive disorder among female participants after analyzing a short WhatsApp audio recording in which they described their week, according to a study published in PLOS Mental Health by Victor H. O. Otani, of Santa Casa de São Paulo School of Medical Sciences and Infinity Doctors Inc., Brazil, and colleagues.
Study design and participant details
Major depressive disorder is a mental health condition that affects over 280 million people globally, and early detection can be critical for timely treatment. Here, Otani and colleagues used machine learning models to classify individuals with and without major depressive disorder based on WhatsApp voice messages.
The authors used two datasets for this study: one to train their models (seven machine learning algorithms were evaluated) and one to test them. The training dataset consisted of 86 participants: 45 outpatients (37 women, eight men) clinically diagnosed with major depressive disorder and a control group of 41 volunteers (30 women, 11 men) with no depression diagnosis.
The dataset used to test the trained models consisted of 74 participants: 33 outpatients (17 women, 16 men) diagnosed with major depressive disorder and 41 control participants (21 women, 20 men) with no depression diagnosis. All participants provided informed consent and were screened to exclude potential confounding factors such as other medical issues.
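The group sizes above can be tallied in a few lines of Python to check the totals and compare the share of women in each dataset. This is only an illustrative sketch: the counts come from the article, while the `Group` class and `summarize` helper are hypothetical names introduced here and are not part of the study's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    """Participant counts for one study group (hypothetical helper)."""
    women: int
    men: int


# Counts as reported in the article.
DATASETS = {
    "training": {"depressed": Group(37, 8), "control": Group(30, 11)},
    "test": {"depressed": Group(17, 16), "control": Group(21, 20)},
}


def summarize(groups):
    """Return totals by sex and the share of women across all groups."""
    women = sum(g.women for g in groups.values())
    men = sum(g.men for g in groups.values())
    total = women + men
    return {
        "women": women,
        "men": men,
        "total": total,
        "share_women": women / total,
    }


if __name__ == "__main__":
    for name, groups in DATASETS.items():
        s = summarize(groups)
        print(
            f"{name}: n={s['total']}, women={s['women']}, "
            f"men={s['men']}, share of women={s['share_women']:.1%}"
        )
```

By these counts, women make up about 78% of the training dataset (67 of 86) but about 51% of the test dataset (38 of 74).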
Audio data collection and analysis
In the training dataset, outpatient speech data came from WhatsApp audio recordings that patients had sent to their doctors' offices while symptomatic; control group participants chose routine WhatsApp voice messages of their own to share.
For the test dataset, the outpatient group and the control group recorded the same two types of WhatsApp message: one counting from 1 to 10 and one describing their past week. All audio messages in both datasets were from native Brazilian Portuguese speakers.
Model performance and future implications
The models classified women as depressed or not depressed more accurately than men, particularly on the "describe your week" recordings: the highest-performing model reached 91.9% accuracy for women and 75% for men on that task. (This gap may be explained by the larger number of women in the training dataset, as well as by differences in speech patterns between men and women.)
Performance was more similar between men and women on the "count to 10" recordings, where the highest-performing model was 82% accurate for women and 78% accurate for men.
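With fewer than 40 women and 40 men in the test dataset, each reported accuracy carries some sampling uncertainty. The sketch below computes a 95% Wilson score interval for each figure. It assumes that every test participant (38 women, 36 men) was scored on both tasks, which the article does not state, and the `wilson_interval` function and `REPORTED` table are hypothetical names used only for illustration, not part of the study's analysis.

```python
import math


def wilson_interval(accuracy: float, n: int, z: float = 1.96):
    """Wilson score interval for a proportion observed on n cases.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    denom = 1 + z * z / n
    center = (accuracy + z * z / (2 * n)) / denom
    half_width = (
        z * math.sqrt(accuracy * (1 - accuracy) / n + z * z / (4 * n * n)) / denom
    )
    return center - half_width, center + half_width


# Accuracies as reported in the article; the n values are an ASSUMPTION
# (all 38 women and 36 men in the test dataset scored on each task).
REPORTED = {
    ("describe your week", "women"): (0.919, 38),
    ("describe your week", "men"): (0.75, 36),
    ("count to 10", "women"): (0.82, 38),
    ("count to 10", "men"): (0.78, 36),
}

if __name__ == "__main__":
    for (task, sex), (acc, n) in REPORTED.items():
        low, high = wilson_interval(acc, n)
        print(f"{task}, {sex}: {acc:.1%} (95% CI {low:.1%} to {high:.1%}, n={n})")
```

The Wilson interval is used here instead of the simpler normal approximation because it behaves better for small samples and for proportions close to 100%.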
The authors hope that continued refinement of their models could yield a low-cost, practical way to screen individuals for depression, along with other potential clinical and research applications.
Senior author Lucas Marques adds, "Our study shows that subtle acoustic patterns in spontaneous WhatsApp voice messages can help identify depressive profiles with surprising accuracy using machine learning. This opens a promising path for low-burden, real-world digital screening tools that respect people’s daily communication habits."
Publication details
Otani VHO, et al. ML-based detection of depressive profile through voice analysis in WhatsApp audio messages of Brazilian Portuguese Speakers, PLOS Mental Health (2026). DOI: 10.1371/journal.pmen.0000357
Journal information: PLOS Mental Health
Citation: LLMs can identify major depressive disorder via voice note recordings (2026, January 21) retrieved 21 January 2026 from https://medicalxpress.com/news/2026-01-llms-major-depressive-disorder-voice.html