Multilingual benchmark evaluates how well AI interprets clinical text and health records in nine languages

Researchers at Mass General Brigham recently developed BRIDGE, a multilingual benchmark that evaluates how well large language models (LLMs) understand clinical text from patient care, including language used in electronic health records (EHRs), across nine languages. The benchmarking tool could help clinicians evaluate and compare LLMs for use in specific contexts. The results are published in Nature Biomedical Engineering.
