A database containing 149 million account usernames and passwords—including 48 million for Gmail, 17 million for Facebook, and 420,000 for the cryptocurrency platform Binance—has been removed after a researcher reported the exposure to the hosting provider.
The longtime security analyst who discovered the database, Jeremiah Fowler, could not find indications of who owned or operated it, so he worked to notify the host, which took down the trove because it violated a terms of service agreement.
In addition to email and social media logins for a number of platforms, Fowler observed credentials for government systems from multiple countries, as well as logins for consumer banking and credit card accounts and for media streaming platforms. Fowler suspects that the database had been assembled by infostealing malware, which infects devices and then uses techniques like keylogging to record information that victims type into websites.
Fowler says that while he was attempting to contact the hosting service over the course of about a month, the database continued to grow, accumulating additional logins for an array of services. He is not naming the provider, because the company is a global host that contracts with independent regional companies to expand its reach. The database was hosted by one of these affiliates in Canada.
“This is like a dream wish list for criminals because you have so many different types of credentials,” Fowler told WIRED. “An infostealer would make the most sense. The database was in a format made for indexing large logs as if whoever set it up was expecting to gather a lot of data. And there were tons of government logins from many different countries.”
In addition to the 48 million Gmail credentials, the trove contained about four million for Yahoo accounts, 1.5 million for Microsoft Outlook, 900,000 for Apple’s iCloud, and 1.4 million for “.edu” academic and institutional accounts. There were also, among others, about 780,000 logins for TikTok, 100,000 for OnlyFans, and 3.4 million for Netflix. The data was publicly accessible and searchable using just a web browser.
“It seemed like it captured anything and everything, but one thing that was interesting was that the system seemed to automatically classify each log with an identifier, and these were unique identifiers that didn’t reappear,” Fowler says. “It seemed like the system was organizing the data automatically as it went for easier searching.”