As of today, the large human-edited web directory Curlie.org is publicly available for download, thanks to the OpenWebSearch.eu initiative.
With over 2.9 million well-structured entries, Curlie.org is a clear guide to the Internet. The download now enables operators of niche websites to offer website catalogues on their topics. This is also good news for operators of alternative search engines. The trustworthy entries in DMOZ (Curlie’s predecessor project) have long been Google’s secret sauce for displaying spam-free and relevant search results.
The download of the Curlie database under an open-source license is made possible by the European Open Web Search initiative through OpenWebSearch.eu project partner Leibniz Supercomputing Centre (LRZ). Effective immediately, LRZ, a provider of scientific IT services for Munich, Germany and Europe, will provide a continuously updated dump of the entire Curlie directory.
OpenWebSearch.eu already offers a pilot version of an Open Web Index, which contains roughly 1.3 billion website entries. The index is intended to serve as the basis for an expandable search infrastructure that complies with European democratic values, legal regulations and standards. It enables the creation of alternative search engines that do not have to rely on an index from the big tech companies.
Category data from Curlie.org is already integrated into the Open Web Index. Curlie data also helps identify high-quality websites and guide website crawlers accordingly. Some 45,000 categories with geographic labelling open the door to enriching location-aware apps.
For the open search community, the cooperation opens up new ways to judge information, says Laura Brown at Curlie.org: ‘We only include high-quality websites in our directory that provide useful information. This is ensured by our experienced and specialised volunteer editors in the individual categories. That’s the advantage we humans have over chat language models: We can assess whether websites are trustworthy. With Curlie, you can always see the source of the information.’
‘We want to enable free, unbiased and transparent access to information. By working together, we are taking a big step towards greater data transparency and data democracy on the World Wide Web,’ explains Michael Granitzer, project manager at OpenWebSearch.eu. The computer science professor at the University of Passau sees many use cases: ‘For example, the combined knowledge of the Curlie editors can be easily leveraged to exclude AI-generated websites from search results – or to flag them. This would give search engine users more transparency about their search results.’
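As a rough illustration of the flagging use case described above, the following minimal sketch marks search results whose host appears in a curated list. It assumes the directory entries have already been loaded as plain URLs. The format of the Curlie dump is not described here, so the sample data and the helper names `host_of` and `flag_results` are hypothetical.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def flag_results(result_urls, curated_urls):
    """Pair each result URL with True if its host appears in the curated list."""
    curated_hosts = {host_of(u) for u in curated_urls}
    curated_hosts.discard("")  # ignore entries without a parseable host
    return [(url, host_of(url) in curated_hosts) for url in result_urls]


# Hypothetical sample data; a real application would read the directory dump.
curated = ["https://www.example.org/", "https://docs.example.net/guide"]
results = ["http://example.org/about", "https://unknown.example.com/"]

for url, listed in flag_results(results, curated):
    print(("editor-listed  " if listed else "not listed     ") + url)
```

Matching on the host alone is a simplification. A site that is missing from the directory is not necessarily low-quality or AI-generated, so a flag of this kind is best read as "not yet reviewed by an editor".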
The Curlie directory is now available for free download at