Blog - Announcing GneissWeb Annotations
Common Crawl has added IBM’s GneissWeb quality and category annotations to its web dataset, so users can filter for high-quality content and explore pages by category, such as medical, education, and technology.