Aduana: Link analysis to crawl the web at scale
Crawling vast numbers of websites for specific types of information is impractical. Unless, that is, you prioritize what you crawl. Aduana is an experimental tool that we developed to help you do that. It’s a special backend for Frontera, our tool to expedite massive crawls in parallel (primer here). Aduana is designed for situations where […]