Most of the Web isn't reachable by a crawler. It sits behind search forms - on government registers, library catalogues, scientific databases, niche topic engines - and the only way in is to ask. This is a paper about asking thousands of those search forms at once.
This 2003 IADIS paper describes the architecture of Turbo10 (later T10) - a metasearch engine that aggregated thousands of specialised "deep net" search engines into a single query interface. The hard part isn't the searching. It's keeping the adapters alive: each engine has its own form, its own URL pattern, its own way of formatting results, and they all change without warning.
A user types a query once. The metasearch broker fans it out to a chosen set of relevant engines - PubMed, the patent office, library catalogues, niche commerce engines - through small adapters that translate the query into the shape each engine expects. Results come back in different formats; the broker normalises and merges them, deduplicates, and ranks the unified list before showing it to the user.
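To make that flow concrete, here is a minimal sketch of the fan-out and merge step. It is not Turbo10's code: the two adapters, their canned results, the crude URL canonicalisation and the reciprocal-rank scoring are all assumptions chosen for illustration, and a real adapter would make an HTTP request instead of returning fixed data.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    title: str
    url: str
    source: str
    rank: int  # 1-based position in the source engine's own list


# Hypothetical adapters: each turns the query into that engine's format and
# returns normalised Result objects. The data here is canned, not fetched.
def catalogue_adapter(query: str) -> list[Result]:
    raw = [{"t": f"{query} handbook", "href": "http://example.org/a"},
           {"t": f"{query} atlas", "href": "http://example.org/b"}]
    return [Result(r["t"], r["href"], "catalogue", i) for i, r in enumerate(raw, 1)]


def register_adapter(query: str) -> list[Result]:
    raw = [("HTTP://EXAMPLE.ORG/b/", f"{query} atlas"),
           ("http://example.org/c", f"{query} register entry")]
    return [Result(title, url, "register", i) for i, (url, title) in enumerate(raw, 1)]


def canonical(url: str) -> str:
    # Crude dedup key (an assumption): ignore case and a trailing slash.
    return url.lower().rstrip("/")


def metasearch(query: str, adapters) -> list[tuple[Result, float]]:
    # Fan the query out to every adapter in parallel.
    with ThreadPoolExecutor(max_workers=max(1, len(adapters))) as pool:
        batches = list(pool.map(lambda adapter: adapter(query), adapters))

    # Merge and deduplicate: one entry per canonical URL, scored by the sum of
    # reciprocal ranks (an assumed scheme, not the paper's ranking method).
    merged: dict[str, tuple[float, Result]] = {}
    for batch in batches:
        for result in batch:
            key = canonical(result.url)
            score, first_seen = merged.get(key, (0.0, result))
            merged[key] = (score + 1.0 / result.rank, first_seen)

    ranked = sorted(merged.values(), key=lambda pair: -pair[0])
    return [(result, score) for score, result in ranked]


hits = metasearch("water", [catalogue_adapter, register_adapter])
for result, score in hits:
    print(f"{score:.1f}  {result.url}  ({result.source})")
```

The page returned by both engines rises to the top because its score is added up across sources, which is one simple way of rewarding agreement between engines.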
Crawler-based search has a single hard problem (scale). Metasearch has a different one: adapter rot. Forms change. URL structures change. Result formats change. Engines disappear. The paper describes the automation around this - a process for generating, monitoring and repairing adapters at scale, so the system can offer thousands of engines without an army of maintainers.
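One way to picture the monitoring half of that automation is a declarative adapter plus a periodic probe. The sketch below is an assumption about how such a check could look, not the paper's method: `AdapterSpec`, `check`, the regex-based extraction, the status strings and the example.* engines are all hypothetical, and the fetcher is a stub over canned pages so nothing touches the network.

```python
import re
from dataclasses import dataclass
from urllib.parse import quote_plus


@dataclass
class AdapterSpec:
    name: str
    url_template: str    # "{q}" is replaced with the URL-encoded query
    result_pattern: str  # regex with named groups "url" and "title"
    probe_query: str     # a query that should always return hits

    def build_url(self, query: str) -> str:
        return self.url_template.format(q=quote_plus(query))

    def extract(self, html: str) -> list[tuple[str, str]]:
        return [(m["url"], m["title"]) for m in re.finditer(self.result_pattern, html)]


def check(spec: AdapterSpec, fetch, min_hits: int = 1) -> str:
    """Run the probe query and classify the adapter's health."""
    try:
        html = fetch(spec.build_url(spec.probe_query))
    except OSError:
        return "unreachable"   # engine gone or URL structure changed
    if len(spec.extract(html)) >= min_hits:
        return "ok"
    return "needs_repair"      # page came back but the result format changed


# Canned pages standing in for live engines (hypothetical hosts).
PAGES = {
    "http://engine-a.example/search?q=water":
        '<a class="hit" href="/doc/1">Water quality report</a>',
    "http://engine-b.example/find?term=water":
        '<li data-href="/r/9">Water rights</li>',  # markup no longer matches
}


def fake_fetch(url: str) -> str:
    if url not in PAGES:
        raise ConnectionError(f"no response from {url}")
    return PAGES[url]


OLD_PATTERN = r'<a class="hit" href="(?P<url>[^"]+)">(?P<title>[^<]+)</a>'
specs = [
    AdapterSpec("engine-a", "http://engine-a.example/search?q={q}", OLD_PATTERN, "water"),
    AdapterSpec("engine-b", "http://engine-b.example/find?term={q}", OLD_PATTERN, "water"),
    AdapterSpec("engine-c", "http://engine-c.example/?q={q}", OLD_PATTERN, "water"),
]
report = {spec.name: check(spec, fake_fetch) for spec in specs}
print(report)
```

Keeping the adapter as data rather than hand-written code is what makes a repair step plausible to automate: fixing a broken engine means proposing a new template or pattern and re-running the same probe.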
That automation is what made the difference between a clever demo and a system that ran for almost a decade. At its peak T10 handled around 100 million searches a day. It later pivoted into a search-advertising network and finally closed in 2012, after Google's anti-competitive practices in EU search advertising eroded the addressable market.
Related: "Search Trails: Back to the Future" picks up the thread four years later, and "A patent for blazing search trails" describes the trail-recording method that grew out of this work.