This combinatorial explosion of URL variants creates a problem for crawlers, as they must sort through endless combinations of relatively minor scripted changes in order to retrieve unique content. As researchers have noted, "Given that the bandwidth for conducting crawls is neither infinite nor free, it is becoming essential to crawl the Web in not only a scalable, but efficient way, if some reasonable measure of quality or freshness is to be maintained." Given the current size of the Web, even large search engines cover only a portion of the publicly available part.
A 2009 study showed that even large-scale search engines index no more than 40-70% of the indexable Web. Since a crawler always downloads just a fraction of the Web's pages, it is highly desirable that the downloaded fraction contain the most relevant pages, not merely a random sample of the Web.
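One common mitigation for the combinatorial URL problem described above is canonicalizing URLs before they enter the frontier, so that trivially different query-string variants collapse to a single entry. A minimal sketch in Python follows; the set of ignored parameters (e.g. `utm_source`) is illustrative, not a standard list:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Query parameters that typically do not change page content (illustrative list).
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}

def canonicalize(url: str) -> str:
    """Normalize a URL so minor scripted variants map to one frontier key."""
    parts = urlsplit(url)
    # Drop tracking/session parameters and sort the rest for a stable order.
    query = sorted(
        (k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED_PARAMS
    )
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path or "/",
        urlencode(query),
        "",  # discard fragments: they never reach the server
    ))

# canonicalize("HTTP://Example.com/page?b=2&a=1&utm_source=x#frag")
# → "http://example.com/page?a=1&b=2"
```

With such a key, a crawler can detect that two scripted variants point at the same content before spending bandwidth on a second download.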
URLs from the frontier are recursively visited according to a set of policies.
If the crawler is performing archiving of websites, it copies and saves the information as it goes.
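The recursive frontier visit and archiving behavior described above can be sketched as a breadth-first loop. This is a deliberately simplified model: `get_page` is assumed to be a caller-supplied fetch function (backed by HTTP in practice, or by an in-memory graph for testing), and a simple page limit stands in for the richer selection and politeness policies a real crawler would apply:

```python
from collections import deque

def crawl(seed, get_page, max_pages=100):
    """Breadth-first crawl: visit frontier URLs and archive each page.

    `get_page(url)` must return `(content, outlinks)`.
    """
    frontier = deque([seed])     # URLs discovered but not yet visited
    seen = {seed}                # de-duplicate before enqueueing
    archive = {}                 # url -> saved content ("copies and saves")
    while frontier and len(archive) < max_pages:
        url = frontier.popleft()
        content, outlinks = get_page(url)
        archive[url] = content   # save the page as we go
        for link in outlinks:
            if link not in seen: # only enqueue URLs not yet scheduled
                seen.add(link)
                frontier.append(link)
    return archive

# Example with a simulated in-memory site:
site = {
    "/":  ("home",   ["/a", "/b"]),
    "/a": ("page a", ["/b", "/"]),
    "/b": ("page b", []),
}
pages = crawl("/", lambda u: site[u])
```

Swapping the `deque` for a priority queue keyed on an importance estimate turns this breadth-first visit into the kind of selection policy that favors the most relevant pages first.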