We're excited to announce two new settings in source configuration that give you more control over the behavior of your Sitecore Search crawlers and increase the likelihood of successful crawl jobs.
Optional delay: For single-threaded advanced crawlers (where the parallelism setting is 1), you can now add a delay between requests to avoid “Too Many Requests” (HTTP 429) errors from websites that enforce strict crawling guidelines. Previously, there was no way to set a crawler delay. By default, the delay is 0 ms.
Cookie disabling: You can now stop the crawler from sending cookies. This is useful because sending cookies can sometimes cause a website to flag the crawler as a bot or automated program. Additionally, when the crawler doesn’t send cookies, it accesses and indexes the default, non-personalized version of a website. Previously, there was no way to disable cookies. In the CEC, the Enable navigation cookies during crawling option is enabled by default.
These updates are available for all crawlers except the (basic) web crawler.
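To illustrate what the two settings do conceptually, here is a minimal Python sketch of a single-threaded fetch loop with an optional delay and a cookie toggle. It is not Sitecore Search code or its configuration format: `CrawlSettings`, its field names, `crawl`, and the `fetch` callback are all hypothetical names chosen for this example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical settings object; the field names are assumptions for this
# sketch and are not Sitecore Search's actual configuration keys.
@dataclass
class CrawlSettings:
    delay_ms: int = 0             # delay between requests; 0 ms by default
    cookies_enabled: bool = True  # cookies are sent by default
    parallelism: int = 1          # the delay only applies to a single thread

# A fetch callback takes a URL and the cookies to send, and returns the page
# body plus any cookies the site asked the client to store.
Fetch = Callable[[str, Dict[str, str]], Tuple[str, Dict[str, str]]]

def crawl(
    urls: List[str],
    settings: CrawlSettings,
    fetch: Fetch,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    if settings.delay_ms > 0 and settings.parallelism != 1:
        raise ValueError("a delay is only supported when parallelism is 1")

    cookie_jar: Dict[str, str] = {}
    pages: List[str] = []
    for index, url in enumerate(urls):
        # Wait between requests (not before the first one) so that
        # rate-limited sites are less likely to answer "Too Many Requests".
        if index > 0 and settings.delay_ms > 0:
            sleep(settings.delay_ms / 1000)

        # With cookies disabled, every request goes out with no cookies, so
        # the site serves its default, non-personalized page.
        cookies_to_send = dict(cookie_jar) if settings.cookies_enabled else {}
        body, received_cookies = fetch(url, cookies_to_send)

        # Only remember cookies when the setting allows sending them later.
        if settings.cookies_enabled:
            cookie_jar.update(received_cookies)
        pages.append(body)
    return pages
```

In this sketch, calling `crawl(urls, CrawlSettings(delay_ms=500, cookies_enabled=False), fetch)` would pause half a second between requests and send no cookies, whereas the default `CrawlSettings()` sends requests back to back and returns stored cookies to the site.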