Google Open Sources Its ‘Web Crawler’ After 20 Years

Google’s Robot Exclusion Protocol (REP), also known as robots.txt, is a standard used by many websites to tell the automated crawlers which parts of the site should be crawled or not. However, it isn’t the officially adopted standard, leading to different interpretations. In a bid to make REP an official web standard, Google has open-sourced […]