Transcript of "Analyzing Robots.txt for Fun and Profit"
Mining Robots.txt for Fun and Profit Vivek Ramachandran http://www.SecurityTube.Net
SecurityTube.Net www.SecurityTube.Net - the YouTube for Computer Networking and Security!
What is Robots.txt? <ul><li>A plain text file placed in the web root of a website, so that it is served at /robots.txt </li></ul><ul><li>It tells automated bots such as search engine crawlers (Googlebot, Yahoo! Slurp, etc.) which parts of the site they may crawl and which they should not </li></ul><ul><li>It is written according to the Robots Exclusion Protocol </li></ul>
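As a rough illustration of how a well-behaved bot might interpret such a file, the sketch below feeds a small robots.txt to Python's standard `urllib.robotparser` module. The file contents, the `example.com` URLs, and the helper name `is_allowed` are assumptions made up for this example; they do not come from the slides.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/
Allow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/admin/login"):
        print(url, is_allowed(SAMPLE_ROBOTS_TXT, "Googlebot", url))
```

A crawler that honors the protocol would run a check like this before each request and skip any URL for which it returns False.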
What Robots.txt should not be used for! <ul><li>It should never be used to hide important directories </li></ul><ul><li>It should never be used as a form of security </li></ul><ul><li>Reason: </li></ul><ul><li>The file is world readable </li></ul><ul><li>Compliance is voluntary: any client can ignore the rules written there </li></ul>
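To make the "world readable" point concrete, here is a minimal sketch of how little effort it takes to turn a robots.txt file into a list of the paths its owner wanted left alone. The sample contents and the function name `disallowed_paths` are hypothetical, and the parsing is deliberately simplified (it ignores which User-agent group a rule belongs to).

```python
# Hypothetical robots.txt contents, invented for illustration only.
EXAMPLE_ROBOTS_TXT = """\
# keep crawlers out of internal areas
User-agent: *
Disallow: /admin/
Disallow: /backup/   # nightly dumps
Disallow:
"""


def disallowed_paths(robots_txt):
    """Collect every non-empty Disallow value, in order of appearance."""
    paths = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        # An empty Disallow means "nothing is disallowed", so skip it.
        if field.strip().lower() == "disallow" and value:
            paths.append(value)
    return paths


if __name__ == "__main__":
    for path in disallowed_paths(EXAMPLE_ROBOTS_TXT):
        print(path)
```

Anything listed this way is effectively advertised to every visitor, which is why sensitive directories need real access control rather than a Disallow line.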