robots.txt and sitemap.xml

PRACTICAL GUIDE FOR SEO BEGINNERS
SEO Beginners

ROBOTS.TXT
WHAT ARE WEB ROBOTS?

   Web Robots (also known as Web Wanderers,
    Crawlers, or Spiders) are programs that
    traverse the Web automatically. Search engines
    such as Google use them to index web
    content, spammers use them to scan for email
    addresses, and they have many other uses.
WHAT IS ROBOTS.TXT?

   Robots.txt is a plain text file that you upload to
    the root directory of your site. When web
    spiders (also called ants, bots, or indexers) visit
    your site, they first look for this
    text file and process it. Put differently, robots.txt
    tells the spider which pages it may crawl.
THE SIMPLEST VERSION OF ROBOTS.TXT
User-agent: *
Disallow:

   The first line “user agent asterisk” indicates
    that the following lines apply to all agents.
    The empty value after "Disallow:" means that nothing is
    disallowed. In effect, this robots.txt file does nothing: it
    allows all types of robots to see everything on
    the site.
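
As a quick check of the rule above, here is a minimal Python sketch that feeds the two-line file to the standard library's urllib.robotparser. The helper name can_crawl, the agent name "AnyBot", and the example URL are illustrative assumptions, not part of the original guide.

```python
from urllib.robotparser import RobotFileParser

# The simplest robots.txt from the slide above.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def can_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# "AnyBot" and the URL are hypothetical values used only for this demo.
print(can_crawl(ALLOW_ALL, "AnyBot", "http://www.example.com/page.html"))
```

An empty Disallow value is treated by the parser as "allow everything", which matches the explanation on the slide.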
SOME MORE EXAMPLES OF ROBOTS.TXT
   To exclude all robots from the entire server
    User-agent: *
    Disallow: /

   To allow all robots complete access
    User-agent: *
    Disallow:

    (or just create an empty "/robots.txt" file, or don't use
      one at all)
SOME MORE EXAMPLES OF ROBOTS.TXT
   To exclude all robots from part of the server
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /tmp/
    Disallow: /~joe/

   To exclude a single robot
    User-agent: BadBot
    Disallow: /
SOME MORE EXAMPLES OF ROBOTS.TXT
   To allow a single robot
    User-agent: Googlebot
    Disallow:

    User-agent: *
    Disallow: /

   You can disallow single pages:
    User-agent: *
    Disallow: /~joe/junk.html
    Disallow: /~joe/foo.html
    Disallow: /~joe/bar.html
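
The per-robot and per-page rules above can be tried out with a short Python sketch based on the standard library's urllib.robotparser. The helper load_rules, the robot name "OtherBot", and the www.example.com URLs are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rules from the "allow a single robot" example above.
SINGLE_ROBOT_RULES = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

# Rules from the "disallow single pages" example above.
SINGLE_PAGE_RULES = """\
User-agent: *
Disallow: /~joe/junk.html
Disallow: /~joe/foo.html
Disallow: /~joe/bar.html
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer can_fetch()."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


single_robot = load_rules(SINGLE_ROBOT_RULES)
single_pages = load_rules(SINGLE_PAGE_RULES)

# Hypothetical URLs on an example host.
home = "http://www.example.com/index.html"
print(single_robot.can_fetch("Googlebot", home))  # Googlebot has its own open record
print(single_robot.can_fetch("OtherBot", home))   # every other robot hits "Disallow: /"
print(single_pages.can_fetch("OtherBot", "http://www.example.com/~joe/junk.html"))
print(single_pages.can_fetch("OtherBot", "http://www.example.com/~joe/index.html"))
```

A robot uses the record that names it if one exists, and otherwise falls back to the "*" record.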
SOME MORE EXAMPLES OF ROBOTS.TXT

   You can specify the Sitemap location in your
    robots.txt file

    User-agent: *
    Disallow: /

    Sitemap: http://www.example.com/sitemap.xml
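
A crawler-side view of the Sitemap line can be sketched in Python. This assumes Python 3.8 or later, where urllib.robotparser offers site_maps(); the helper name find_sitemaps is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the slide above.
ROBOTS_WITH_SITEMAP = """\
User-agent: *
Disallow: /

Sitemap: http://www.example.com/sitemap.xml
"""


def find_sitemaps(robots_txt: str) -> list:
    """Return the Sitemap URLs declared in robots.txt text (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


print(find_sitemaps(ROBOTS_WITH_SITEMAP))
```

The Sitemap line is independent of the User-agent records, so it is reported whatever the Disallow rules say.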
ABOUT THE ROBOTS <META> TAG
   You can use a special HTML <META> tag to tell
    robots not to index the content of a page, and/or
    not to scan it for links to follow.

    <html>
    <head>
    <title>...</title>
    <META NAME="ROBOTS"
      CONTENT="NOINDEX, NOFOLLOW">
    </head>
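
To see how a robot might read this tag, here is a minimal Python sketch using the standard library's html.parser. The class RobotsMetaParser and the function robots_directives are hypothetical names chosen for this example; real crawlers may handle the tag differently.

```python
from html.parser import HTMLParser

# The page fragment from the slide above, closed with </html> for completeness.
PAGE = """<html>
<head>
<title>...</title>
<META NAME="ROBOTS"
  CONTENT="NOINDEX, NOFOLLOW">
</head>
</html>"""


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the lowercase robots directives declared in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


print(sorted(robots_directives(PAGE)))  # ['nofollow', 'noindex']
```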
SEO Beginners

SITEMAP.XML
WHAT ARE SITEMAPS?

 Tells search engines which pages are available
  for crawling.
 A Sitemap is an XML file that lists URLs for a
  site along with additional metadata about each
  URL:
     when it was last updated
     how often it usually changes

     how important it is, relative to other URLs in the site
SITEMAPS XML FORMAT
   The Sitemap must:
     Begin with an opening <urlset> tag and end with a
      closing </urlset> tag.
     Specify the namespace (protocol standard) within the
      <urlset> tag.
     Include a <url> entry for each URL, as a parent XML
      tag.
     Include a <loc> child entry for each <url> parent tag.
     All URLs in a Sitemap must be from a single host, such
      as www.example.com or store.example.com.
     Sitemap file must be UTF-8 encoded
     No more than 50,000 URLs
     File must not be larger than 50MB uncompressed
SAMPLE XML SITEMAP
   <?xml version="1.0" encoding="UTF-8"?>

   <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">

    <url>

      <loc>http://www.example.com/</loc>

      <lastmod>2005-01-01</lastmod>

      <changefreq>monthly</changefreq>

      <priority>0.8</priority>

    </url>

   </urlset>
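
Several of the format rules listed earlier can be checked automatically. The following Python sketch runs a few of them against the sample above using the standard library; the function name check_sitemap and the wording of the messages are assumptions, and the file size limit is not checked.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000

SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>"""


def check_sitemap(xml_bytes: bytes) -> list:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    try:
        xml_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]

    root = ET.fromstring(xml_bytes)
    if root.tag != f"{{{NS}}}urlset":
        return ["root element is not <urlset> in the Sitemap namespace"]

    urls = root.findall(f"{{{NS}}}url")
    if len(urls) > MAX_URLS:
        problems.append(f"{len(urls)} URLs exceeds the limit of {MAX_URLS}")

    hosts = set()
    for number, url in enumerate(urls, start=1):
        loc = url.find(f"{{{NS}}}loc")
        if loc is None or not (loc.text or "").strip():
            problems.append(f"<url> entry {number} has no <loc>")
            continue
        hosts.add(urlsplit(loc.text.strip()).netloc.lower())

    if len(hosts) > 1:
        problems.append("URLs span more than one host: " + ", ".join(sorted(hosts)))
    return problems


print(check_sitemap(SAMPLE))  # [] when all checks pass
```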
USING SITEMAP INDEX FILES (TO GROUP
MULTIPLE SITEMAP FILES)

   The Sitemap index file must:
       Begin with an opening <sitemapindex> tag and end with a
        closing </sitemapindex> tag.
       Include a <sitemap> entry for each Sitemap as a parent
        XML tag.
       Include a <loc> child entry for each <sitemap> parent tag.
       The optional <lastmod> tag is also available for Sitemap
        index files.
   Note: A Sitemap index file can only specify Sitemaps
    that are found on the same site as the Sitemap index
    file. For example,
    http://www.yoursite.com/sitemap_index.xml can include
    Sitemaps on http://www.yoursite.com but not on
    http://www.example.com or
    http://yourhost.yoursite.com.
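
A Sitemap index that follows the requirements above could be generated with a short Python sketch like the one below. It assumes Python 3.8 or later (for the xml_declaration argument); the function name build_sitemap_index and the same-host check are illustrative choices, not an official tool.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Serialize the Sitemap namespace as the default namespace (no prefix).
ET.register_namespace("", NS)


def build_sitemap_index(index_url: str, entries) -> bytes:
    """Build a Sitemap index as UTF-8 bytes.

    `entries` is an iterable of (loc, lastmod) pairs; lastmod may be None
    because the <lastmod> tag is optional.
    """
    index_host = urlsplit(index_url).netloc.lower()
    root = ET.Element(f"{{{NS}}}sitemapindex")
    for loc, lastmod in entries:
        # The index may only list Sitemaps found on the same site as the index.
        if urlsplit(loc).netloc.lower() != index_host:
            raise ValueError(f"{loc} is not on {index_host}")
        sitemap = ET.SubElement(root, f"{{{NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{NS}}}loc").text = loc
        if lastmod:
            ET.SubElement(sitemap, f"{{{NS}}}lastmod").text = lastmod
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


index_xml = build_sitemap_index(
    "http://www.example.com/sitemap_index.xml",
    [
        ("http://www.example.com/sitemap1.xml.gz", "2004-10-01T18:23:17+00:00"),
        ("http://www.example.com/sitemap2.xml.gz", None),
    ],
)
print(index_xml.decode("utf-8"))
```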
SAMPLE XML SITEMAP INDEX
   <?xml version="1.0" encoding="UTF-8"?>

   <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">

    <sitemap>

      <loc>http://www.example.com/sitemap1.xml.gz</loc>

      <lastmod>2004-10-01T18:23:17+00:00</lastmod>

    </sitemap>

    <sitemap>

      <loc>http://www.example.com/sitemap2.xml.gz</loc>

      <lastmod>2005-01-01</lastmod>

    </sitemap>

   </sitemapindex>
SITEMAP FILE LOCATION

   The location of a Sitemap file determines the
    set of URLs that can be included in that
    Sitemap. A Sitemap file located at
    http://example.com/catalog/sitemap.xml can
    include any URLs starting with
    http://example.com/catalog/ but cannot
    include URLs starting with
    http://example.com/images/.
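
The location rule above amounts to a prefix check on the URL. Here is a minimal Python sketch of that check; the function name in_sitemap_scope and the page URLs are assumptions used only to illustrate the rule.

```python
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url: str, page_url: str) -> bool:
    """Return True if page_url may be listed in the Sitemap at sitemap_url."""
    sitemap, page = urlsplit(sitemap_url), urlsplit(page_url)
    # The scheme and host must match exactly.
    if (sitemap.scheme, sitemap.netloc.lower()) != (page.scheme, page.netloc.lower()):
        return False
    # The directory that holds the Sitemap, e.g. "/catalog/".
    base_dir = sitemap.path.rsplit("/", 1)[0] + "/"
    return page.path.startswith(base_dir)


SITEMAP = "http://example.com/catalog/sitemap.xml"
print(in_sitemap_scope(SITEMAP, "http://example.com/catalog/show?item=1"))  # True
print(in_sitemap_scope(SITEMAP, "http://example.com/images/logo.png"))      # False
```

Placing the Sitemap in the site's root directory gives it the widest scope, since every path on the host then starts with "/".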
THANK YOU
                                   ADITYA TODAWAL
                         PROJECT COORDINATOR (SEO)
SEARCH RESULTS MEDIA – INTERNET MARKETING TORONTO
