< SEO for web developers />
          Universidad CEU San Pablo
             Madrid, 26 February 2013
                  Ruben Martinez




             Paradigma | Javahispano
What is SEO?

SEO is everything that helps a website generate more
           revenue from search engines.

       Technical SEO                 Off-page SEO
Why is technical SEO
            important?


1   Helps close the gap between web servers and
    search engines




2   Helps close the gap between search engines and
    websites
WWW search flow

User (understand)          Web developer (optimize)

Resources           Bottlenecks
World Wide Web      Connections
Search engines      Crawl, index, rank
Servers             Speed, availability
Site architecture   Structure
Page                Content relevance
Author/s            Content purpose

SEO deals with the bottlenecks
           in the information flow
What can SEO do for a
               web developer?
Save time and energy
Organize functionalities
Detect unknown bugs early
Mediate between the expectations of UX, design and web developers
How does an experienced SEO
      audit a web site?
    1 Crawl
    2 Filter
      $ head crawl.txt

      $ cut -f1,2 crawl.txt | sed -e 's#http://www\.{domain}\.{tld}##g' -e 's/\t/,/g' | grep -Ev '\.jpg|http:|\.css|\.js' > filtered.csv

      $ head -5 filtered.csv

    3 Visualize the network and analyze it (Gephi)
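
The crawl and filter steps above can also be sketched in Python. This is only a sketch under stated assumptions: the slide does not show crawl.txt, so it is assumed here to be tab-separated with the source URL in the first column and the linked URL in the second. SITE_PREFIX, filter_edges and to_gephi_csv are hypothetical names, and the Source,Target header is the column layout Gephi's spreadsheet import is expected to recognise for edge tables.

```python
import csv
import io

# Assumption: the site prefix to strip, standing in for http://www.{domain}.{tld}
SITE_PREFIX = "http://www.example.com"
# Static assets that are dropped, mirroring the grep -v step.
SKIPPED_SUFFIXES = (".jpg", ".css", ".js")


def filter_edges(lines, prefix=SITE_PREFIX):
    """Yield (source, target) path pairs for internal page-to-page links."""
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 2:
            continue  # skip malformed rows
        source, target = row[0], row[1]
        if not (source.startswith(prefix) and target.startswith(prefix)):
            continue  # external link: the shell version drops lines still containing http:
        source, target = source[len(prefix):], target[len(prefix):]
        if source.lower().endswith(SKIPPED_SUFFIXES):
            continue
        if target.lower().endswith(SKIPPED_SUFFIXES):
            continue
        yield source or "/", target or "/"


def to_gephi_csv(edges):
    """Serialize edges as CSV text with a Source,Target header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return out.getvalue()
```

Usage: pass an open crawl.txt file handle to filter_edges and write the result of to_gephi_csv to filtered.csv before importing it.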
Variables for audit
    Backlinks
    Targeted keywords
    Content inventory
    Site architecture
    Site health
    Engagement
Measurement         Tools
Backlinks           Ahrefs, Open Site Explorer
PageRank            Google Toolbar
Competition         AdWords Keyword Estimator Tool
Rankings            Google Webmaster Tools
Content inventory   Xenu, Screaming Frog
Duplicate content   Copyscape
Pages indexed       The "site:" operator on Google
Site architecture   Gephi
Server logs         Apache Log Viewer, Splunk
Crawler reports     Google Webmaster Tools
Engagement          Web analytics providers
Link graph
    Example: www.bigdataspain.org

    [Figure: link graph of the site. Labelled nodes: speakers.php,
    terms-and-conditions.pdf, /2012/program.php, program.pdf, program.php,
    en-index.php, venue.php, hashtag-traking-live.php]
What SEO should developers
        carry out?



             Content is king.
  Make sure that you have great content.
1 Findable content
  Reach out to publishers
  Upload your content
  Upload sitemaps to search engines

  Image XML sitemaps

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
      <url>
        <loc>http://example.com/sample.html</loc>
        <image:image>
          <image:loc>http://example.com/image.jpg</image:loc>
        </image:image>
        <image:image>
          <image:loc>http://example.com/photo.jpg</image:loc>
        </image:image>
      </url>
    </urlset>
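
An image sitemap like the one above is usually generated rather than written by hand. The following is a minimal sketch using Python's standard library. The build_image_sitemap name and the page-to-images mapping are assumptions made for illustration, not part of the talk.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages):
    """Build an image sitemap (as UTF-8 bytes) from {page_url: [image_url, ...]}."""
    # Serialize with the same prefixes as the slide: default namespace plus "image".
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    sitemap = build_image_sitemap({
        "http://example.com/sample.html": [
            "http://example.com/image.jpg",
            "http://example.com/photo.jpg",
        ],
    })
    print(sitemap.decode("utf-8"))
```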
2 Accessible content
  Host your content on an easy-to-reach, reliable server
  Design a simple site architecture

  Link your internal pages sensibly

  Fix broken links

  Keep the URL structure clean

  Avoid frames and Flash              $ w3m -dump "http://www.ft.com/" | less
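
The point of dumping a page as text is to see roughly what a text-only client gets out of the markup. A rough Python equivalent is sketched below. It is an illustration, not a model of how any search engine parses pages, and TextOnlyView and text_only are hypothetical names. It collects the text that is present in the HTML itself and lists the frame and plugin tags whose content lives elsewhere.

```python
from html.parser import HTMLParser


class TextOnlyView(HTMLParser):
    """Collect the text of a page and note embedded frames and plugins."""

    SKIPPED = {"script", "style"}                    # never rendered as text
    OPAQUE = {"frame", "iframe", "object", "embed"}  # content loaded from elsewhere

    def __init__(self):
        super().__init__()
        self.text = []
        self.opaque_tags = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        if tag in self.OPAQUE:
            self.opaque_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text.append(data.strip())


def text_only(html):
    """Return (visible text, list of frame/plugin tags found) for an HTML string."""
    view = TextOnlyView()
    view.feed(html)
    view.close()
    return " ".join(view.text), view.opaque_tags
```

A page whose main content sits inside an object or iframe tag yields little or no text here, which is the problem the slide warns about.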
3 Clear content
  Determine the canonical page.

  Pagination and canonicalization

        <link rel="canonical" href="http://www.example.com/article?story=abc&page=2"/>

        <link rel="prev" href="http://www.example.com/article?story=abc&page=1&sessionid=123" />

        <link rel="next" href="http://www.example.com/article?story=abc&page=3&sessionid=123" />
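
The tags above follow a pattern that a template helper can generate. The sketch below makes several assumptions: pagination_links is a hypothetical helper, the story and page parameter names are taken from the example URLs, and session IDs are left out of all three links as a simplification. Ampersands are written as &amp;amp;, the conventional escaped form inside HTML attributes.

```python
from html import escape
from urllib.parse import urlencode


def pagination_links(base_url, story, page, last_page):
    """Return canonical/prev/next <link> tags for one page of a paginated article."""

    def page_url(n):
        return f"{base_url}?{urlencode({'story': story, 'page': n})}"

    tags = [f'<link rel="canonical" href="{escape(page_url(page))}"/>']
    if page > 1:  # the first page has no predecessor
        tags.append(f'<link rel="prev" href="{escape(page_url(page - 1))}"/>')
    if page < last_page:  # the last page has no successor
        tags.append(f'<link rel="next" href="{escape(page_url(page + 1))}"/>')
    return tags


if __name__ == "__main__":
    for tag in pagination_links("http://www.example.com/article", "abc", 2, 3):
        print(tag)
```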
4 Controllable content
  Use robots.txt
  Block spammer and scraper bots
  Avoid cloaking
  Use the noindex and noarchive meta tags

  Submit URLs you want removed from Google’s index

  Monitor your site for hacked content

  Set the crawl rate of Googlebot
  Manage your PageRank budget
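
Before deploying robots.txt rules it helps to check what they actually allow. The sketch below uses Python's standard urllib.robotparser. The rules, the BadScraper user agent and the can_crawl helper are made up for illustration. Note that robots.txt is only honoured by well-behaved crawlers, so blocking abusive bots generally needs server-side measures as well.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block one scraper entirely, keep everyone out of /private/.
ROBOTS_TXT = """\
User-agent: BadScraper
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "BadScraper"):
        for path in ("/program.php", "/private/report.html"):
            url = "http://www.example.com" + path
            print(agent, path, can_crawl(ROBOTS_TXT, agent, url))
```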
5 Valuable content
  Write a content management protocol to deal with
  obsolete content:

  Minimise 404 errors and provide a useful 404 page.

  Learn the difference between 301 and 302 redirects, and prefer 301 redirects.

  Use the 410 HTTP status code in some cases for empty pages.
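
One way to make such a protocol explicit is a small lookup that decides which status a retired URL should return. This is a sketch, not a prescribed policy: status_for and its four arguments are hypothetical, and the mapping of cases to codes follows the standard meaning of each status (301 moved permanently, 302 temporary redirect, 410 gone, 404 not found).

```python
from http import HTTPStatus


def status_for(path, live, permanent_moves, temporary_moves, removed):
    """Return (status, redirect_target) for a requested path.

    live: set of paths that still exist
    permanent_moves / temporary_moves: dicts mapping old path -> new path
    removed: set of paths deleted on purpose with no replacement
    """
    if path in live:
        return HTTPStatus.OK, None
    if path in permanent_moves:
        return HTTPStatus.MOVED_PERMANENTLY, permanent_moves[path]  # 301
    if path in temporary_moves:
        return HTTPStatus.FOUND, temporary_moves[path]              # 302
    if path in removed:
        return HTTPStatus.GONE, None                                # 410
    return HTTPStatus.NOT_FOUND, None                               # 404


if __name__ == "__main__":
    print(status_for(
        "/2012/program.php",
        live={"/program.php"},
        permanent_moves={"/2012/program.php": "/program.php"},
        temporary_moves={},
        removed={"/old-offer.php"},
    ))
```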
6 Measurable content
  Get data from:


  Server logs                library(RCurl)  # provides getURL
                             log <- getURL("sftp://user:password@host:/path/to/apache/accesslog.log")

  Tag libraries

  Google Analytics

  Split tests (A/B tests)
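
Once the access log has been fetched, a common first measurement is which URLs a crawler requested and what status codes it received. The sketch below assumes the Apache "combined" log format. The crawler_hits name and the "Googlebot" token are illustrative choices, and matching on the user-agent string alone does not verify that a request really came from that crawler.

```python
import re
from collections import Counter

# Apache "combined" log format (assumed; adjust for a custom LogFormat).
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def crawler_hits(lines, bot_token="Googlebot"):
    """Count (path, status) pairs for requests whose user agent contains bot_token."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and bot_token in match["agent"]:
            hits[(match["path"], int(match["status"]))] += 1
    return hits
```

Usage: call crawler_hits on the lines of the access log and sort the counter by status to spot the 404s a crawler keeps hitting.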
Google Webmaster Tools
Crawl Errors
Crawl Status
Index Status
Search Queries
Advanced SEO

Setting URL parameters in Google Webmaster Tools


         Latent Dirichlet Allocation (LDA)


        International and multilingual SEO


      Traffic prediction and traffic valuation
Thank you
  Follow
@rubenmartinezs
@paradigmate
@javahispano
