SEO for the Semantic Web

From mihaigheza, 7 months ago

A brief history of SEO from WWW to RDF, Microformats and SPARQL.

CC Attribution-NonCommercial-ShareAlike License

Slideshow transcript

Slide 1: How do the machines know what Tasty Wheat tasted like? Mouse – The Matrix

Slide 2: Short SEO History
• Web1.0
• Web2.0
• Web3.0

Slide 3: Genesis
• A story of the Internet, by
• Solving the most important problems
• Greatly influenced by one man…

Slide 4: Tim Berners-Lee
“the World Wide Web is Berners-Lee's alone. He designed it. He loosed it on the world. And he more than anyone else has fought to keep it open, nonproprietary and free.”
Time Magazine, 1999

Slide 5: The Problem
• Where can I find the information?
“Our ineptitude in getting at the record is largely caused by the artificiality of the systems of indexing.”
The Atlantic Monthly, 1945

Slide 6: Archie, 1990
• Indexed file names and
• Returned results based on pattern matching
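To make the Archie bullets concrete, here is a minimal sketch of file-name search by pattern matching. It is not Archie's actual implementation: the index contents, the host names and the `search_index` helper are all hypothetical, and shell-style wildcards are assumed as the pattern syntax.

```python
from fnmatch import fnmatchcase

# Hypothetical index: file names mapped to the hosts that carry them.
FILE_INDEX = {
    "gcc-1.37.tar.Z": ["ftp.example.org"],
    "emacs-18.55.tar.Z": ["ftp.example.org", "ftp.example.net"],
    "README": ["ftp.example.net"],
}


def search_index(pattern, index=FILE_INDEX):
    """Return (file name, hosts) pairs whose name matches a shell-style pattern."""
    return sorted(
        (name, hosts) for name, hosts in index.items() if fnmatchcase(name, pattern)
    )


if __name__ == "__main__":
    for name, hosts in search_index("*.tar.Z"):
        print(name, "on", ", ".join(hosts))
```

Only the names are searched; nothing inside the files is read, which is the limitation the later slides build on.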

Slide 7: Web1.0

Slide 8: Web1.0
• Means HTML
• Is born in 1991, with the help of
• Tim Berners-Lee (TBL), who also founded
• WWW Consortium (W3C) at MIT, and also
• Created WWW Virtual Library – the 1st catalog

Slide 9: Yahoo Directory, 1994
• Vertical = categories... is like
• “Show me all the stuff and I’ll handle it”
• Manually indexed stuff, which was
• OK for starters, but…
• Websites quickly grew in number and
• Y! started charging money for one listing
• Increasingly more money...

Slide 11: WebCrawler, 1994
• First SE to fully search text
• Bought by AOL, then
• Sold to Excite, which
• Excite went bankrupt and
• WebCrawler ends up bought by InfoSpace

Slide 12: Other “Search Engines”
• 1994, reaches 60mil pages in ‘96
• 1995, bought by Overture, bought by Y!
• 1996, meta search, bought by Lycos
• 1997, bought by IAC/InterActiveCorp
• 1999, bought by Overture, meaning Y!

Slide 13: Shopping fun, right?

Slide 14: , 1998
• Open Directory Project
• Each listing is checked and certified by a volunteer
• The main source for Google Directory

Slide 15: Current State of Search Industry

Slide 16: Web1.0 Problems
• SE couldn’t understand text, so
• They said “why don’t you implement some meta tags (description & keywords) so we can get a glimpse of what you’re saying”
• The relevancy of a page with respect to a keyword was determined by a few factors, so
• It was very easy to abuse and spam, therefore
• Search Results had poor quality
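The meta-tag bullets above can be illustrated with a short sketch that reads the description and keywords tags from a page and scores it with a deliberately naive rule. The scoring rule, the sample pages and the helper names are assumptions made for illustration; no real search engine is claimed to have worked exactly this way.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects the content of the description and keywords meta tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("description", "keywords"):
            self.meta[name] = attributes.get("content") or ""


def extract_meta(html):
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


def naive_relevancy(html, keyword):
    """Hypothetical score: how often the keyword is listed in the keywords tag."""
    keywords = extract_meta(html).get("keywords", "")
    terms = [term.strip().lower() for term in keywords.split(",")]
    return terms.count(keyword.lower())


# Made-up sample pages: one honest, one stuffed with a repeated keyword.
HONEST_PAGE = (
    '<head><meta name="description" content="Pizza recipes">'
    '<meta name="keywords" content="pizza, recipes"></head>'
)
SPAM_PAGE = '<head><meta name="keywords" content="pizza, pizza, pizza, cheap, pizza"></head>'
```

With a rule this simple, the stuffed page outscores the honest one, which is the abuse the slide describes.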

Slide 17: Web2.0

Slide 18: Web2.0
• Is coined by... Tim O’Reilly, yet
• TBL later said that “web2.0” is a stupid, meaningless term and that he thought of it first in ’96 anyway

Slide 19: Web2.0 means
• which grew apart because of
• PageRank (1998) invented by
• Larry & Sergey who adapted the algo from
• An MIT professor who had developed
• A nasty mathematical formula for positioning keywords in a 3d space model based on the relevancy that one kw holds … whatever

Slide 20: PageRank actually means
• That a link is a vote and
• Not all links are created equal, so
• It matters who links to you
• Just like in our real life society
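The "link is a vote" idea can be sketched as a simplified PageRank computed by repeated redistribution of rank. This is a textbook-style approximation, not Google's production algorithm. The damping factor of 0.85, the iteration count and the four-page example graph are assumptions that do not come from the slides.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank: links maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: "a" and "b" each receive exactly one inbound link,
# but "a" is linked from the popular page "c" and "b" from the obscure "d".
EXAMPLE_LINKS = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["b"]}

if __name__ == "__main__":
    for page, score in sorted(pagerank(EXAMPLE_LINKS).items()):
        print(page, round(score, 3))
```

In the example graph, "a" ends up ranked above "b" although both have a single inbound link, because the page voting for "a" is itself highly ranked.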

Slide 21:
• Read the content of pages really well, just that
• Pages were crappy:
  – Non-standard coding
  – Ugly tech (like applets)
  – Senseless IA
• So Google said: “don’t do evil and try to nicely format the info, according to W3C standards” (remember TBL)

Slide 22: Enter the SEO

Slide 23: SEO
• Is a multitude of practices aimed at facilitating the indexing of pages by search engines
• Evolves as the ranking algorithm changes, and
• Of course, the algorithm is kept secret.

Slide 24: SEO actually means
Courtesy of Kelly Ishikawa

Slide 25: SEO actually means
• An on-going battle between bots & SEO guys
• Now 100+ factors influence ranking
• And I’d like to take the time to talk about each one of them in the following…

Slide 26: Just kidding

Slide 27: My SEO Cheat Sheet
• Consider:
  1. Page Titles
  2. URLs (mod_rewrite)
  3. Anchor Text
  4. Website Architecture (IA)
  5. Link Title & Alt Images
  6. Relevant content (text)
  7. Sitemap.xml
  8. Hosting
  9. Freshness
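As one concrete instance of the cheat sheet, items 7 and 9 can be sketched by generating a Sitemap.xml with a last-modified date per URL. The example URLs, dates and the `build_sitemap` helper are hypothetical. The namespace is the one defined by the sitemaps.org protocol, and optional elements such as `changefreq` and `priority` are left out of this sketch.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a sitemap document from (url, last-modified date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for location, last_modified in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = location
        # lastmod tells a crawler how fresh the page is (W3C date format).
        ET.SubElement(url, "lastmod").text = last_modified
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical pages with clean, rewritten URLs.
    print(build_sitemap([
        ("https://example.com/", "2008-02-15"),
        ("https://example.com/seo-cheat-sheet", "2008-02-10"),
    ]))
```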

Slide 28: Resources
• Matt Cutts Blog
• Mihai’s SEO Cheat Sheet :D

Slide 29: Web2.0 Problems
• © for pictures, articles, books, etc
• PPC fraud
• Privacy
• Search Engine SPAM
• Link bombing
• Paid links
• But more important...

Slide 30: Web2.0 Problems
• SE still don’t understand what the $#%@ you’re talking about
• Crawling a website’s interface to extract info is almost insane

Slide 31: Web3.0

Slide 32: Web3.0
• Means semantic web
• Attention migrates from syntax/formatting to semantics and
• Meta Data (data about the data) becomes...

Slide 33: Web3.0
• Resource Description Framework &
• Microformats

Slide 34: Resource Description Framework
• A data model, commonly serialized as XML (RDF/XML)
• RDF = Subject + Predicate + Object
• S + P + O creates a Triple which
• Can describe almost anything in the universe
• Triples are connectable (eg: FOAF)
• RDFa = XHTML + RDF (W3C compliant)
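The triple model above can be sketched without any RDF library by storing each statement as a (subject, predicate, object) tuple. The `ex:` identifiers, the sample people and both helper functions are hypothetical. The FOAF namespace and its `name` and `knows` properties are the real vocabulary terms.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Each statement is one (subject, predicate, object) triple.
TRIPLES = {
    ("ex:alice", FOAF + "name", "Alice"),
    ("ex:alice", FOAF + "knows", "ex:bob"),
    ("ex:bob", FOAF + "name", "Bob"),
    ("ex:bob", FOAF + "knows", "ex:carol"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every position that is not None."""
    return sorted(
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    )


def friends_of_friends(triples, person):
    """Connect triples: follow foaf:knows twice, starting from person."""
    knows = FOAF + "knows"
    direct = [o for _, _, o in match(triples, subject=person, predicate=knows)]
    return sorted(
        {o for friend in direct
         for _, _, o in match(triples, subject=friend, predicate=knows)}
    )
```

Because the object of one triple can be the subject of another, statements published by different sites chain together, which is what "connectable" means here.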

Slide 35: Microformats
• hCalendar
• hCard
• rel-tag
• VoteLinks
• XFN
• Geo
• hResume
• hReview
• etc
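To show what a microformat looks like to a machine, here is a rough sketch that pulls a few hCard properties out of ordinary HTML by their class names. The sample markup and the `HCardParser` class are hypothetical, only the `fn`, `org` and `tel` properties are handled, and the input is assumed to be well nested with no void elements such as `<br>`.

```python
from html.parser import HTMLParser


class HCardParser(HTMLParser):
    """Collects the text of elements whose class names an hCard property."""

    PROPERTIES = {"fn", "org", "tel"}

    def __init__(self):
        super().__init__()
        self.card = {}
        self._open = []  # one entry per open element: a property name or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        prop = next((c for c in classes if c in self.PROPERTIES), None)
        self._open.append(prop)

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        # Keep text only when the innermost open element is a known property.
        if self._open and self._open[-1] and data.strip():
            self.card[self._open[-1]] = data.strip()


def parse_hcard(html):
    parser = HCardParser()
    parser.feed(html)
    return parser.card


# Made-up contact details marked up as an hCard.
SAMPLE_HCARD = (
    '<div class="vcard"><span class="fn">Jane Doe</span>, '
    '<span class="org">Example Corp</span></div>'
)
```

The page still renders as normal text for people, while the class names give a crawler the meaning of each fragment.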

Slide 36: Case Study

Slide 37: SPARQL
• SPARQL Protocol and RDF Query Language
• Standardized on 15th Jan 08 (1 month ago) and
• Endorsed by?... TBL
“Trying to use the Semantic Web without SPARQL is like trying to use a relational database without SQL”
TBL

Slide 38: Potential
• With SPARQL you skip the presentation layer
• You can query ad-hoc any API, so
• You don’t need to crawl in advance, therefore
• Information will be as fresh as it gets

Slide 39: And possibilities
• Query: “I can has pizza?”
• Returns:
  – A friend of yours (XFN - Facebook)
  – has a colleague (FOAF - LinkedIN) who
  – said that they make good pizza (hReview - yelp) at
  – a restaurant nearby (geo – Gmaps)
  – Tip: U2 in concert today (hCalendar - upcoming)
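The pizza query is, in effect, a join across statements from several sources. The sketch below imitates that with a tiny pattern matcher written in plain Python rather than real SPARQL. The people, the restaurants, the predicate labels (`xfn:friend`, `foaf:knows`, `hreview:recommends`, `geo:near`) and the `solve` function are all made up for illustration, and terms starting with `?` play the role of query variables.

```python
def solve(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every triple pattern in turn."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable must keep the same value everywhere it appears.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from solve(rest, triples, candidate)


# Hypothetical statements, each imagined as coming from a different site.
TRIPLES = [
    ("me", "xfn:friend", "ana"),
    ("ana", "foaf:knows", "dan"),
    ("ana", "foaf:knows", "eve"),
    ("dan", "hreview:recommends", "luigis"),
    ("eve", "hreview:recommends", "farpizza"),
    ("luigis", "geo:near", "me"),
]

PIZZA_QUERY = [
    ("me", "xfn:friend", "?friend"),
    ("?friend", "foaf:knows", "?colleague"),
    ("?colleague", "hreview:recommends", "?place"),
    ("?place", "geo:near", "me"),
]

if __name__ == "__main__":
    for answer in solve(PIZZA_QUERY, TRIPLES):
        print(answer)
```

Only the chain that ends at a nearby restaurant survives; the recommendation for the distant place is dropped by the last pattern.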

Slide 40: Perhaps now we can see
• Why Social Networking Communities are worth so much, even though most of them don’t have a revenue model
  – Facebook
  – LinkedIN
  – Meebo
  – Beebo
  – Pipu...
• They/We are the databases of the future

Slide 42: Thanks!
“Most of the right choices in SEO come from asking: What’s the best thing for the user?”
Matt Cutts
Mihai Gheza
Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.