The Googlization of
search
Lars Iselid
Umeå UB 28 January 2014
It starts with an
information need.
Should you ask
someone who knows?
Do you know who you
should ask?
Should you call that
person?
Should you message
that person?
Should you mail that
person?
Should you snail mail
that person?
...but you want an
answer now!
Will you look it up in
your encyclopedia?
You'll google. It's faster.
It’s about the magic of
everything in one search
box
Your second brain:
Google in your pocket
If you don’t have the
“device” with you, let’s
ask the Google Monkey
Are my "serps" good
enough?
Serps = search engine
results pages
Not good enough?!
Why??
Why do I get these
results?
Personalization? What's
that?
Google tries to
understand what you
want by analyzing your
previous searches.
Ok, with personalization
I lose control, to
some extent...
...and with customization
I keep control?
Remember
personalization is like a
black box.
You may want to lose
that control if you get
better results?
How does Google rank my
serps?
The magic of the title-tag
<title> </title>
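To see what a search engine can read from the title tag, here is a minimal Python sketch that pulls the title text out of a page. The sample page is made up for the example, and how much weight Google gives the title is not something the code tries to show.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only keep text that appears between <title> and </title>.
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


# Hypothetical page, invented for this example.
sample_page = (
    "<html><head><title>Umeå University Library</title></head>"
    "<body><p>Welcome</p></body></html>"
)
print(extract_title(sample_page))
```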
The magic of the links to
the website
It's not just a
quantitative count of
links, it's a qualitative
count.
Who has linked and who has linked to that linking web site?

Royal Library of
Sweden

Umeå University
Library

Angry Libra...
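The idea of a qualitative link count can be sketched with a simplified PageRank-style iteration. This is a toy model, not Google's actual ranking. The site names come from the slide, but the link structure between them is invented for the example, and `damping` and `iterations` are assumed parameters.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Simplified PageRank-style scoring (a toy model).

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly over its links.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page without outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical link structure: who links to whom is an assumption.
links = {
    "Stockholm University Library": ["Royal Library of Sweden"],
    "Umeå University Library": ["Royal Library of Sweden"],
    "Angry Librarian Blog": ["Royal Library of Sweden"],
    "Royal Library of Sweden": ["Umeå University Library"],
    "Pirate Bay": ["Angry Librarian Blog"],
}
scores = rank_pages(links)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{score:.3f}  {page}")
```

In this made-up graph, Umeå University Library and Angry Librarian Blog each have exactly one incoming link. The first comes from a site that is itself well linked, the second from a site nobody links to, so the first ends up with the higher score.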
The magic of the text in
the incoming links
Could this be
manipulated?
Of course, we have link
farms, cloaking,
spamblogging etc., but
Google punishes site
owners who use these black
hat SEO techniques.
But if it's not on Google's
first serp, should I try the
second or the third or the
fourth?
And if it's not in Google
then it doesn't exist, right?
Let me tell you the story
of the invisible web and
the library's hidden
treasures
The invisible web
Pages and documents
the search engine spider
can't index or, for
some reason, won't index.
The spider finds pages
by following links. If the page has
no link from, for example,
the main site, the page
won't be indexed.
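A small sketch of why an unlinked page stays invisible: the crawler below only follows links, starting from the front page. The site map is hypothetical, and a real spider would fetch pages over HTTP rather than read them from a dictionary.

```python
from collections import deque


def crawl(site, start):
    """Return every page reachable from `start` by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in site.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical site: each page maps to the pages it links to.
site = {
    "/": ["/opening-hours", "/databases"],
    "/opening-hours": ["/"],
    "/databases": ["/databases/anthrosource"],
    "/databases/anthrosource": [],
    "/old-report.pdf": ["/"],  # links out, but no page links to it
}

indexed = crawl(site, "/")
orphans = set(site) - indexed
print("Indexed:", sorted(indexed))
print("Never found:", sorted(orphans))
```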
Sites behind passwords.
Sites not indexed
because of the robots.txt.
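As an illustration, Python's standard `urllib.robotparser` can show how a well-behaved spider reads robots.txt. The rules, the bot name `ExampleBot` and the `library.example` addresses are all made up for this sketch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every spider is asked to stay out of /private/.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

blocked = parser.can_fetch("ExampleBot", "https://library.example/private/report.html")
allowed = parser.can_fetch("ExampleBot", "https://library.example/opening-hours.html")
print("May fetch private report:", blocked)
print("May fetch opening hours:", allowed)
```

Note that robots.txt is a request to the spider, not access control. Pages that must stay closed still need a password.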
Web pages hidden in
databases. Not as big
a problem as before.
AnthroSource
Is it possible to have one
single search box to the
library's treasures?
Yes, they call it
Discovery Tools.
Some call it a Google for
libraries.
We just call it our library
search, though the
product is commercial
and called Primo.
Printed or electronic material
the library has paid for.

Free digital
material

Information about
material we don't have
access to, b...
Aleph
(Album)

Primo
Central
•  Medline
•  Web of
Science
•  Swepub
•  Gale
•  Encyclopedia
Britannica etc.

DiVA

SFX
Sea...
One thing is the fulltext...
...and another thing is
the metadata, information
describing the fulltext...
...we may have access
to.
Metadata in web pages
HTML
<meta name="keywords" content="umeå universitet, umeå, umea,
www.umu.se, forska, forskning, utbildning, samverkan, pr...
Dublin Core
<meta name="DC.Subject" content="umeå universitet, umeå, umea,
www.umu.se, forska, forskning, utbildning, samv...
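A minimal sketch of how such meta tags could be read by a program. The sample `<head>` is a shortened version of the tags on the slides, and treating every name that starts with `DC.` as Dublin Core is an assumption made for the example.

```python
from html.parser import HTMLParser


class MetaParser(HTMLParser):
    """Collects name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta[name] = content


# Shortened sample based on the slides.
sample_head = """
<head>
<meta name="keywords" content="umeå universitet, umeå, umea" />
<meta name="DC.Language" content="(SCHEME=ISO639-1) sv" />
<meta name="DC.Type" content="text" />
<title>Umeå universitet</title>
</head>
"""

parser = MetaParser()
parser.feed(sample_head)

keywords = [word.strip() for word in parser.meta["keywords"].split(",")]
dublin_core = {
    name: value for name, value in parser.meta.items() if name.startswith("DC.")
}
print(keywords)
print(dublin_core)
```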
Metadata in library
databases.
PMID- 23204569
OWN - NLM
STAT- In-Data-Review
DA - 20121203
IS - 0008-3194 (Print)
IS - 0008-3194 (Linking)
VI - 56
IP - 4...
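The tagged record above is easy to read by program, which is part of what makes library metadata structured. Here is a rough sketch of a parser for this kind of tag-and-value layout. The sample repeats fields from the slide. The exact spacing and the wrapped title line are assumptions about the layout, since the slide text has lost its original whitespace.

```python
import re
from collections import defaultdict

# A tag of 2-4 capital letters, optional padding, a hyphen, then the value.
TAG_LINE = re.compile(r"^([A-Z]{2,4})\s*-\s?(.*)$")


def parse_medline(text):
    """Parse one tagged record into a dict of tag -> list of values."""
    record = defaultdict(list)
    last_tag = None
    for line in text.splitlines():
        match = TAG_LINE.match(line)
        if match:
            last_tag, value = match.groups()
            record[last_tag].append(value.strip())
        elif line.startswith(" ") and last_tag:
            # Indented lines continue the previous field.
            record[last_tag][-1] += " " + line.strip()
    return dict(record)


sample_record = """\
PMID- 23204569
OWN - NLM
STAT- In-Data-Review
DA  - 20121203
IS  - 0008-3194 (Print)
IS  - 0008-3194 (Linking)
VI  - 56
IP  - 4
TI  - Management approaches to acute muscular strain and hematoma in National
      level soccer players: a report of two cases.
"""

record = parse_medline(sample_record)
print(record["PMID"], record["IS"])
print(record["TI"][0])
```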
<PubmedArticle>
<MedlineCitation Owner="NLM" Status="PubMed-notMEDLINE">
<PMID Version="1">23204569</PMID>
<DateCreated>
<...
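The same record in XML can be read with Python's standard `xml.etree.ElementTree`. The fragment below is trimmed from the record on the slide and closed so that it is well-formed. Which elements to pick out is a choice made for this sketch.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed fragment based on the record on the slide.
pubmed_xml = """\
<PubmedArticle>
  <MedlineCitation Owner="NLM" Status="PubMed-notMEDLINE">
    <PMID Version="1">23204569</PMID>
    <Article PubModel="Print">
      <Journal>
        <ISSN IssnType="Print">0008-3194</ISSN>
        <JournalIssue CitedMedium="Internet">
          <Volume>56</Volume>
          <Issue>4</Issue>
        </JournalIssue>
        <Title>The Journal of the Canadian Chiropractic Association</Title>
      </Journal>
    </Article>
  </MedlineCitation>
</PubmedArticle>
"""

article = ET.fromstring(pubmed_xml)
citation = {
    "pmid": article.findtext("MedlineCitation/PMID"),
    "issn": article.findtext(".//ISSN"),
    "journal": article.findtext(".//Journal/Title"),
    "volume": article.findtext(".//Volume"),
    "issue": article.findtext(".//Issue"),
}
print(citation)
```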
Normalized XML data in
Primo discovery tool
Dublin Core
MarcXML

Medline/PubMed XML

PNX records
<record xmlns="http://www.exlibrisgroup.com/xsd/primo/primo_nm_bib">
<control>
<sourcerecordid>000320104</sourcerecordid>
...
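A PNX record uses an XML namespace, so element lookups need a namespace mapping. Here is a minimal sketch with `xml.etree.ElementTree`. The fragment is trimmed from the record on the slide and closed so that it parses. The `pnx` prefix and the splitting of creators on ";" are assumptions made for the example, not part of the Primo product.

```python
import xml.etree.ElementTree as ET

# The namespace URI is taken from the record; the "pnx" prefix is our own label.
NS = {"pnx": "http://www.exlibrisgroup.com/xsd/primo/primo_nm_bib"}

# Trimmed, well-formed fragment based on the record on the slide.
pnx_xml = """\
<record xmlns="http://www.exlibrisgroup.com/xsd/primo/primo_nm_bib">
  <control>
    <sourcerecordid>000320104</sourcerecordid>
    <sourceformat>MARC21</sourceformat>
  </control>
  <display>
    <type>book</type>
    <title>Impressionism</title>
    <creator>Bomford, David ; White, Raymond ; Williams, Louise</creator>
    <language>eng</language>
  </display>
</record>
"""

record = ET.fromstring(pnx_xml)
title = record.findtext("pnx:display/pnx:title", namespaces=NS)
source_format = record.findtext("pnx:control/pnx:sourceformat", namespaces=NS)
creator_field = record.findtext("pnx:display/pnx:creator", default="", namespaces=NS)
creators = [name.strip() for name in creator_field.split(";") if name.strip()]

print(title, source_format)
print(creators)
```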
With one search box, the
library wants to make its
service easier, faster,
more valuable...
...than tortured serps
from Google.
Will we succeed?
We must.
But remember that
Google mostly finds web
pages and documents
while...
...the library finds books,
articles, dissertations in a
structured manner.
Yes, sometimes the
book is a PDF document
on the web or the article a
web page on a web site...
...and Google indexes
that.
You can read it because
you're CAS-connected, not
because it’s free.
It’s on the library web page also.
Still the library has
access to unique
material...
...and still Google and
libraries will complement
each other.
But while Google will
rely on algorithms
counting incoming links...
...the library will rely on
structured metadata.
When Google is good
enough...
...the library wants to be
better than good enough.
Dad!! There is nothing*
about this on the web...
*not good enough material
Have you tried the
library resources?
Zzzzzzz......

The googlization of search 2014

536 views

Published in: Education, Technology
  • should be because not becuse ;-)
  • Should be index not indexes, or?
  • Lars, great presentation! Only a typing error on slide 85 ;-) Oliver

The googlization of search 2014

  1. 1. The Googlization of search Lars Iselid Umeå UB 28 January 2014
  2. 2. It starts with an information need.
  3. 3. Should you ask someone who knows?
  4. 4. Do you know who you should ask?
  5. 5. Should you call that person?
  6. 6. Should you message that person?
  7. 7. Should you mail that person?
  8. 8. Should you snail mail that person?
  9. 9. ...but you want an answer now!
  10. 10. Will you look it up in your encyclopedia?
  11. 11. You'll google. It's faster.
  12. 12. It’s about the magic of everything in one search box
  13. 13. Your second brain: Google in your pocket
  14. 14. If you don’t have the “device” with you, let’s ask the Google Monkey
  15. 15. Are my "serps" good enough?
  16. 16. Serps = search engine results pages
  17. 17. Not good enough?! Why??
  18. 18. Why do I get these results?
  19. 19. Personalization? What's that?
  20. 20. Google tries to understand what you want by analyzing your previous searches.
  21. 21. Ok, with personalization I lose control, to some extent...
  22. 22. ...and with customization I keep control?
  23. 23. Remember personalization is like a black box.
  24. 24. You may want to lose that control if you get better results?
  25. 25. How does Google rank my serps?
  26. 26. The magic of the title-tag <title> </title>
  27. 27. The magic of the links to the website
  28. 28. It's not just a quantitative count of links, it's a qualitative count.
  29. 29. Who has linked and who has linked to that linking web site? Royal Library of Sweden; Umeå University Library; Angry Librarian Blog; Stockholm University Library; Pirate Bay
  30. 30. The magic of the text in the incoming links
  31. 31. Could this be manipulated?
  32. 32. Of course, we have link farms, cloaking, spamblogging etc., but Google punishes site owners who use these black hat SEO techniques.
  33. 33. But if it's not on Google's first serp, should I try the second or the third or the fourth?
  34. 34. And if it's not in Google then it doesn't exist, right?
  35. 35. Let me tell you the story of the invisible web and the library's hidden treasures
  36. 36. The invisible web
  37. 37. Pages and documents the search engine spider can't index or, for some reason, won't index.
  38. 38. The spider finds pages by following links. If the page has no link from, for example, the main site, the page won't be indexed.
  39. 39. Sites behind passwords.
  40. 40. Sites not indexed because of the robots.txt.
  41. 41. Web pages hidden in databases. Not as big a problem as before.
  42. 42. AnthroSource
  43. 43. Is it possible to have one single search box to the library's treasures?
  44. 44. Yes, they call it Discovery Tools.
  45. 45. Some call it a Google for libraries.
  46. 46. We just call it our library search, though the product is commercial and called Primo.
  47. 47. Printed or electronic material the library has paid for. Free digital material. Information about material we don't have access to, but can still request.
  48. 48. Aleph (Album) Primo Central •  Medline •  Web of Science •  Swepub •  Gale •  Encyclopedia Britannica etc. DiVA SFX Search
  49. 49. One thing is the fulltext...
  50. 50. ...and another thing is the metadata, information describing the fulltext...
  51. 51. ...we may have access to.
  52. 52. Metadata in web pages
  53. 53. HTML <meta name="keywords" content="umeå universitet, umeå, umea, www.umu.se, forska, forskning, utbildning, samverkan, program, kurs, läsa, plugga, studera, studier, distans, sommaruniversitet, campus, universitetsbibliotek, ub, högskoleprovet" /> <meta name="description" content="Umeå universitet är ett av Sveriges största lärosäten med drygt 36 000 studenter och 4000 anställda. Här finns internationellt väletablerad forskning och ett komplett utbud av utbildningar. Vårt campus utgör en inspirerande miljö som inbjuder till gränsöverskridande möten – mellan studenter, forskare, lärare och externa parter. Genom samverkan med andra samhällsaktörer bidrar vi till utveckling och stärker kvaliteten i forskning och utbildning." /> <title>Umeå universitet</title>
  54. 54. Dublin Core <meta name="DC.Subject" content="umeå universitet, umeå, umea, www.umu.se, forska, forskning, utbildning, samverkan, program, kurs, läsa, plugga, studera, studier, distans, sommaruniversitet, campus, universitetsbibliotek, ub, högskoleprovet" /> <meta name="DC.Language" content="(SCHEME=ISO639-1) sv" /> <meta name="DC.Type" content="text" /> <meta name="DC.Format" content="(SCHEME=IMT) text/html" /> <meta name="DC.Identifier" content="/" /> <meta name="DC.Rights" content="Copyright Umeå University 2011" /> <meta name="DC.Description" content="Umeå universitet är ett av Sveriges största lärosäten med drygt 36 000 studenter och 4000 anställda.
  55. 55. Metadata in library databases.
  56. 56. PMID- 23204569 OWN - NLM STAT- In-Data-Review DA - 20121203 IS - 0008-3194 (Print) IS - 0008-3194 (Linking) VI - 56 IP - 4 DP - 2012 Dec TI - Management approaches to acute muscular strain and hematoma in National level soccer players: a report of two cases. PG - 262-8 AB - OBJECTIVE: To detail the presentation of two elite female soccer players with right thigh pain that occurred during training. This article will outline the investigation, diagnosis... AD - Tutor, CMCC. FAU - Stainsby, Brynne E AU - Stainsby BE FAU - Piper, Steven L AU - Piper SL FAU - Gringmuth, Robert AU - Gringmuth R LA - eng PT - Journal Article PL - Canada TA - J Can Chiropr Assoc JT - The Journal of the Canadian Chiropractic Association JID - 7507184 EDAT- 2012/12/04 06:00 MHDA- 2012/12/04 06:00 CRDT- 2012/12/04 06:00 PST - ppublish SO - J Can Chiropr Assoc. 2012 Dec;56(4):262-8.
  57. 57. <PubmedArticle> <MedlineCitation Owner="NLM" Status="PubMed-notMEDLINE"> <PMID Version="1">23204569</PMID> <DateCreated> <Year>2012</Year> <Month>12</Month> <Day>03</Day> </DateCreated> <DateCompleted> <Year>2012</Year> <Month>12</Month> <Day>04</Day> </DateCompleted> <DateRevised> <Year>2013</Year> <Month>05</Month> <Day>30</Day> </DateRevised> <Article PubModel="Print"> <Journal> <ISSN IssnType="Print">0008-3194</ISSN> <JournalIssue CitedMedium="Internet"> <Volume>56</Volume> <Issue>4</Issue> <PubDate> <Year>2012</Year> <Month>Dec</Month> </PubDate> </JournalIssue> <Title>The Journal of the Canadian Chiropractic Association</Title> <ISOAbbreviation>J Can Chiropr Assoc</ISOAbbreviation> </Journal> <ArticleTitle>Management approaches to acute muscular strain and hematoma in National level soccer players: a report of two cases.</ArticleTitle>
  58. 58. Normalized XML data in Primo discovery tool
  59. 59. Dublin Core MarcXML Medline/PubMed XML PNX records
  60. 60. <record xmlns="http://www.exlibrisgroup.com/xsd/primo/primo_nm_bib"> <control> <sourcerecordid>000320104</sourcerecordid> <sourceid>UMUB_ALEPH</sourceid> <recordid>UMUB_ALEPH000320104</recordid> <originalsourceid>UME01</originalsourceid> <ilsapiid>UME01000320104</ilsapiid> <sourceformat>MARC21</sourceformat> <sourcesystem>Aleph</sourcesystem> </control> <display> <type>book</type> <title>Impressionism</title> <creator>Bomford, David ; White, Raymond ; Williams, Louise</creator> <contributor>National Gallery (Storbritannien)</contributor> <publisher>London : National Gallery in association with Yale University Pressc cop. 1990</publisher> <creationdate>1990</creationdate> <format>227 s. : ill. (vissa i färg) ; 27cm.</format> <identifier>$$CISBN$$V0-300-05036-4 (hft.) ;; $$CISBN$$V0-300-05035-6 (inb.) ;</identifier> <subject>London National Gallery Utst. 1990/91; Impressionism (Art) -- Exhibitions; Paintings, Impressionism; Impressionism -- Frankrike</subject> <language>eng</language> <relation>$$Cseries $$VArt in the making,</relation> <source>UMUB_ALEPH</source>
  61. 61. With one search box, the library wants to make its service easier, faster, more valuable...
  62. 62. ...than tortured serps from Google.
  63. 63. Will we succeed?
  64. 64. We must.
  65. 65. But remember that Google mostly finds web pages and documents while...
  66. 66. ...the library finds books, articles, dissertations in a structured manner.
  67. 67. Yes, sometimes the book is a PDF document on the web or the article a web page on a web site...
  68. 68. ...and Google indexes that.
  69. 69. You can read it because you're CAS-connected, not because it’s free.
  70. 70. It’s on the library web page also.
  71. 71. Still the library has access to unique material...
  72. 72. ...and still Google and libraries will complement each other.
  73. 73. But while Google will rely on algorithms counting incoming links...
  74. 74. ...the library will rely on structured metadata.
  75. 75. When Google is good enough...
  76. 76. ...the library wants to be better than good enough.
  77. 77. Dad!! There is nothing* about this on the web... *not good enough material
  78. 78. Have you tried the library resources?
  79. 79. Zzzzzzz......
