20130629 If you build it, will they visit [ala lita lightning talk]

LITA lightning talk at the 2013 ALA conference

Published in: Technology, News & Politics

  1. if you build it, will they visit? Frederick Zarndt, IFLA Newspapers Section, frederick@frederickzarndt.com; Alyssa Pacy, Cambridge Public Library, apacy@cambridgema.gov; Joanna DiPasquale, Vassar College Libraries, jdipasquale@vassar.edu
  2. there are lots of digital historic newspaper collections, some of them very big, all around the world
  3. digital historic newspaper collections (sizes as of Apr 2012 or Jun 2013, depending on the collection):

     | library | collection | ~size (pages) | dates |
     | National Library of Australia | Trove | 9,880,000 | 1803-1994 |
     | California Digital Newspaper Collection | CDNC | 540,000 | 1846-2012 |
     | National Library of Finland | Historical Newspaper Library | 2,000,000 | 1771-1919 |
     | Bibliotheque nationale de France | Gallica | 2,200,000 | 1814-1944 |
     | Koninklijke Bibliotheek | Historische Kranten | 5,000,000 | 1618-1995 |
     | National Library of New Zealand | Papers Past | 2,960,000 | 1839-1945 |
     | National Library of Norway | NBDigital Aviser | 12,000,000 | 1763-2012 |
     | Singapore National Library | Newspaper SG | 2,400,000 | 1831-2009 |
     | British Library | British Newspaper Archive | 6,912,000 | 1710-1965 |
     | Library of Congress | Chronicling America | 6,025,000 | 1836-1922 |
  4. traffic rankings and search results show that content in library digital collections dwells in Internet obscurity. Source: Frederick Zarndt, Apr 2012 IFLA International Newspapers Conference, Bibliotheque nationale de France, Paris. http://bit.ly/bnfnewspapers
  5. Gallipoli Campaign, April 1915 to January 1916, aka Battle of Gallipoli, Dardanelles Campaign, Battle of Çanakkale. the battle was big news. news stories are out-of-copyright in most places.
  6. search phrase: (battle OR campaign) AND (Gallipoli OR Dardanelles OR Çanakkale); date range 1-Jan-1915 to 31-Dec-1916. search modified as local search engines dictate.
  7. search results (from Apr 2012 or Jun 2013, depending on the collection):

     | collection | collection URL | ~size (pages) | number of results |
     | Trove | http://trove.nla.gov.au | 9,880,000 | 16,321 articles |
     | CDNC | http://cdnc.ucr.edu | 540,000 | 3 articles |
     | Historical Newspaper Library | http://www.nationallibrary.fi/ | 2,000,000 | 333 results |
     | Gallica | http://gallica.bnf.fr | 2,200,000 | 222 results |
     | Historische Kranten | http://kranten.kb.nl | 5,000,000 | 34,399 articles |
     | Papers Past | http://paperspast.natlib.govt.nz | 2,960,000 | 7,084 articles |
     | NBDigital Aviser | http://www.nb.no/aviser/ | 12,000,000 | 539 articles |
     | Newspaper SG | http://newspapers.nl.sg | 2,400,000 | 294 articles |
     | British Newspaper Archive | http://britishnewspaperarchive.com | 6,912,000 | 1,857 articles |
     | Chronicling America | http://chroniclingamerica.loc.gov | 6,025,000 | 104,503 hits |
  8. search phrase: (battle OR campaign) AND (Gallipoli OR Dardanelles OR Çanakkale), run on http://www.google.com/ , http://www.google.co.uk/ , http://www.google.com.au/ , http://www.google.co.nz/ and http://www.google.com.sg/ (Google advanced search no longer allows specific date ranges)
  9. search results for (battle OR campaign) AND (Gallipoli OR Dardanelles OR Çanakkale) on http://www.google.com/ , http://www.google.co.uk/ , http://www.google.com.au/ , http://www.google.co.nz/ and http://www.google.com.sg/ : IN THE 1st 100 GOOGLE RESULTS, NOT A SINGLE RESULT FROM A LIBRARY!
  10. maybe the search should be more focused?
  11. search phrase: (battle OR campaign) AND (Gallipoli OR Dardanelles OR Çanakkale); date range 1-Jan-1915 to 31-Dec-1916, run on http://news.google.com/ , http://news.google.co.uk/ , http://news.google.com.au/ , http://news.google.co.nz/ , http://news.google.com.sg/ , http://news.google.no/ , http://news.google.nl/ and http://news.google.fr/ (Google News advanced search does still allow specific date ranges)
  12. search results for the same phrase and date range on the same eight Google News sites: AGAIN IN THE 1st 100 GOOGLE RESULTS, NOT A SINGLE RESULT FROM A LIBRARY!
  13. the reason for poor search results is not that collections are intentionally obscured from web crawlers or indexing services. elephind demonstrates that digital newspaper collections are visible. in April 2012, articles from New Zealand’s Papers Past collection appeared in hit lists.
  14. search results: elephind indexes ONLY historical digital newspaper collections
  15. why? why are there so few (none! zero! nada! zip! zilch!) results from libraries in a google search?
  16. “if I look at the results of ... digitization projects, I find the shittiest websites on the planet. it’s like a gallery spent all its money buying art and then just stuck the paintings in supermarket bags and leaned them against the wall.” Nat Torkington, Nov 2011 address to the National and State Librarians of Australasia, Auckland. http://nathan.torkington.com/blog/2011/11/23/libraries-where-it-all-went-wrong/ why are there so few results from libraries in a google search? because, as Nat Torkington says, libraries spend their money on digitizing and acquiring digital content and then put the data in supermarket bags and lean it against the wall. in other words, libraries don’t give SEO proper attention.
  17. a simple SEO strategy to improve collection search visibility: robots.txt says to web crawlers “don’t index this” + sitemaps say to web crawlers “do index this”. More about robots.txt at http://en.wikipedia.org/wiki/Robots.txt ; more about sitemaps at http://www.sitemaps.org/ or http://en.wikipedia.org/wiki/Sitemaps
  18. Cambridge Public Library Historic Newspapers: upgraded robots.txt file and sitemap XML file in Dec 2012
  19. Cambridge Public Library Historic Newspapers: upgraded robots.txt file and sitemap XML file in Dec 2012
  20. Cambridge Public Library Historic Newspapers: organic search traffic before and after website SEO upgrade (upgraded robots.txt file and sitemap XML file in Dec 2012). Organic search results are listings on search engine results pages that appear because of their relevance to the search terms, as opposed to their being advertisements. In contrast, non-organic search results may include pay-per-click advertising.
  21. Vassar Newspaper Archives: upgraded robots.txt file and sitemap XML file in Dec 2012
  22. Vassar Newspaper Archives: visit duration (upgraded robots.txt file and sitemap XML file in Dec 2012)
  23. libraries spend a lot on digital content and far too little on publicity, presentation, and search engine optimization (SEO)
  24. ? Frederick Zarndt, IFLA Newspapers Section, frederick@frederickzarndt.com; Alyssa Pacy, Cambridge Public Library, apacy@cambridgema.gov; Joanna DiPasquale, Vassar College Libraries, jdipasquale@vassar.edu
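
The robots.txt and sitemap strategy from slide 17 can be sketched roughly as follows. This is a minimal illustration, not the configuration either library actually deployed: the site `newspapers.example.org`, the page URLs, and both robots.txt variants are hypothetical. It uses Python's standard `urllib.robotparser` to show how a crawler would read a blocking versus a crawler-friendly robots.txt, and `xml.etree.ElementTree` to build a bare-bones sitemap in the sitemaps.org format.

```python
from urllib.robotparser import RobotFileParser
import xml.etree.ElementTree as ET

# Hypothetical site and page URLs, for illustration only.
SITE = "https://newspapers.example.org"
PAGES = [f"{SITE}/issue/1915-04-26", f"{SITE}/issue/1915-04-27"]

# Hypothetical robots.txt that tells every crawler to stay out of the whole site.
BLOCKING = ["User-agent: *", "Disallow: /"]

# Hypothetical crawler-friendly robots.txt: only an admin area is off limits,
# and the sitemap location is advertised.
FRIENDLY = ["User-agent: *", "Disallow: /admin/", f"Sitemap: {SITE}/sitemap.xml"]


def crawlable(robots_lines, url, agent="*"):
    """Return True if the given robots.txt lines let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


def build_sitemap(urls):
    """Build a minimal sitemap (one <url><loc> entry per page) as a string."""
    namespace = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=namespace)
    for page_url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = page_url
    return ET.tostring(urlset, encoding="unicode")


blocked = crawlable(BLOCKING, PAGES[0])   # crawler is turned away
allowed = crawlable(FRIENDLY, PAGES[0])   # crawler may fetch the page
sitemap_xml = build_sitemap(PAGES)

print("blocking robots.txt allows crawl:", blocked)
print("friendly robots.txt allows crawl:", allowed)
print(sitemap_xml)
```

A real collection would generate the sitemap from its own catalogue of issue or page URLs and split it into several files once it grows past the per-file limits described at http://www.sitemaps.org/.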
