1. 24/10/2022 www.rba.co.uk 1
Google is not the only search tool
ARLG – ISG
Wednesday, 9th July 2014, CILIP, London
Presenter: Karen Blakeman
karen.blakeman@rba.co.uk, www.rba.co.uk
www.twitter.com/karenblakeman
Slides available at http://www.rba.co.uk/as/
Also available on authorSTREAM and Slideshare
This presentation is licensed under a Creative Commons Attribution License
2. All change!
Search engines - new algorithms, ranking and display,
personalisation
EU ruling on “right to be forgotten”, how much is being
censored/removed?
Free government and legal resources, official data and
statistics, open data
Social media
3. Things you need to know about Google search
Google personalises your search
Personalises search based on
– location
– device that you are using
– past search history
– past browsing activity
– activity in other areas of Google e.g. YouTube, blogs,
images
4. Private browsing - quickest way to “un-personalise” search
Chrome - New Incognito window Ctrl+Shift+N
Firefox Ctrl+Shift+P
Internet Explorer Ctrl+Shift+P
Opera Ctrl+Shift+N
Will not remove country/location personalisation
Not search engine specific, built into the browser
5. Things you need to know about Google search
Google automatically looks for variations on your search
terms and sometimes drops terms from your search
– Google may or may not tell you that it has ignored some of your
terms
– “..” around terms, phrases, names, titles of documents does not
always work
– To force an exact match and inclusion of a term prefix it with
‘intext:’
public transport intext:algal biofuels
– Use Verbatim for an exact match search
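The intext: tip above can be sketched as a small Python helper. This is only an illustration: the function names are hypothetical, it assumes that intext: applies to the single word that follows it (so each required word gets its own prefix), and it assumes the usual https://www.google.com/search?q=... URL form.

```python
from urllib.parse import urlencode


def with_required_words(base_query, required_words):
    """Append each required word to the query, prefixed with intext:.

    Assumption: intext: covers only the word directly after it,
    so every required word gets its own prefix.
    """
    parts = [base_query.strip()] if base_query.strip() else []
    parts += [f"intext:{word}" for word in required_words]
    return " ".join(parts)


def google_url(query):
    """Build a Google search URL for the query (assumed URL form)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = with_required_words("public transport", ["algal", "biofuels"])
print(query)
print(google_url(query))
```

Google may still vary or drop the unprefixed words, so the results are worth comparing with a Verbatim search.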
7. Google now showing missing search terms?
Not always shown – possibly still a live experiment?
8. Things you need to know about Google search
Google web search does not search everything it has in
its database
– two indexes: main, default index and the supplemental index
– supplemental index may contain less popular, unusual, specialist
material
– supplemental index comes into play when Google thinks your
search has returned too few results
– Verbatim and some advanced search commands seem to
trigger a search in the supplemental index
9. Things you need to know about Google search
Google changes its algorithms several hundred times a year
How Google makes improvements to its search algorithm -
YouTube https://www.youtube.com/watch?v=J5RZOU6vK4Q
10. Things you need to know about Google search
We are all Google’s lab rats
Just Testing: Google Users May See Up To A Dozen
Experiments
http://searchengineland.com/just-testing-google-searchers-may-see-up-to-a-dozen-experiments-141570
Mostly minor effects on search but sometimes
totally bizarre results
11. What I see on my screen will not be what
you see on your screen, will not be what
your colleagues see on theirs, will not be
what your users see.
12. Hummingbird
Not just an update but a completely new algorithm
Tries to make “sense” of your query and put it into context,
natural language queries
Uses search history, your location, what other people have
searched on and clicked on, device being used
Now difficult to predict how Google will handle your search and
how results will be displayed
Layout of results and menu options depend on type of search
13. EU - so called “right to be forgotten” ruling
Edition of Monday, January 19, 1998, page 23 - Newspaper - Lavanguardia.es
http://hemeroteca.lavanguardia.com/preview/1998/01/19/pagina-23/33842001/pdf.html
EU Court of Justice ruled that Google is a “data controller” under Data
Protection legislation and must remove links to information that is
“inadequate, irrelevant .... or excessive” from search results on a
person’s name.
14. Information is NOT removed from the web
Subject can apply to have links in search results that point to
specific information removed from the results
Not just Google – all search engines with an EU presence
Only applies to searches conducted in the EU + Norway,
Switzerland, Iceland and Liechtenstein
Not automatic – subject has to apply and request will be assessed
to see if the information is “inadequate, irrelevant or no longer
relevant, or excessive in relation to the purposes for which they
were processed.”
Google’s request form available at
https://support.google.com/legal/contact/lr_eudpa?product=websearch#
(Bing working on one)
15. How to get around it?
Google now removing results (and also adding back in
results) from searches in European country versions of
Google
Indicates on the results page if information has been
excluded
Google adds a removal statement to all results for searches
on personal names even if nothing has been removed (name
generally has to be within double quotes in the search for this
to happen)
Use non-European Google to see all results
e.g. Google.com, Google.ca - but will see country biased
results
18. Google menu options change depending on your search
19. Google rewrites page titles
Google's Matt Cutts: Why Google Will Ignore Your Page Title Tag & Write Its Own
http://searchengineland.com/googles-matt-cutts-look-title-match-query-190039
20. Bing does it as well
http://searchenginewatch.com/article/2352871/How-Bing-Chooses-Your-Webpage-Titles
26. Google gets it wrong yet again!
Google "Henry VIII wives": Jane Seymour reveals search engine's blind spots
http://www.slate.com/blogs/future_tense/2013/09/23/google_henry_viii_wives_jane_seymour_reveals_search_engine_s_blind_spots.html
Image courtesy of Will Oremus
30. Search commands that are still around
PDF for legislation, consultation documents, research
documents, government reports, industry papers
ppt or pptx for presentations, tracking down an expert on a topic
xls or xlsx for spreadsheets containing data
Use the advanced search screen or the filetype: command
"control of dogs (wales) bill" filetype:pdf
organ donation wales opt out filetype:ppt
organ donation wales opt out filetype:pptx
organ donation wales filetype:xls
organ donation wales filetype:xlsx
Combine with site command
organ donation filetype:xls site:nhs.uk
31. Search commands that are still around (2)
site: to search within a site or type of site
housing regeneration swansea site:wales.gov.uk
housing regeneration swansea site:gov.uk
Also site:ac.uk site:nhs.uk
Can exclude sites using -site:
housing regeneration swansea site:gov.uk
-site:wales.gov.uk
organ donation statistics wales -site:au
Does NOT search inside databases or protected areas
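The filetype:, site: and -site: commands can be combined mechanically. A minimal Python sketch follows; the function name and parameters are hypothetical, and it only builds the query strings rather than running the searches. It produces one query per file extension because, as in the examples above, ppt and pptx (or xls and xlsx) are searched separately.

```python
def build_queries(terms, filetypes=(), site=None, exclude_sites=()):
    """Return a list of search strings, one per file type.

    If no file types are given, a single query without filetype: is returned.
    """
    queries = []
    for filetype in list(filetypes) or [None]:
        parts = [terms]
        if filetype:
            parts.append(f"filetype:{filetype}")
        if site:
            parts.append(f"site:{site}")
        # Each excluded site gets its own -site: term
        parts.extend(f"-site:{excluded}" for excluded in exclude_sites)
        queries.append(" ".join(parts))
    return queries


# Spreadsheets on NHS sites, old and new Excel formats
for q in build_queries("organ donation", filetypes=["xls", "xlsx"], site="nhs.uk"):
    print(q)

# UK government sites, but not the Welsh Government site
for q in build_queries("housing regeneration swansea",
                       site="gov.uk", exclude_sites=["wales.gov.uk"]):
    print(q)
```

Each string can be pasted into the search box as is.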
32. Date
Restrict your results to information that has been published
within the last hour, day, week, month, year or your own date
range
Search tools, Any time and select an option
33. Bing/Yahoo
Yahoo now uses Bing’s database, commands and ranking algorithms
Yahoo Finance still available
No advanced search screen on Bing - use commands
List at Advanced Operator Reference
http://msdn.microsoft.com/en-us/library/ff795620.aspx
filetype: site:
AND, NOT, OR parentheses for complex Boolean searches
NEAR:n where n is a number, specifies that the terms must be within that
number of words of each other and in any order
banana NEAR:3 toffee
Date option only for US version
34. Bing http://www.bing.com/
Results seem to be more consumer/retail focused
– more ‘shopping’ than research
– results improve as soon as you start using the advanced search
commands
Sometimes more up to date than Google
– updates sites more frequently
– adds new sites more quickly
– useful if you are looking for information on a new company or
organisation
BUT interesting features and options available to US users only
– changing location and version of Bing does not always work
– using anonymous proxy does not always work
37. DuckDuckGo – http://duckduckgo.com/
Does not track, does not personalise, no EU presence so no
“right to be forgotten”
Results are a compilation of about 50 sources including
Wikipedia, Wolfram Alpha, Bing, Blekko and its own Web crawler
DuckDuckBot. “In partnership with Yandex”
Advanced search DuckDuckGo Syntax
http://help.duckduckgo.com/customer/portal/articles/300304
DuckDuckGo – silly name but a neat little search tool
http://www.rba.co.uk/wordpress/2011/11/07/duckduckgo-silly-name-but-a-neat-little-search-tool/
38. Millionshort http://millionshort.com
Million Short: unearthing information hidden in the dungeons of
Google’s results
– http://www.rba.co.uk/wordpress/2012/10/04/million-short-unearthing-stuff-hidden-in-the-dungeons-of-googles-results/
Uses Bing API plus other sources
Great for finding specialist articles that Google buries beyond reach
Removes top 10k sites from results - can change to top million, 100k,
1k, 100
Can add sites back in, can block sites
Can “Boost!” sites so that they always appear at the top
Can use site: and filetype: commands
Country versions give different results (under Manage Settings and
Country)
40. Yandex http://www.yandex.com/
– for filetype use mime:
diabetic retinopathy mime:pptx
– has an advanced search screen at
http://yandex.com/search/advanced
Blekko http://www.blekko.com/
Ask http://www.ask.com/
Teoma http://www.teoma.com/
– all three support filetype: and site:
52. Google Scholar
http://scholar.google.com/
“Google Scholar provides a simple way to broadly search for scholarly
literature. From one place, you can search across many disciplines and
sources: articles, theses, books, abstracts and court opinions, from
academic publishers, professional societies, online repositories,
universities and other web sites. Google Scholar helps you find relevant
work across the world of scholarly research”.
• Search all scholarly literature from one convenient place
• Explore related works, citations, authors, and publications
• Locate the complete document through your library or on the web
• Keep up with recent developments in any area of research
• Check who's citing your publications, create a public author profile
53. Google Scholar
Does not cover all key journals in all subjects – no source list,
but getting better
Top publications for subjects and languages under Metrics link
on home page or
http://scholar.google.co.uk/citations?view_op=top_venues&hl=en
Scholar indexes the full text but you may have to pay to view the
whole article
Groups different versions of an article together
54. Google Scholar
Does NOT use the publishers’ metadata
Date and author search looks in the area of the document where
those elements are usually found
Page numbers, part of an address, data item may be mistaken
for publication year
Sometimes gets the author wrong
Is MA Lib really the author?
55. Google Scholar for systematic reviews?
BMC Medical Informatics and Decision Making | Full text | Is the
coverage of google scholar enough to be used alone for systematic
reviews http://www.biomedcentral.com/1472-6947/13/7
No, Google Scholar Shouldn’t be Used Alone for Systematic Review
Searching | Laika's MedLibLog
http://laikaspoetnik.wordpress.com/2013/07/09/no-google-scholar-shouldnt-be-used-alone-for-systematic-review-searching/
BMC Medical Research Methodology | Full text | Google Scholar as
replacement for systematic literature searches: good relative recall and
precision are not enough
http://www.biomedcentral.com/1471-2288/13/131
56. Google Scholar advanced search commands
Use advanced search screen or commands as follows:
+ sign before a search term to force an exact match, for example +norne
“....” around phrases for example “environmental remediation”
intitle: to search for a single word in the title, for example intitle:zeolites
environmental remediation
allintitle: to search for all of your terms in the title, for example
allintitle:zeolites environmental remediation
author: to search on an author’s name, for example
zeolites environmental remediation author:rhodes
site: to limit your search to specific institution for example
marcellus shale site:psu.edu
Commands can be combined for a precise search, for example
author:wolford site:psu.edu allintitle:marcellus shale
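The combined search above can be assembled in a few lines of Python. This is a sketch only: the function name and parameters are hypothetical, and the https://scholar.google.com/scholar?q=... URL form is an assumption. allintitle: is placed last, mirroring the order used in the example above.

```python
from urllib.parse import urlencode


def scholar_query(words=None, author=None, site=None, all_in_title=None):
    """Combine Google Scholar commands into one search string."""
    parts = []
    if words:
        parts.append(words)
    if author:
        parts.append(f"author:{author}")
    if site:
        parts.append(f"site:{site}")
    if all_in_title:
        # allintitle: goes last, followed by all the title words
        parts.append("allintitle:" + " ".join(all_in_title))
    return " ".join(parts)


query = scholar_query(author="wolford", site="psu.edu",
                      all_in_title=["marcellus", "shale"])
print(query)

# Assumed URL form for running the search in a browser
print("https://scholar.google.com/scholar?" + urlencode({"q": query}))
```

Because Scholar does not use the publishers’ metadata, an author: search built this way can still miss or misattribute papers.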
57. Microsoft Academic Search
http://academic.research.microsoft.com/
Journal articles, pre-prints, post-prints, conference
proceedings, reports and white papers
Free to use but the full text of some papers can only be
viewed on payment of a fee to the original journal publisher
Author may have several different profiles and articles may
be assigned to wrong author
Sometimes very slow to load
60. Jeffrey Beall
List of Predatory Publishers 2014 | Scholarly Open Access
http://scholarlyoa.com/2014/01/02/list-of-predatory-publishers-2014/
61. Institutional repositories and open access
BASE - Bielefeld Academic Search Engine http://www.base-search.net/
CORE (COnnecting Repositories) http://core.kmi.open.ac.uk/search
DART-Europe E-theses Portal http://www.dart-europe.eu/basic-search.php
DOAJ: Directory of Open Access Journals http://www.doaj.org/doaj
Institutional Repository Search (IRS) http://irs.mimas.ac.uk/
Open DOAR http://opendoar.org/
RIAN - Pathways to Irish Research http://rian.ie
ROAR - Registry of Open Access Repositories http://roar.eprints.org/
OpenAIRE http://www.openaire.eu/
62. Specialist search tools for research information
A selection can be found at
http://www.rba.co.uk/search/links.shtml#research
ArXiv http://arxiv.org/
BioMed Central http://www.biomedcentral.com/
Chemistry Central http://www.chemistrycentral.com/
ChemSpider http://www.chemspider.com/
Deep Web Technologies
Mednar http://mednar.com/
Science.gov http://www.science.gov/
Science Research http://scienceresearch.com/
WorldWideScience http://worldwidescience.org/
63. Specialist search tools for research information
Europe PubMed Central http://europepmc.org/
Mendeley http://www.mendeley.com/
Open Biology http://rsob.royalsocietypublishing.org/
PhilPapers: Online Research in Philosophy http://philpapers.org/
PubMed Central http://www.ncbi.nlm.nih.gov/pmc/
SSRN (Social Science Research Network) http://www.ssrn.com/en/
TechXtra http://www.techxtra.ac.uk/
64. BBC News - Public libraries get online access to
research journals http://www.bbc.co.uk/news/education-25981183
For personal research, non-commercial use.
65. Public Library Initiative by PLS and ProQuest | Access to
Research http://www.accesstoresearch.org.uk/
List of participating libraries and publishers
Public Library Initiative by PLS and ProQuest | Access To
Research http://freetoviewjournals.pls.org.uk/
Search tool for the journals and articles covered by the
agreement.
List of journals covered by the agreement
Not only open access but also subscription journals/articles
Database can be searched and summaries displayed from
anywhere but articles can only be viewed and printed off on
library premises
67. Searching government sites
Departmental websites moving to www.gov.uk – older
material supposed to be archived
Aimed more at the general public rather than the serious
researcher
Navigation can be poor
Internal search options can be poor
Use Google and its advanced commands to search a site
– site:
– filetype:
68. UK Government Web Archive | The National Archives
http://www.nationalarchives.gov.uk/webarchive/
Browse by category or choose your organisation from an A-Z list
Choose the date of the archived version of the website you want
to view [Can be difficult to search]
69. UK Government Web Archive | The National Archives
http://www.nationalarchives.gov.uk/webarchive/
77. Official statistics and open data
UK National Statistics Publication Hub
– http://www.statistics.gov.uk/
Office for National Statistics
– http://www.ons.gov.uk/
data.gov.uk
– http://data.gov.uk/
Welsh Government | Statistics
– http://wales.gov.uk/statistics-and-research/
StatsWales
– http://statswales.wales.gov.uk/
Eurostat http://epp.eurostat.ec.europa.eu/
European Union - Open Data Portal
– http://open-data.europa.eu/open-data/
78. Tony Hirst OUseful.Info, the blog... Trying to find useful things
to do with emerging technologies in open education
http://blog.ouseful.info/
79. Chart and image gallery: 30+ free tools for data visualization and
analysis - Computerworld
http://www.computerworld.com/s/article/9214755/Chart_and_image_gallery_30_free_tools_for_data_visualization_and_analysis
81. Google Public Data Explorer
http://www.google.com/publicdata/
One of Google's best kept secrets!
Public data sets made available by Eurostat, World Bank, IMF,
CSO Ireland, OECD, ITU, some national statistics offices (but
not ONS), and many more.
Source and date updated given.
Charts and charting options can highlight oddities and missing
data
Look at the charts to see if there is a sudden change in the
trends.
82. Google Public Data Explorer Minimum Wage – something
is missing
Some countries are missing, e.g. Germany, because they don’t have a
minimum wage
84. Guardian Data Store http://www.guardian.co.uk/data
Data and analysis on topics that are in the news
Some data sets created from information obtained via FoI
Links to the original datasets are provided
86. Correlation does not mean causation
Per capita consumption of mozzarella cheese (US) correlates with Civil
engineering doctorates awarded (US)
http://tylervigen.com/view_correlation?id=3890
90. Keeping up to date
Inside Search http://insidesearch.blogspot.com/
Official Google Blog http://googleblog.blogspot.com/
SearchReSearch : http://searchresearch1.blogspot.co.uk/
Search Engine Land http://searchengineland.com/
Search Engine Watch http://searchenginewatch.com/
Search Engine Roundtable http://www.seroundtable.com/
Karen Blakeman’s Blog http://www.rba.co.uk/wordpress/
Phil Bradley's weblog http://philbradley.typepad.com/