Advanced Web Searching, IFEG, 3rd April 2012


Presentation on advanced search given to the Information for Energy Group (IFEG) Spring Symposium. Hosted by the Energy Institute, London, 3rd April 2012

Published in: Technology, Business
  • Advanced Internet Search Strategies 9 April 2012 (c) Karen Blakeman 2012

    1. 1. Advanced web searching
        Information for Energy Group (IFEG) Spring Seminar, Energy Institute, London, 3rd April 2012
        Karen Blakeman, RBA Information Services
        Slides are available at
        Twitter: @karenblakeman
        Balance: 100 kW Hydroelectric Turbine at Mapledurham.
        This presentation is licensed under a Creative Commons Attribution 3.0 License
    2. 2. Trends in search
        No longer straightforward text searching of web pages and documents
        Localisation
        Personalisation
        Social
        Mobile
    3. 3. General search tools
        Google - 91% of UK market, 67% US market
        Bing/Yahoo – Yahoo now uses Bing's database and search algorithms for web, image, video and news search
        Alternatives to the "big two"
        – DuckDuckGo
        – Blekko
        – Yandex.com
        Increasingly need to be aware of specialist tools
    4. 4. Sanity checking Google
        “if I hadn’t searched across more than Google for data on a small, new company that I was asked to research recently, I would have missed out on some very significant information that Google just wasn’t showing me.”
    5. 5. How Google started
        11 November 1998
        The Internet Archive www.archive.org
    6. 6. How was Google different?
        Links (citations) a major part of ordering search results
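The idea of ordering results by links (citations) can be illustrated with a toy calculation. The sketch below is a simplified, PageRank-style iteration and not Google's actual algorithm; the function name `link_rank`, the damping value and the four-page link graph are all assumptions made up for illustration.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Toy link-based ranking: a page scores higher when well-ranked pages link to it.

    links maps each page to the list of pages it links to (hypothetical data).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split equally between its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: pages a, b and d all "cite" c, and c cites a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = link_rank(links)
ordered = sorted(ranks, key=ranks.get, reverse=True)
print(ordered)
```

In this made-up graph the most-cited page comes out on top, and the page it links to comes second, which is the citation effect the slide refers to.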
    7. 7. Where is Google now?
        2001: Revenues $86,426 thousands, Net Income $10,964 thousands
        2011: Revenues $37,905 millions, Net Income $9,737 millions
        2011 – 96% of revenues are from advertising
        Google is mass market consumer oriented. Serious researchers wanting reliable, structured search are a minuscule fraction of their customer base.
    8. 8. New How People Spend Their Time Online – Stephen's Lighthouse
    9. 9. How Google organises and sorts information
        Has a primary index of higher "quality" documents and a secondary index.
        Only the primary index is searched when running straightforward searches.
        Secondary index comes into play with more complex searches and if a small number of results are found.
        “Dear Bing, We Have 10,000 Ranking Signals To Your 1,000. Love, Google”
        Over 200 “signals” and each may have over 50 variations
    10. 10. How Google ranks and organises your results
        Google personalizes and tailors your results depending on your location, computer/device, browser, past searches, what you have looked at in the past, your +1s, your Google+ account, what you had for breakfast... and anything else it can find by rummaging around in your Google dashboard
        To see what's in your dashboard log in to your Google account and go to
        See Google personalisation: web history isn’t the only problem
    11. 11. Google to Launch Third-Party Commenting Platform
    12. 12. Bing does it all as well! – "Adaptive search" – links with Facebook
        Sign out of accounts, clear cookies, switch off history
        DuckDuckGo – no tracking, no personalisation
        Private browsing/No-tracking options in browser
        Use Chrome Incognito (Chrome owned by Google!)
    13. 13. Google's new Privacy Policy
        “Our new Privacy Policy makes clear that, if you’re signed in, we may combine information you’ve provided from one service with information from other services. In short, we’ll treat you as a single user across all our products, which will mean a simpler, more intuitive Google experience.”
    14. 14. What I see on my screen for a search is not what you’ll see on yours.
    15. 15. Google totally loses the plot
        For 10 days in February 2011: coots = lions
        Google decides that coots are really lions
        Update on coots vs. lions
    16. 16. Coots = lions
    17. 17. Three search tricks
        These three techniques can change what Google (and other search engines) decides to give you and also the order of the results.
        Repeat important search terms
        – coots coots mating behaviour (found coots)
        Change the order of your terms
        – mating behaviour coots (found coots)
        Change one of your search terms
        – coots mating behaviour (found lions)
        – coots courtship behaviour (found coots)
        – coots mating ritual (found coots)
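The three tricks are mechanical enough to script when you want to try several reformulations of one search. This is a minimal sketch; the function name `query_variants` and its parameters are hypothetical, and it only builds the query strings, so it says nothing about what any search engine will return for them.

```python
def query_variants(query, key_term, substitutions):
    """Build reformulations of a query using the three tricks from the slide.

    query: the original search string
    key_term: the important term to repeat and to move
    substitutions: maps a term in the query to alternative terms to try
    """
    words = query.split()
    variants = []

    # Trick 1: repeat the important search term.
    variants.append(f"{key_term} {query}")

    # Trick 2: change the order of the terms (move the key term to the end).
    reordered = [w for w in words if w != key_term] + [key_term]
    variants.append(" ".join(reordered))

    # Trick 3: change one search term at a time.
    for old, alternatives in substitutions.items():
        for new in alternatives:
            variants.append(" ".join(new if w == old else w for w in words))

    return variants


variants = query_variants(
    "coots mating behaviour",
    "coots",
    {"mating": ["courtship"], "behaviour": ["ritual"]},
)
for v in variants:
    print(v)
```

Run on the coots example, this produces the same four alternative searches listed on the slide.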
    18. 18. Excluding pages containing words
        Want to exclude pages containing a term? Place a - (minus sign) before the term
        Use with care as may miss important material
        Excluding lions from our bizarre coots search
        coots mating behaviour -lions
        gave us:
    19. 19. Coots=lions was an extreme example of how Google can work
        We think Google was doing the following:
        - assumed a typing error or was running a mobile/smartphone predictive text algorithm (coots=cats)?
        - ran an automatic variation/synonym search on cats?
        - used a search frequency rule and found that lions mating behaviour was requested more than cats?
    20. 20. Dear Google, stop messing with my search
        Google no longer looks for all of your terms in a page
    21. 21. See what Google sees
        Hover over a result and a "preview" of the page should appear to the right together with a Cached link – this is Google's copy
    22. 22. “When you do a multi-term query on Google (even with quoted terms), the algorithm sometimes backs-off from hard ANDing all of the terms ... it’s clear that people will often write long queries (with anywhere from 5 to 10 terms) for which there are no results. Google will then selectively remove the terms that are the lowest frequency to give you some results (rather than none)....Soft AND is a way to reduce the overall frustration and give the searcher something to examine (and with luck, a chance to reformulate their query).” Dan Russell
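The "soft AND" back-off in the quote can be sketched in a few lines. This is a toy model under stated assumptions, not Google's implementation: the function name `soft_and_search`, the tiny document collection and the use of document frequency as "lowest frequency" are all invented for illustration.

```python
def soft_and_search(query_terms, documents):
    """Toy soft AND: require all terms, dropping the rarest term until something matches.

    documents maps a document id to its text (hypothetical data).
    Returns (matching document ids, terms actually used).
    """
    doc_words = {doc_id: set(text.lower().split()) for doc_id, text in documents.items()}
    terms = [t.lower() for t in query_terms]

    # How many documents contain each term (an assumed stand-in for "frequency").
    frequency = {t: sum(t in words for words in doc_words.values()) for t in terms}

    active = list(terms)
    while active:
        # Hard AND over the terms still in play.
        hits = [d for d, words in doc_words.items() if all(t in words for t in active)]
        if hits:
            return hits, active
        # No results: back off by removing the lowest-frequency term.
        active.remove(min(active, key=lambda t: frequency[t]))
    return [], []


documents = {
    "d1": "coots mating behaviour in spring",
    "d2": "lions mating behaviour",
    "d3": "coots nesting",
}
hits, used = soft_and_search(["coots", "mating", "behaviour", "Mapledurham"], documents)
print(hits, used)
```

Here no document contains all four terms, so the rarest one is silently dropped and the searcher gets results for a query they did not quite ask for, which is the behaviour the earlier slides complain about.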
    23. 23. Verbatim
        Forces Google to run an exact match search. Run your search first and then select Verbatim from the left hand menu on your results page
        Cannot be combined with time options in the side bar
        Google: Verbatim for exact match search
    24. 24. Google doing its own thing can be good
    25. 25. Search gets Social - Resistance is Futile
        Google's new(ish) social network Google Plus (Google+)
        Google is trying to force people to create a Google+ profile
        Search Plus Your World (SPYW), referred to as Search+, now available in and is the default. Gives priority to content from people in your Google+ network if you are signed in to your account.
        (And the next Google killer is….Google!)
    27. 27. [Not signed in to a Google account]
    28. 28. Signed in to Google account on Google.com
    30. 30. About 4,940,000 results (0.74 seconds) !!!
    31. 31. Google results side bar
        These help you focus your search
        "Everything" does NOT search everything
        Vary depending on type of search e.g. web, news, images
        Open up the "more" options to see everything
    32. 32. Google side bars
        Images, Videos, News, Books, Blogs
    33. 33. Google Images
        Similar images
    34. 34. Google Images
    35. 35. Google Images – use an existing image
        Click on the camera icon in the search box and then either enter the URL of an image or upload it
    36. 36. Flickr
        Flickr Creative Commons or advanced search screen
    37. 37. Images - other sources for Creative Commons and public domain images
        Wikimedia Commons (check the licence information towards the bottom of the page), e.g. - public domain
        Geograph Creative Commons 2.0
        Most of the images on US government web sites are public domain (but do check)
        NASA - public domain
    38. 38. Google Video
        Not the same as YouTube
    39. 39. Video
        Bing Videos
        YouTube
        Blinkx
    40. 40. Google News
    41. 41. Google News Archive
    42. 42. Silobreaker.com
    43. 43. Silobreaker.com
    44. 44. Google Books
    45. 45. Blogs
    46. 46. Related searches
    47. 47. Translated foreign pages for a different perspective
        Google suggests languages from context of search but you can choose your own
        Your search is translated and the results are translated into your language
    48. 48. Advanced search
        Use search commands or Advanced Search screen
    49. 49. Problems finding information on a particular site?
        Use Google's site: command
        Can combine with date options in side menu
    50. 50. Or if you are interested in UK academic reports
    51. 51. Looking for a particular type of information, for example statistics, research report, expert presentation?
        Use the filetype: command
        For statistics
        – world oil consumption filetype:xls
        – world oil consumption filetype:xlsx
        – world oil consumption filetype:xlsx OR filetype:xls
        For government, research, industry reports
        – UK oil consumption forecasts filetype:pdf
        For conference presentations or trying to locate an expert
        – renewable energy UK filetype:ppt
        – renewable energy UK filetype:pptx
        – renewable energy UK filetype:ppt site:ac.uk
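If you run the same kind of filetype: search regularly, the query strings can be assembled programmatically. A minimal sketch, assuming a hypothetical helper named `filetype_query`; it only builds the text you would paste into the search box and does not call any search engine.

```python
def filetype_query(terms, filetypes, site=None):
    """Build a search string restricted to one or more file types, optionally to one site.

    terms: the plain search terms
    filetypes: list of extensions, joined with OR when there is more than one
    site: optional domain for the site: command, e.g. "ac.uk"
    """
    parts = [terms]
    # Several formats are combined with OR, as in the xlsx/xls example on the slide.
    parts.append(" OR ".join(f"filetype:{ext}" for ext in filetypes))
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


statistics = filetype_query("world oil consumption", ["xlsx", "xls"])
presentations = filetype_query("renewable energy UK", ["ppt"], site="ac.uk")
print(statistics)
print(presentations)
```

Both outputs reproduce searches shown on the slide: one combining the two spreadsheet formats with OR, the other limiting presentations to UK academic sites.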
    52. 52. Numerical range search
        Anything to do with numbers
        Use advanced search screen or 1st number followed by two full stops followed by 2nd number followed by unit of measurement (if applicable)
        – Norway oil production forecasts 2012..2020
        – Norway oil production forecasts 2012..2020 filetype:xls OR filetype:xlsx
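The range syntax is easy to mistype, so a small helper can build it. This is a sketch only: `numeric_range_query` is a hypothetical name, and placing the unit after a space is an assumption, since the slide does not show an example with a unit.

```python
def numeric_range_query(terms, low, high, unit=""):
    """Build a numeric range search: first number, two full stops, second number, optional unit."""
    if low > high:
        raise ValueError("low must not exceed high")
    range_part = f"{low}..{high}"
    if unit:
        # Assumption: the unit follows the second number after a space.
        range_part += f" {unit}"
    return f"{terms} {range_part}"


forecast_query = numeric_range_query("Norway oil production forecasts", 2012, 2020)
print(forecast_query)
```

The result can be combined with other commands by appending them, for example adding `filetype:xls OR filetype:xlsx` as on the slide.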
    53. 53. Advanced commands continued
        inurl: for example inurl:"carbon capture" targets
        intitle: for example intitle:"carbon capture" targets
        asterisk (*) to search for terms separated by 1-5 words (may have to use quotation marks)
        – solar * panels
        – "solar * panels"
        Picks up solar PV panels, solar photovoltaic panels, solar water heating panels
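To see what "terms separated by 1-5 words" means in practice, the asterisk can be emulated with a regular expression over your own text. This is an emulation under stated assumptions, not how Google implements the wildcard; `wildcard_pattern` is a hypothetical helper, and a "word" here is simply any run of non-space characters.

```python
import re


def wildcard_pattern(phrase, min_words=1, max_words=5):
    """Compile a regex in which each * stands for min_words to max_words intervening words."""
    # Split on the asterisk and escape the literal parts of the phrase.
    parts = [re.escape(part.strip()) for part in phrase.split("*")]
    # A gap is whitespace followed by the allowed number of whole words.
    gap = r"\s+(?:\S+\s+){%d,%d}" % (min_words, max_words)
    return re.compile(r"\b" + gap.join(parts) + r"\b", re.IGNORECASE)


pattern = wildcard_pattern("solar * panels")
examples = [
    "solar PV panels",
    "solar photovoltaic panels",
    "solar water heating panels",
    "solar panels",
]
matches = [text for text in examples if pattern.search(text)]
print(matches)
```

The first three phrases match, as on the slide, while plain "solar panels" does not, because at least one word must sit between the two terms.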
    54. 54. Synonyms
        Google often looks for variations of your terms but you cannot rely on it always happening
        Use the tilde ~ before a term to look for what Google considers are synonyms
        – ~energy will pick up oil, fuel, gas, electricity
        No information/documentation on how synonyms are created
        Very general, consumer oriented rather than scientific
        Can be used with Verbatim
    55. 55. Google alternatives - Bing and Yahoo
        Yahoo now uses Bing's database and ranking
        Many of the Advanced Search commands are similar to Google’s, see Search Tools Summary and Comparison
        Some of the interesting developments and features are only available in the US version
        Results tend to be more consumer/retail focused unless using advanced search features
        Coverage not identical to Google’s - sometimes yields important unique content
        Sometimes more up to date than Google
    56. 56. DuckDuckGo – silly name but a neat little search tool
        No tracking, no “filter bubble”
        Commands
        – site:
        – filetype:
        – sort:date to sort by date (uses results from Blekko)
        Syntax and keyboard shortcuts at
    57. 57. Yandex.com
    58. 58. advanced search
    59. 59. Blekko
        For sorting by date (/date), searching for images (/images) and videos (/videos)
        Create your own slashtag to search your specified list of sites (similar to Google Custom Search Engines)
        wind turbine electricity generation /karenblakeman/renewable
        “Musings about librarianship: Using Blekko to search across thousands of library sites”
    60. 60. Blekko
        Cannot do filetype, inurl, intitle searches
        Drop down menu next to page in results list for
        – site search (or use /site), similar pages (or use /similar), inbound links to the page (or use /links)
    61. 61. Google Scholar
        Useful place to start your research or if you are looking for a specific paper but no source list, not comprehensive and omits many key scientific publications
        Both peer-reviewed and un-reviewed articles, pre-prints, institutional repositories, references to books, citations
        Does not use publishers’ metadata, author search unreliable, search on year of publication unreliable
        Sometimes does strange things with your search terms but you can still use + before a term to force exact match search
        Sometimes finds unique content
    63. 63. Authors encouraged to claim papers
    64. 64. Microsoft Academic Search
    65. 65. Microsoft Academic Search
    66. 66. Microsoft Academic Search
        Problems
        – coverage
        – sometimes gets the author completely wrong
        M Edwards should be Martin Edwards not Maria-Benedicta Edwards
        “Will the Real Scott Wilson Please Stand Up, Please Stand Up”
    67. 67. Mendeley.com
    68. 68. Mendeley.com
    69. 69. Google Public Data Explorer
    70. 70. Statistical Review of World Energy 2011 | BP
    71. 71. Energy Export Databrowser
    72. 72. GasTrends databrowser
    73. 73. Department of Energy and Climate Change
    74. 74. RESTATS
    75. 75. OFFSTATS
    76. 76. OFFSTATS
    77. 77. Priced industry and market research
        Aggregators may not be comprehensive – use aggregators as an index to see who is publishing on your topic, for example (go to Energy and Utilities)
        For emerging markets sites specialising in energy reports - just search on energy market research reports
    78. 78. LinkedIn.com
    79. 79. LinkedIn.com
    81. 81. Topsy.com
    82. 82. Icerocket.com
    83. 83. - create your own newspaper
    84. 84. – individual Twitterstream
    85. 85. – keyword
    86. 86. Create your own Google custom search engine
        For
        – regularly searched sites
        – selected sites on a subject or type of organisation
        Cannot include password protected sources or sites where you have to fill in a form to access the information
        Information on setting up a Google Custom Search Engine (CSE)
        Google's blog on custom search
    87. 87.
        1. Think about the type of information you are looking for – news, statistics, experts
        2. Get to know the options in Google's sidebar
        3. Get to know the advanced search commands, for example site: filetype: intitle: numeric range
        4. Get to know the alternative search tools and use them!
        5. Keep up to date with changes to search tools and in particular Google
    88. 88. Keeping up to date
        Inside Search Google Blog Scholar Blog Engine Land Engine Watch Black Belt-Sourcing/Recruiting Blakeman’s Blog Bradley’s weblog