Finding data: advanced search operators


Part of the Data and Multimedia Journalism module on the MA in Online Journalism at Birmingham City University

Published in: Education

  1. 1. MED7126 Data and Multimedia Journalism. Paul Bradshaw. Getting the data: advanced search tips
  2. 2. Search operators. Don’t ask for what you want: describe what you expect to find.
  3. 3. Imagine the data: text. What text will it contain? Where will that text be? What text will it not contain?
  4. 4. Specific references, not general. Specify a constituency, a school, an institution code, an invoice number, a piece of jargon.
  5. 5. Quotes: "disclosure log". Asterisk: "between * and 2014". Minus: "hate crime" -religion -"publication scheme". Number ranges: 2000..2014
  6. 6. ‘life expectancy Birmingham’
  7. 7. "life expectancy" "perry barr"
  8. 8. inurl:
  9. 9. inurl:foi inurl:ccg inurl:intranet inurl:search.asp inurl:search.php
  10. 10. intitle: allintitle:
  11. 11. intitle:foi; allintitle:disclosure log; intitle:"bank fines"
  12. 12. intext: allintext:
  13. 13. intext:"miserable failure"; allintext:miserable failure
  14. 14. "life expectancy" "perry barr"
  19. 19. Imagine the data: metadata. Where is it likely to be? What format? When was it not published?
  20. 20. site:
  21. 21. site:org disclosure
  22. 22. filetype:
  23. 23. filetype:xls filetype:xlsx filetype:pdf filetype:csv filetype:ppt filetype:doc filetype:docx filetype:xml
  24. 24. search tools
  25. 25. Combine operators: "disclosure log" allintitle:hate crime report filetype:pdf art inurl:search.asp -library
  28. 28. Sites that aren’t indexed. Some sites use the robots.txt protocol to tell search engines not to index them. Use DownThemAll to download the site and search it locally.
  29. 29. Do it now: Search for a piece of jargon in your field, on a particular type of site. Search for spreadsheets or PDFs mentioning an individual in your field.
  30. 30. Links: u:paulbradshaw/t:data +searchengine
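
The operators in the slides (quotes, minus, number ranges, site:, filetype:, inurl:, intitle:) can be combined into one query string. The sketch below is one possible way to assemble such a query in Python; it is not part of the original slides. The function names `build_query` and `search_url` are hypothetical, and the Google search URL with a `q` parameter is an assumption about where the query would be sent.

```python
from urllib.parse import urlencode


def _quote(text):
    """Wrap multi-word text in straight double quotes so it is treated as a phrase."""
    return f'"{text}"' if " " in text else text


def build_query(phrases=(), terms=(), exclude=(), site=None, filetype=None,
                inurl=None, intitle=None, year_range=None):
    """Assemble a search query from the operators covered in the slides.

    Hypothetical helper: the parameter names are illustrative only.
    """
    # Exact phrases are always quoted, e.g. "disclosure log".
    parts = [f'"{p}"' for p in phrases]
    # Plain keywords are passed through unchanged.
    parts.extend(terms)
    # Excluded words or phrases get a leading minus, e.g. -religion.
    parts.extend("-" + _quote(x) for x in exclude)
    # operator:value pairs, e.g. site:org or filetype:pdf.
    for operator, value in (("site", site), ("filetype", filetype),
                            ("inurl", inurl), ("intitle", intitle)):
        if value:
            parts.append(f"{operator}:{_quote(value)}")
    # Number range, e.g. 2000..2014.
    if year_range:
        start, end = year_range
        parts.append(f"{start}..{end}")
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """URL-encode the query; the base URL is an assumption, swap in another engine if needed."""
    return base + "?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_query(phrases=["hate crime"],
                        exclude=["religion", "publication scheme"],
                        filetype="pdf",
                        year_range=(2000, 2014))
    print(query)
    print(search_url(query))
```

Building the string from named parts keeps each operator visible, which makes it easier to drop or swap one operator at a time when a search returns too little or too much.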
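
For the slide on sites that aren’t indexed, it can help to check what a site's robots.txt asks crawlers to avoid before deciding to download and search it locally. The following is a minimal sketch using Python's standard `urllib.robotparser`; it is not from the original slides. The helper name `is_blocked`, the sample rules and the example.org URLs are all hypothetical, and the rules are parsed from a string so that no network request is made.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, url, agent="*"):
    """Return True if the given robots.txt text disallows `agent` from fetching `url`.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    # parse() accepts the file's lines, so the text can come from anywhere.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# Made-up robots.txt that hides an FOI folder from all crawlers.
SAMPLE = """\
User-agent: *
Disallow: /foi/
"""

if __name__ == "__main__":
    for url in ("https://example.org/foi/log.pdf", "https://example.org/news/"):
        status = "blocked" if is_blocked(SAMPLE, url) else "allowed"
        print(url, status)
```

A path that robots.txt blocks will usually be missing from search results, so operators such as site: or filetype: will not reach it; that is the case where downloading and searching locally becomes the fallback.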