2013 Tips and Tricks Track, Make it work like Google: Creating Search Indexes by David Haines

Users want one search box for everything, and it needs to have autocomplete functionality. This presentation covers how to create an attribute index table for searching across multiple tables and features in different databases, then covers techniques applications can use with the index table to implement autocomplete functionality.

Published in: Technology, News & Politics

  1. 1. Make it work like Google: Creating a search index. David Haines, Land Use GIS Manager, Land Use
  2. 2. “Make it Work like Google”
  3. 3. #1. Auto-complete. #2. One search field. #3. Without at least one of the above people will use something else. #4. Good results.
  4. 4. What we’re doing now at Boulder County Land Use…
  5. 5. There was this flood a few weeks ago…
  6. 6. Boulder County Land Use Dept. 9/17/2013
  7. 7. Pictometry 4/27/2011 Land Use Dept. 9/17/2013
  8. 8. Wasn’t Always this Way
  9. 9. Multiple Entry Fields; Different Types of Search
  10. 10. Too many databases Hay for the Winter by Trey Ratcliff http://www.flickr.com/photos/stuckincustoms/8672061349/ Attribution License
  11. 11. Here? Here? Here? Here? Here? Here? Here? Here? Here? Hay for the Winter by Trey Ratcliff http://www.flickr.com/photos/stuckincustoms/8672061349/ Attribution License
  12. 12. Address & Owner Database; Building Permit Database; Assessment Database; Place Name Database; Subdivision Database
  13. 13. Roll of Hay by Klearchos Kapoutsis http://www.flickr.com/photos/klearchos/3824322183/ Attribution-NonCommercial License
  14. 14. Address & Owner Database; Building Permit Database; Assessment Database; Place Name Database; Subdivision Database. Roll of Hay by Klearchos Kapoutsis http://www.flickr.com/photos/klearchos/3824322183/ Attribution-NonCommercial License
  15. 15. A Table to Search
  16. 16. What needs to be in the table? What to search for; where to go once you find it.
  17. 17. Search Index Table fields (Field*(ESRI): Description). SearchText: The search term. The “correct” answer. A perfect hit.
  18. 18. Search Index Table fields (Field*(ESRI): Description). SearchText: The search term. The “correct” answer. A perfect hit. Workspace: The database the search term is in.
  19. 19. Search Index Table fields (Field*(ESRI): Description). SearchText: The search term. The “correct” answer. A perfect hit. Workspace: The database the search term is in. FeatureClass: The table in the database the search term is in.
  20. 20. Search Index Table fields (Field*(ESRI): Description). SearchText: The search term. The “correct” answer. A perfect hit. Workspace: The database the search term is in. FeatureClass: The table in the database the search term is in. IdField: The field in the table the search term is in.
  21. 21. Search Index Table fields (Field*(ESRI): Description). SearchText: The search term. The “correct” answer. A perfect hit. Workspace: The database the search term is in. FeatureClass: The table in the database the search term is in. IdField: The field in the table the search term is in. Id: The record (or row) the search term is in. This is what the search should go to.
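The five fields on the slide above could be laid out as a single table. The following is a minimal sketch using SQLite from Python's standard library; the table name `search_index`, the index name, and the choice of SQLite are assumptions for illustration (the slides describe an ESRI geodatabase), while the column names come from the slide.

```python
import sqlite3

# Hypothetical table and index names; column names follow the slide.
SCHEMA = """
CREATE TABLE IF NOT EXISTS search_index (
    SearchText   TEXT NOT NULL,  -- the term users type
    Workspace    TEXT NOT NULL,  -- database holding the record
    FeatureClass TEXT NOT NULL,  -- table within that database
    IdField      TEXT NOT NULL,  -- key field in that table
    Id           TEXT NOT NULL   -- key value of the target record
);
CREATE INDEX IF NOT EXISTS idx_search_text
    ON search_index (SearchText COLLATE NOCASE);
"""


def create_index_table(conn):
    """Create the search index table and its lookup index."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_index_table(conn)

# Example row taken from the "1234 Main Street" slide.
conn.execute(
    "INSERT INTO search_index VALUES (?, ?, ?, ?, ?)",
    ("1234 S Main Street Boulder", "G:gis.sde", "dbo.parcel", "PIN", "MKE11171971"),
)

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(search_index)")]
row_count = conn.execute("SELECT COUNT(*) FROM search_index").fetchone()[0]
```

Keeping every searchable term in one narrow table is what lets a single search box cover several source databases.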
  22. 22. Search: 1234 Main Street
  23. 23. Search: 1234 Main Street. Match: SearchText: 1234 S Main Street Boulder; Workspace: G:gis.sde; FeatureClass: dbo.parcel; IdField: PIN; Id: MKE11171971
  24. 24. Search: 1234 Main Street. SearchText: 1234 S Main Street Boulder; Workspace: gisgisdata.sde; FeatureClass: dbo.parcel; IdField: PIN; Id: MKE11171971. Zoom!
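Once a search term matches an index row, the remaining fields say where to go. As a rough sketch, the row can be turned into a target (workspace, feature class, and a where clause on the id field) that a mapping application would then zoom to. The `IndexHit` type and `where_clause` helper are hypothetical names, and the zoom step itself is left out because it depends on the application.

```python
from typing import NamedTuple


class IndexHit(NamedTuple):
    """One row of the search index (hypothetical container type)."""
    search_text: str
    workspace: str
    feature_class: str
    id_field: str
    id: str


def where_clause(hit):
    """Build a SQL-style filter selecting the record the hit points to."""
    escaped = hit.id.replace("'", "''")  # escape single quotes in the value
    return f"{hit.id_field} = '{escaped}'"


# Values from the "1234 Main Street" slide.
hit = IndexHit("1234 S Main Street Boulder", "G:gis.sde", "dbo.parcel", "PIN", "MKE11171971")

# Everything an application needs to locate the feature.
target = (hit.workspace, hit.feature_class, where_clause(hit))
```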
  25. 25. Blue skies and silos by Matthew Rutledge http://www.flickr.com/photos/rutlo/3872475221/ Attribution-NonCommercial License Different Data Silos?
  26. 26. Boulder County Land Use Search Index. Theme: Example (Database). Parcel Number: 157416001234 (Assessor - CAMA); Address: 1234 S Main Street (Assessor - CAMA); Owner: John Doe (Assessor - CAMA); Tax Account: R01234567 (Assessor - CAMA); Building Permit: BP-13-0001 (Land Use - Accela); Docket Number: SPR-13-0010 (Land Use - Accela); Docket Name: Smith Residence (Land Use - Accela); Subdivision: Big Oak Meadows (GIS - ArcSDE); Mining Claim Name: Blue Bird Mine #2 (GIS - ArcSDE); Geographic Names: Longs Peak (GIS - ArcSDE)
  27. 27. Do the hard work to make it simple “Making something look simple is easy; making something simple to use is much harder — especially when the underlying systems are complex — but that’s what we should be doing.” https://www.gov.uk/designprinciples#fourth
  28. 28. Steps to update the index table: Load existing index.
  29. 29. Steps to update the index table: Load existing index; Load reference data (i.e. data you're searching).
  30. 30. Steps to update the index table: Load existing index; Load reference data (i.e. data you're searching); Find reference data not in index.
  31. 31. Steps to update the index table: Load existing index; Load reference data (i.e. data you're searching); Find reference data not in index; Add that data.
  32. 32. Steps to update the index table: Load existing index; Load reference data (i.e. data you're searching); Find reference data not in index; Add that data; Search for index data not in the reference data (deleted).
  33. 33. Steps to update the index table: Load existing index; Load reference data (i.e. data you're searching); Find reference data not in index; Add that data; Search for index data not in the reference data (deleted); Remove it.
  34. 34. Steps to update the index table: Load existing index; Load reference data (i.e. data you're searching); Find reference data not in index; Add that data; Search for index data not in the reference data (deleted); Remove it; Repeat for each dataset.
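The update steps above amount to two set differences per dataset. Here is a minimal sketch, assuming each row can be reduced to a hashable `(SearchText, Id)` tuple; the function name `sync_index` and the sample rows are illustrative, not taken from the presentation.

```python
def sync_index(index_rows, reference_rows):
    """Return (rows_to_add, rows_to_remove) for one dataset.

    index_rows:     rows currently in the search index for this dataset
    reference_rows: rows currently in the source (reference) data
    """
    existing = set(index_rows)
    current = set(reference_rows)
    to_add = current - existing      # in the reference data, not yet indexed
    to_remove = existing - current   # indexed, but deleted from the reference data
    return to_add, to_remove


# Hypothetical sample data: (SearchText, Id) pairs.
index_rows = [("1234 S Main Street", "A1"), ("Old Mill Road", "A2")]
reference_rows = [("1234 S Main Street", "A1"), ("Big Oak Meadows", "A3")]

to_add, to_remove = sync_index(index_rows, reference_rows)
```

Running this once per dataset covers the final "repeat for each dataset" step; only the differences are written, so unchanged rows are never touched.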
  35. 35. One Search Index Many Applications Swiss Army Knife by AJ Cann http://www.flickr.com/photos/ajc1/4663140532/ Attribution License
  36. 36. Now that you built your table… … optimize your applications
  37. 37. Optimizing wildcard searching. Try First: “OnlyWord%”. Try Second: “%OnlyWord%”. Search: “tree”. Results: “Tree View”, “Treetop”, “Green Tree”, “North Woodtree”.
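A single-word search following this slide might run the prefix pattern first and only then the contains pattern. The sketch below uses SQLite's `LIKE` (case-insensitive for ASCII by default) and assumes the second pass tops up the prefix matches rather than replacing them, which is one reading of the result order on the slide. The table and function names are hypothetical, and `%` or `_` in user input are not escaped here.

```python
import sqlite3


def search_one_word(conn, word, limit=10):
    """Prefix matches first, then contains matches, without duplicates."""
    sql = (
        "SELECT SearchText FROM search_index "
        "WHERE SearchText LIKE ? ORDER BY SearchText LIMIT ?"
    )
    results = []
    for pattern in (f"{word}%", f"%{word}%"):  # try first, then try second
        for (text,) in conn.execute(sql, (pattern, limit)):
            if text not in results:
                results.append(text)
        if len(results) >= limit:
            break  # enough hits; skip the slower pattern
    return results[:limit]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE search_index (SearchText TEXT)")
conn.executemany(
    "INSERT INTO search_index VALUES (?)",
    [("Green Tree",), ("North Woodtree",), ("Oak Lane",), ("Tree View",), ("Treetop",)],
)

results = search_one_word(conn, "tree")
# results == ["Tree View", "Treetop", "Green Tree", "North Woodtree"]
```

A pattern with no leading wildcard can generally use an index on the column, which is why it is worth trying before the contains search.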
  38. 38. Optimizing wildcard searching. Patterns in the order shown, alongside a “Faster” arrow: “Word1 Word2%”; “Word1% %Word2%”; “Word1% %Word2% %Word3% …”; “%Word1% %Word2%”; “%Word1% %Word2% %Word3% …”
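For multi-word searches, the patterns on this slide can be read as tiers to try in order. The sketch below generates those tiers for a query, where each tier is a list of `LIKE` patterns that must all match. The tier order and the use of the exact-phrase prefix for three or more words are assumptions, and `pattern_tiers` and `build_where` are hypothetical helper names.

```python
def pattern_tiers(query):
    """Return lists of LIKE patterns, ordered from most to least restrictive."""
    words = query.split()
    if not words:
        return []
    if len(words) == 1:
        return [[f"{words[0]}%"], [f"%{words[0]}%"]]
    contains = [f"%{w}%" for w in words]
    return [
        [" ".join(words) + "%"],          # "Word1 Word2%"
        [f"{words[0]}%"] + contains[1:],  # "Word1%" AND "%Word2%" ...
        contains,                         # "%Word1%" AND "%Word2%" ...
    ]


def build_where(tier, column="SearchText"):
    """Turn one tier into a parameterized WHERE clause."""
    return " AND ".join(f"{column} LIKE ?" for _ in tier)


tiers = pattern_tiers("main st")
clause = build_where(tiers[1])
```

An application would run the tiers in order and stop at the first one that returns enough rows, passing the tier's patterns as the query parameters.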
  39. 39. Score Your Results. The Levenshtein distance between two words is the minimum number of single-character edits (insertion, deletion, substitution) [including spaces] required to change one word into the other. Example: “1234 Ced” and “1234 N Cedar Brook Lane”. Source: en.wikipedia.org/wiki/Levenshtein_distance
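To make the scoring concrete, here is a small sketch of the standard dynamic-programming Levenshtein distance, used to rank candidate results against what the user typed. Lower-casing before comparison and the second candidate string are assumptions for illustration; the slides do not say how the scores are applied.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def rank(query, candidates):
    """Sort candidates so the closest match to the query comes first."""
    q = query.lower()
    return sorted(candidates, key=lambda c: levenshtein(q, c.lower()))


distance = levenshtein("1234 Ced", "1234 N Cedar Brook Lane")  # 15
# "1234 Cedar St" is a made-up candidate added for comparison.
ranked = rank("1234 Ced", ["1234 N Cedar Brook Lane", "1234 Cedar St"])
```

Because spaces count as characters, a short partial entry scores closer to shorter candidates, so the ranking favors results that need the fewest extra characters to complete.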
  40. 40. Statistics. Address searches: 85%; owner searches: 5%; parcel number searches: 5%. Average address search: 6.5 characters, 1.6 words (49% one word, 43% two words, 8% 3+ words).
  41. 41. Land Use
  42. 42. Pictometry 4/27/2011
  43. 43. Land Use Dept. 9/17/2013
