Search Systems

Published in: Technology



  • 1. Information Architecture: Search Systems
  • 2. Does your site need search? ▫ Does your site have enough content? ▫ Will search divert resources from navigation systems? ▫ Do you have the time and knowledge to optimize the search system? ▫ Are there alternatives? ▫ Will your users bother with search?
  • 3. Before you add a search system • Do not assume that a search engine alone will satisfy all users’ information needs • Search should be used in addition to well-structured navigation, not as a replacement for it
  • 4. You need a search system if… • You have too much content to browse, or the content warrants it ▫ E.g. course catalog, research site, large site like Microsoft, real estate site • The site has fragmented subsites – e.g. UB • The site is a learning tool – e.g. online web coding tutorials • The site is dynamic, like a newspaper where articles are archived and the only way to access them is to search
  • 5. Search System Anatomy • Indexing by the search engine (SE) • Web sites need to be optimized for search engines (SEO) • Spiders • What is indexed – URL, title, headings, keywords, content • Search interface • Boolean operators (AND, OR, NOT)
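As a rough illustration of indexing and Boolean operators, here is a minimal sketch of an inverted index over a few made-up documents. The document ids, the sample text, and the helper names (`build_index`, `search_and`, `search_or`, `search_not`) are assumptions for illustration only; real search engines also handle tokenization, fields, and ranking.

```python
from collections import defaultdict

# Hypothetical toy collection; ids and text are invented for this example.
DOCS = {
    1: "course catalog for web design",
    2: "web coding tutorials",
    3: "real estate listings and catalog",
}

def build_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search_and(index, a, b):
    """Documents containing both terms."""
    return index.get(a, set()) & index.get(b, set())

def search_or(index, a, b):
    """Documents containing either term."""
    return index.get(a, set()) | index.get(b, set())

def search_not(index, a, b):
    """Documents containing the first term but not the second."""
    return index.get(a, set()) - index.get(b, set())

INDEX = build_index(DOCS)
print(search_and(INDEX, "web", "catalog"))  # {1}
print(search_or(INDEX, "web", "catalog"))   # {1, 2, 3}
print(search_not(INDEX, "web", "catalog"))  # {2}
```

Set intersection, union, and difference correspond directly to AND, OR, and NOT, which is why set-based indexes are a common starting point.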
  • 6. The Retrieval Process [Diagram; labels: User, Search Interface, Query, Search Engine, Content, Results, Query Operations, Ranked Docs, Retrieved Docs, DB Manager Module, Text Database]
  • 7. Search Systems • Types of searches: ▫ Basic search (also known as “keyword search”) ▫ Advanced search: use of search refinement and metadata search • Search engines are the software applications that form the foundation of search systems
  • 8. Choosing what to search • You don’t have to index everything • If you conduct an inventory and analysis of your content, you should have a good idea of what content is “good” • Silos – staff directories, subsites, tech articles, books, etc. • Content components – title, author, etc.
  • 9. Search Zones • Subsets of the site that have been indexed separately. ▫ Example ▫ Amazon does a great job of this • Can be: content type, audience, role, topic, geography, chronology, department
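One way to picture search zones is a separate small index per zone, so a query can be limited to one subset of the site. The sketch below assumes hypothetical documents tagged with a `zone` field; the zone names and the `build_zone_indexes` and `search` helpers are illustrative, not taken from any particular product.

```python
from collections import defaultdict

# Hypothetical documents, each assigned to one zone.
DOCS = [
    {"id": 1, "zone": "books", "text": "information architecture for the web"},
    {"id": 2, "zone": "staff", "text": "web team directory"},
    {"id": 3, "zone": "books", "text": "managing gigabytes"},
]

def build_zone_indexes(docs):
    """Build one term -> ids index per zone, so zones are indexed separately."""
    zones = defaultdict(lambda: defaultdict(set))
    for doc in docs:
        for term in doc["text"].lower().split():
            zones[doc["zone"]][term].add(doc["id"])
    return zones

def search(zones, term, zone=None):
    """Search one zone if given, otherwise every zone; returns sorted ids."""
    term = term.lower()
    selected = [zone] if zone is not None else list(zones)
    hits = set()
    for name in selected:
        if name in zones:
            hits |= zones[name].get(term, set())
    return sorted(hits)

ZONES = build_zone_indexes(DOCS)
print(search(ZONES, "web"))                # [1, 2]
print(search(ZONES, "web", zone="books"))  # [1]
```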
  • 10. Types of Pages • Navigation pages – pages that help you browse a site • Destination pages – contain actual information • Want to make sure search results contain mostly destination pages
  • 11. Search Systems • Selecting content components to index ▫ Take advantage of the site structure ▫ Components to index: • Body • Image Link • Title • Image alt text • URL • Description • Site name • Keywords • Link • Remote anchor text
  • 12. Search Algorithms • There are many types of algorithms available. • The bottom line is to select the one that is appropriate for the type of search capabilities required by the user.
  • 13. [Diagram: taxonomy of retrieval models] • User Tasks: Retrieval (Adhoc, Filtering); Browsing • Classic Models: Boolean, Vector space, Probabilistic • Set Theoretic: Fuzzy, Extended Boolean • Algebraic: Generalized Vector, Lat. Semantic Index, Neural Networks • Probabilistic: Inference Network, Belief Network, Language Models • Structured Models: Non Overlapping Lists, Proximal nodes • Browsing: Flat, Structure Guided, Hypertext
  • 14. Pattern Matching Algorithms • Most common; match the string that the user entered • Depending on your users’ needs, you have to emphasize recall or precision • Recall = # relevant docs retrieved / # relevant docs in collection • Precision = # relevant docs retrieved / # total docs retrieved
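A short worked example may help with the two measures. The sketch below uses the standard definitions, in which recall divides by the number of relevant documents in the collection and precision divides by the number of documents retrieved. The function name `precision_recall` and the document ids are assumptions for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved and relevant doc ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant docs that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: 4 docs retrieved, 8 relevant docs exist, 2 overlap.
retrieved = {1, 2, 3, 4}
relevant = {2, 4, 5, 6, 7, 8, 9, 10}
print(precision_recall(retrieved, relevant))  # (0.5, 0.25)
```

Here half of what was retrieved is relevant (precision 0.5), but only a quarter of the relevant material was found (recall 0.25). Retrieving more documents tends to raise recall and lower precision.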
  • 15. Pattern Matching Algorithms • Automatic stemming – expands a term to include other terms that share the same root ▫ E.g. “word” gets you “words” and “wording” • No stemming – results contain just that word • Which to use depends on the content you are indexing, e.g. a course catalog
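To make stemming concrete, here is a deliberately crude suffix-stripping sketch. It is not the Porter stemmer or any production algorithm; the suffix list, the minimum stem length, and the helper names `crude_stem` and `expand_query` are assumptions chosen to keep the example small.

```python
# Very small, illustrative suffix list; real stemmers use many more rules.
SUFFIXES = ("ing", "ed", "s")

def crude_stem(word):
    """Strip one known suffix if at least three characters remain."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def expand_query(term, vocabulary):
    """Return every vocabulary word that shares the query term's stem."""
    root = crude_stem(term)
    return sorted(w for w in vocabulary if crude_stem(w) == root)

# Hypothetical index vocabulary.
VOCAB = ["course", "courses", "password", "word", "words", "wording"]
print(expand_query("word", VOCAB))     # ['word', 'wording', 'words']
print(expand_query("courses", VOCAB))  # ['course', 'courses']
```

In this sketch "password" is not returned for "word": it contains the string but has a different stem, which is the difference between stemming and plain substring matching.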
  • 16. Other Approaches • Document similarity – allowing user feedback (a “more like this” option) ▫ Can be done by re-querying without stopwords, or automatically based on metadata • Collaborative filtering ▫ Cited by ▫ Active bibliography (related docs) ▫ Users who viewed this document also viewed ▫ Similar documents based on text ▫ Related documents based on co-citation
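A “more like this” option based on text similarity can be sketched with term counts and cosine similarity, with stopwords dropped before comparing. This is one common approach, not necessarily the one any given engine uses; the stopword list, sample documents, and function names below are assumptions for illustration.

```python
import math
from collections import Counter

# Small illustrative stopword list.
STOPWORDS = {"the", "a", "of", "and", "to", "in", "for"}

def vectorize(text):
    """Count the non-stopword terms in a text."""
    return Counter(t for t in text.lower().split() if t not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def more_like_this(doc_id, docs, top_n=2):
    """Return ids of the most similar other documents, best first."""
    target = vectorize(docs[doc_id])
    scored = [
        (cosine(target, vectorize(text)), other)
        for other, text in docs.items()
        if other != doc_id
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for score, other in scored[:top_n] if score > 0]

# Hypothetical documents.
DOCS = {
    "a": "guide to search engine indexing",
    "b": "search engine ranking and indexing",
    "c": "real estate listings in buffalo",
    "d": "indexing a course catalog",
}
print(more_like_this("a", DOCS))  # ['b', 'd']
```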
  • 17. Query Builders • Tools that help search engine performance – invisible to users ▫ Spell-checkers – Google’s “did you mean” ▫ Phonetic tools – “sounds like” matches ▫ Stemming tools – results that share the same stem ▫ Natural language processing tools – e.g. recognizing “how to” questions ▫ Controlled vocabulary – include synonyms
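Two of these query builders can be sketched with the Python standard library: a spell-checker built on `difflib.get_close_matches`, and synonym expansion from a small controlled vocabulary. The vocabulary, the synonym table, the 0.75 cutoff, and the function names are assumptions for illustration; this is not how Google's feature is implemented.

```python
import difflib

# Hypothetical index vocabulary and synonym table.
VOCABULARY = ["catalog", "course", "tutorial", "estate", "search"]
SYNONYMS = {"course": ["class", "module"]}

def did_you_mean(term, vocabulary=VOCABULARY):
    """Suggest the closest known term, or None if the term is known or nothing is close."""
    if term in vocabulary:
        return None
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=0.75)
    return matches[0] if matches else None

def expand_with_synonyms(term, synonyms=SYNONYMS):
    """Return the term followed by any synonyms from the controlled vocabulary."""
    return [term] + synonyms.get(term, [])

print(did_you_mean("catalgo"))         # catalog
print(expand_with_synonyms("course"))  # ['course', 'class', 'module']
```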
  • 18. Presenting Results • What to display? ▫ Title ▫ Summary ▫ Relevance score ▫ Other parts of the structure of docs ▫ Depends on your audience – more or less info – give users the option to see ‘detailed’ results if they choose – descriptive vs representational • How many documents? ▫ Number of retrieved docs ▫ Number of results per page
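The “how many documents” question usually comes down to reporting the total number retrieved and slicing the list into pages. A minimal sketch, assuming a hypothetical `paginate` helper and a default of 10 results per page:

```python
import math

def paginate(results, page, per_page=10):
    """Return one page of results plus the totals a results page would display."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be at least 1")
    total = len(results)
    pages = math.ceil(total / per_page) if total else 0
    start = (page - 1) * per_page
    return {
        "total": total,  # number of retrieved docs
        "pages": pages,  # number of result pages
        "page": page,
        "items": results[start:start + per_page],
    }

# Hypothetical result list of 25 document ids.
results = list(range(1, 26))
page3 = paginate(results, page=3)
print(page3["total"], page3["pages"], page3["items"])  # 25 3 [21, 22, 23, 24, 25]
```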
  • 19. Listing Results • Sorting ▫ Alphabetically ▫ Chronologically • Ranking ▫ By relevance ▫ By popularity ▫ By users’ or experts’ ratings ▫ By pay-for-placement
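The sorting and ranking options differ only in the key used to order the same result set. The sketch below assumes a hypothetical `Result` record with invented titles, dates, relevance scores, and view counts, with view count standing in for popularity.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Result:
    title: str
    published: date
    relevance: float  # hypothetical score from the search engine
    views: int        # hypothetical popularity measure

# Invented sample results.
RESULTS = [
    Result("Search Zones", date(2012, 3, 1), 0.62, 540),
    Result("advanced search tips", date(2013, 7, 15), 0.91, 120),
    Result("Boolean Operators", date(2011, 1, 9), 0.40, 980),
]

def sort_alphabetically(results):
    return sorted(results, key=lambda r: r.title.lower())

def sort_chronologically(results, newest_first=True):
    return sorted(results, key=lambda r: r.published, reverse=newest_first)

def rank_by_relevance(results):
    return sorted(results, key=lambda r: r.relevance, reverse=True)

def rank_by_popularity(results):
    return sorted(results, key=lambda r: r.views, reverse=True)

print([r.title for r in sort_alphabetically(RESULTS)])
print([r.title for r in rank_by_relevance(RESULTS)])
```

Sorting is driven by a neutral attribute such as title or date, while ranking depends on a judgment of value such as relevance or popularity; the same list comes back in a different order each time.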
  • 20. Listing Results • Grouping results: clustering • Exporting results ▫ Print or email results ▫ Select a subset of results ▫ Save search • No single approach is perfect – combine approaches
  • 21. Search Interfaces • Factors that affect the interface design ▫ User’s searching expertise ▫ Type of results wanted ▫ Type of information being searched ▫ Amount of information being searched
  • 22. Search Interface • The box: Simple and clear ▫ Good for users that don’t want to learn more about the search mechanism ▫ Placement of search matters on a site ▫ Put close to main navigation or near top of page ▫ Don’t be creative with button label
  • 23. Advanced Search • Unveils search system functionality ▫ Field searching ▫ Date ranges ▫ Search zones • How often do you take advantage of these features?
  • 24. Supporting Revision • What to do when users don’t get what they want? ▫ Repeat the search (query box) on the results page ▫ Explain where results came from (what data was searched) ▫ Explain what the user did (restate query, filters, sort order) ▫ Integrate searching and browsing (product inventory)
  • 25. Search Systems • When users get stuck ▫ Way too many results: offer options to narrow the search ▫ Zero results: offer a means of revising the search; search tips; a means of browsing (e.g. site map); human contact if searching & browsing don’t work
  • 26. Search Systems • Commercial web site search available: ▫ Verity Ultraseek ▫ AltaVista ▫ Google ▫ …and many others
  • 27. Search Systems • Free search options: ▫ Adding Google search to your site:  ▫ Open source software:  Lucene: (Jakarta Project)  MG: (Managing Gigabytes)
  • 28. Discussion Questions • How has the search engine changed the way we use the web? • Where do you see it going in the future? • Search Engines – Pros / Cons • Articles