Improving Search Engines using Online Communities

    Improving Search Engines using Online Communities - Presentation Transcript

    1. Improving Search Engines using Online Communities: It takes an [Internet] village … Anatoliy Gruzd <agruzd2@uiuc.edu>, Research Forum, Graduate School of Library and Information Science, University of Illinois, Urbana-Champaign, IL, March 14, 2007
    2. Agenda
      • Common search problems
      • Online bookmarking - http://del.icio.us
      • Pilot Study
      • Future work
    3. Common search problems
      • The main drawback of all modern search engines is that they force the user to guess words that appear in all relevant documents and, at the same time, do not appear in non-relevant documents.
        1. A relevant page will not be retrieved if it does not contain the keywords that the user chose for searching.
        2. Even if the user’s search keywords are found inside a web page, it does not always mean that the page is relevant to the user.
    4. Query #1: weight loss [Figure: architecture of a typical search engine. Labels: User’s Query (“weight loss”), Web page (“weight loss ???”), Matching, Results.]
    5. Query #1: weight loss
      • http://www.paleofood.com/
        • Recipes are: grain-free, bean-free, potato-free, dairy-free, and sugar-free.
    6. Query #2: assignment about "human brain" for homeschooling. Retrieved page: an instructor’s blog for a Human Development class at The Evergreen State College. The page was retrieved because of two unrelated postings titled “Homeschoolers use selective socialization” and “Part Of Human Brain Functions Like A Digital Computer”.
    7. Agenda
      • Common search problems
      • Online bookmarking - http://del.icio.us
      • Pilot Study
      • Future work
    8.  
    9. username
    10. Common Tags for http://www.paleofood.com/
      • ethnic
      • evolutionary eating
      • food
      • allergies
      • german
      • naturopathic
      • primitivism
      • weight loss
    11. [Figure: the search engine architecture from slide 4 with Tags added. Labels: User’s Query (“weight loss”), Web page (“weight loss ???”), Tags, Matching, Results.]
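The idea behind slides 4 to 11 can be illustrated with a minimal sketch. It assumes a deliberately naive matcher (every query term must appear in the text being searched); the function names `tokenize` and `matches` are hypothetical, the page text is the snippet from slide 5, and the tags are those from slide 10. It is not the matching logic of any real search engine.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return set(text.lower().split())


def matches(query, searched_text):
    """Return True if every query term appears in the searched text."""
    return tokenize(query) <= tokenize(searched_text)


# Page snippet from slide 5 and common tags from slide 10.
page_text = ("Recipes are: grain-free, bean-free, potato-free, "
             "dairy-free, and sugar-free.")
page_tags = ["ethnic", "evolutionary eating", "food", "allergies",
             "german", "naturopathic", "primitivism", "weight loss"]

query = "weight loss"
keyword_hit = matches(query, page_text)        # query vs. page text only
tag_hit = matches(query, " ".join(page_tags))  # query vs. community tags

print("keyword match:", keyword_hit)
print("tag match:", tag_hit)
```

In this toy setting the page text never mentions the query terms, while the tags supplied by del.icio.us users do, which is the gap the slides point to.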
    12. Agenda
      • Common search problems
      • Online bookmarking - http://del.icio.us
      • Pilot Study
      • Future work
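    13. Pilot Study [Figure: the same user’s query is run through two systems. System A matches the query against the web page and returns Results A; System B matches the query against the tags and returns Results B.]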
    14. Pilot Study
      • Search engine
        • Indri, a cooperative effort between the University of Massachusetts and Carnegie Mellon University
      • Search queries
        • ~20-30 real user questions found on the Internet
      • Pilot dataset
        • 454 health-related web pages
    15. Pilot dataset: 454 health-related web pages
      • Source: http://dmoz.org (“The Open Directory Project is the largest, most comprehensive human-edited directory of the Web.”)
      • Started with ~64,000 URLs (from Top/Health/Conditions_and_Diseases) -> only 544 are bookmarked by del.icio.us users -> only 454 were accessible at the time of my experiment
      • Category counts listed on the slide:
        • /Digestive_Disorders: 23
        • /Respiratory_Disorde: 26
        • /Cardiovascular_Disorders: 35
        • /Endocrine_Disorders: 53
        • /Immune_Disorders/Immune_Deficiency: 54
        • /Cancer: 101
        • /Neurological_Disorders: 115
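The size of each filtering step on slide 15 is easier to see as a ratio. The short calculation below uses only the three counts given on the slide; note that 64,000 is an approximate figure there, so the first percentage is approximate too. The variable names are illustrative.

```python
# Counts taken from slide 15; the first one is approximate (~64,000).
total_urls = 64_000   # URLs under Top/Health/Conditions_and_Diseases
bookmarked = 544      # of those, bookmarked by del.icio.us users
accessible = 454      # of those, accessible at the time of the experiment

bookmarked_share = bookmarked / total_urls
accessible_share = accessible / bookmarked

print(f"bookmarked: {bookmarked_share:.2%} of the directory URLs")
print(f"accessible: {accessible_share:.1%} of the bookmarked URLs")
```

Under these figures, fewer than 1 in 100 directory URLs had been bookmarked, which is why the pilot dataset is small.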
    16. Noise in Tags
      • toread
      • todo
      • interesting
      • imported
      • safari_export
      • system:unfiled
      • .imported
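One simple way to handle such tags is a stop list applied before indexing. The sketch below assumes exactly that; the slides do not say how noise was handled in the pilot study, and the names `NOISE_TAGS` and `remove_noise` are hypothetical. The stop list contains only the examples from slide 16.

```python
# Noise tags listed on slide 16; a real stop list would likely be longer.
NOISE_TAGS = {
    "toread", "todo", "interesting", "imported",
    "safari_export", "system:unfiled", ".imported",
}


def remove_noise(tags):
    """Drop tags that appear in the stop list (case-insensitive)."""
    return [tag for tag in tags if tag.strip().lower() not in NOISE_TAGS]


bookmark_tags = ["food", "toread", "weight loss", "system:unfiled", "ToDo"]
print(remove_noise(bookmark_tags))
```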
    17. Compound tags
      • general health
      • computer software
      • cancer patients - support groups
      • high blood pressure
      • who i want to share with
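Compound tags raise the question of whether a tag should be indexed as one phrase or as separate terms. The sketch below shows one assumed option, splitting each tag into lowercase word tokens with a regular expression; the slides do not state which treatment was used, and `split_compound_tag` is a hypothetical helper.

```python
import re


def split_compound_tag(tag):
    """Split a tag into lowercase alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", tag.lower())


# Compound tags from slide 17.
for tag in ["high blood pressure",
            "cancer patients - support groups",
            "who i want to share with"]:
    print(tag, "->", split_compound_tag(tag))
```

Splitting helps a query such as "blood pressure" match the first tag, but it also turns a personal tag like "who i want to share with" into a handful of uninformative words, so splitting alone does not remove noise.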
    18. Tags-based Keywords-based
      • (+++) Neuroscience For Kids - Explore the nervous system
      • (+++)
      • (+++)
      • (---) / term "assignment"
      • (---) / term "brain [center]"
      • (+++) Neuroscience For Kids - Explore the nervous system
      Common tags: homeschool, human, medical, reference, education, cognitive, biology, psychology, anatomy [Figure labels: Web page -> Matching -> Results A (System A); Tags -> Matching -> Results B (System B)]
    19. Agenda
      • Common search problems
      • Online bookmarking - http://del.icio.us
      • Pilot Study
      • Future work
    20. Future work
      • Use a larger dataset
      • Compare results across different subject domains and genres
      • Explore ways to combine tags and keywords, and determine whether doing so improves the quality of results (if at all)
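As an illustration of what combining tags and keywords could look like, the sketch below scores a page as a weighted sum of two term-overlap scores, one against the page text and one against its tags. This is only one assumed scheme among many: the slides leave the combination method open, and `overlap`, `combined_score`, and the `tag_weight` parameter are hypothetical names.

```python
def overlap(query_terms, doc_terms):
    """Fraction of query terms that occur in doc_terms (0.0 to 1.0)."""
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def combined_score(query, page_text, tags, tag_weight=0.5):
    """Weighted sum of keyword overlap and tag overlap.

    tag_weight=0.0 reduces to keywords only; tag_weight=1.0 to tags only.
    """
    query_terms = set(query.lower().split())
    text_terms = set(page_text.lower().split())
    tag_terms = set(" ".join(tags).lower().split())
    keyword_part = overlap(query_terms, text_terms)
    tag_part = overlap(query_terms, tag_terms)
    return (1 - tag_weight) * keyword_part + tag_weight * tag_part


score = combined_score("weight loss", "grain-free recipes",
                       ["weight loss", "food"])
print(score)
```

Varying `tag_weight` between 0 and 1 gives a simple way to compare keyword-only, tag-only, and mixed rankings on the same queries.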

    Dalhousie University, Canada, 3 years ago
