Using Apache Solr

    Using Apache Solr - Presentation Transcript

    1. Full Text Search with Apache Solr Pittaya Sroilong pittaya@gmail.com
    2. Who am I?
    3. Solr?
    4. Not her!
    5. But a search server
    6. based on Lucene
    7. Lucene?
    8. Full-text search library
    9. 100% Java :-(
    10. Solr is based on Lucene
    11. XML/HTTP, JSON interface
    12. Open Source
    13. Shields us from using Java :-)
    14. Who uses Solr/Lucene?
    15. Who uses Solr/Lucene?
    16. What is our problem?
    17. How do we implement this?
    18. SELECT * FROM post WHERE topic LIKE '%aoi%' OR author LIKE '%aoi%' ORDER BY id DESC
    19. SELECT * FROM post WHERE (topic LIKE '%aoi%' OR author LIKE '%aoi%') OR (topic LIKE '%miyabi%' OR author LIKE '%miyabi%') ORDER BY id DESC
    20. Full table scan = Performance killer
    21. No search scoring
    22. RDBMS isn’t designed to do this
    23. Use the right tool!
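
The full-table-scan point can be checked with a small experiment. The sketch below is an illustration only: the slides do not say which RDBMS was used, so it assumes SQLite (via Python's standard `sqlite3` module) and a hypothetical `post` table with an index on `topic`. It asks the planner how it would run a leading-wildcard `LIKE` compared with an exact match.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

# Hypothetical schema mirroring the columns used in the slides' SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, topic TEXT, author TEXT)")
conn.execute("CREATE INDEX idx_post_topic ON post (topic)")

# A leading wildcard cannot use the index on topic, so every row is visited.
like_plan = query_plan(conn, "SELECT * FROM post WHERE topic LIKE ?", ("%aoi%",))

# An exact match can use the index.
exact_plan = query_plan(conn, "SELECT * FROM post WHERE topic = ?", ("aoi",))

print("LIKE  :", like_plan)
print("exact :", exact_plan)
```

The exact wording of the plan differs between SQLite versions, but the `LIKE` query should report a scan while the exact match reports an index search. Other databases have their own planners, so treat this as a sketch of the idea rather than a benchmark.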
    24. [Architecture diagram: an indexer updates the index in Solr (built on Lucene); the web app sends queries to Solr and receives results]
    25. 1
    26. Define schema.xml
    27. <field name="id" type="string" indexed="true" stored="true" /> <field name="fullname" type="string" indexed="true" stored="true" /> <field name="position" type="string" indexed="true" stored="true" /> <field name="tag" type="stringi" indexed="true" stored="true" multiValued="true" />
    28. 2
    29. Deploy on any J2EE container
    30. Tomcat, Jetty, etc.
    31. 3
    32. Index documents
    33. Document format <add><doc> <field name="id">555</field> <field name="fullname">Kaka</field> <field name="position">Midfielder</field> <field name="tag">AC Milan</field> <field name="tag">Brazil</field> </doc></add>
    34. Post to Solr http://<host>/solr/update
    35. Any language that can do HTTP POST
    36. PHP, Perl, Python
    37. cURL
    38. Commit <commit />
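
The indexing step above (build the document XML, POST it to the update URL, then commit) could look roughly like this in Python, using only the standard library. The helper names `build_add_xml` and `post_update` are made up for this sketch, and the update URL must be replaced with your own `http://<host>/solr/update`. Nothing is sent over the network unless `post_update` is called.

```python
import urllib.request
import xml.etree.ElementTree as ET

def build_add_xml(doc):
    """Build an <add><doc>...</doc></add> message from a dict.

    List values become repeated <field> elements (multiValued fields).
    """
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(item)
    return ET.tostring(add, encoding="utf-8")

def post_update(xml_bytes, update_url):
    """POST an XML update message to Solr (hypothetical helper, not called here)."""
    request = urllib.request.Request(
        update_url,
        data=xml_bytes,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status

# The same document as on the slide.
payload = build_add_xml({
    "id": 555,
    "fullname": "Kaka",
    "position": "Midfielder",
    "tag": ["AC Milan", "Brazil"],
})
commit_message = b"<commit/>"
print(payload.decode("utf-8"))
```

To index for real, call `post_update(payload, url)` followed by `post_update(commit_message, url)` against a running Solr instance.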
    39. 4
    40. Search
    41. Query from http://<host>/solr/select
    42. Use Solr query syntax
    43. http://<host>/solr/select?q=tag:madrid&start=0&rows=2&fl=fullname,position,tag
    44. Response in XML or JSON (configurable)
    45. <response> <result numFound="46" start="0"> <doc> <str name="fullname">Sergio Ramos</str> <str name="position">Defender</str> <str name="tag">Real Madrid</str> <str name="tag">Spain</str> </doc> <doc> <str name="fullname">Diego Forlan</str> <str name="position">Striker</str> <str name="tag">Atletico Madrid</str> <str name="tag">Uruguay</str> </doc> </result> </response>
    46. &wt=json
    47. { "result": { "numFound": 46, "start": 0, "docs" : [ { "fullname": "Sergio Ramos", "position": "Defender", "tag": ["Real Madrid", "Spain"] }, { "fullname": "Diego Forlan", "position": "Striker", "tag": ["Atletico Madrid", "Uruguay"] } ] } }
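
A possible client for the search step, again as a hedged sketch with the Python standard library. `build_select_url` and `extract_docs` are names invented here, and `http://localhost:8983/solr/select` is only an assumed address for a local install. The slide's JSON is simplified: an actual Solr JSON response nests the hits under a top-level `"response"` object (next to `"responseHeader"`), so the parser reads that key.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

def build_select_url(base_url, query, start=0, rows=10, fields=None, wt="json"):
    """Build a select URL; urlencode takes care of escaping the query string."""
    params = {"q": query, "start": start, "rows": rows, "wt": wt}
    if fields:
        params["fl"] = ",".join(fields)  # fl = list of stored fields to return
    return base_url + "?" + urlencode(params)

def extract_docs(body):
    """Return (numFound, docs) from a Solr JSON response body."""
    result = json.loads(body)["response"]
    return result["numFound"], result["docs"]

url = build_select_url(
    "http://localhost:8983/solr/select",  # assumed host and port
    "tag:madrid",
    rows=2,
    fields=["fullname", "position", "tag"],
)
print(url)

# Canned response shaped like real Solr output, so the sketch runs offline.
sample_body = json.dumps({
    "responseHeader": {"status": 0},
    "response": {
        "numFound": 46,
        "start": 0,
        "docs": [
            {"fullname": "Sergio Ramos", "position": "Defender",
             "tag": ["Real Madrid", "Spain"]},
            {"fullname": "Diego Forlan", "position": "Striker",
             "tag": ["Atletico Madrid", "Uruguay"]},
        ],
    },
})
num_found, docs = extract_docs(sample_body)
print(num_found, [d["fullname"] for d in docs])
```

Against a live server, the body would come from `urllib.request.urlopen(url).read()` instead of the canned string.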
    48. Query examples
    49. • David Pizzarro • Equiv: David OR Pizzarro • Default operator is “OR” (configurable) • Result: David Villa, David Pizzarro, Claudio Pizzarro, David Seaman
    50. • +David +tag:Roma • Equiv: David AND tag:Roma • Result: David Pizzarro
    51. • +David +position:(Striker OR Midfielder) • Result: David Villa, David Pizzarro
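
One practical detail when sending these queries over HTTP: in a URL query string a literal `+` is decoded as a space, so the `+` (required) operator has to be percent-encoded as `%2B`. A small sketch with Python's `urllib.parse`, where `encode_query` is a name chosen for this example:

```python
from urllib.parse import urlencode, parse_qs

# The three query examples from the slides.
QUERIES = [
    "David Pizzarro",
    "+David +tag:Roma",
    "+David +position:(Striker OR Midfielder)",
]

def encode_query(query):
    """Encode a Solr query as the q= parameter of a URL query string."""
    return urlencode({"q": query})

encoded = [encode_query(q) for q in QUERIES]
for raw, enc in zip(QUERIES, encoded):
    print(raw, "->", enc)
```

Decoding the result with `parse_qs` gives back the original query text, which is a quick way to confirm nothing was lost in transit.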
    52. Updating
    53. Post new document to http://<host>/solr/update
    54. Deleting
    55. <delete> <id>345</id> </delete>
    56. <delete> <query>tag:Brazil</query> </delete>
    57. <delete> <query>*:*</query> </delete>
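
The three delete messages above can be generated rather than typed by hand, which also escapes any XML-special characters in the query. This is a minimal sketch; `delete_by_id` and `delete_by_query` are hypothetical helper names, and the resulting strings would be POSTed to the same update URL as new documents, followed by a commit.

```python
import xml.etree.ElementTree as ET

def delete_by_id(doc_id):
    """Build a <delete><id>...</id></delete> message."""
    root = ET.Element("delete")
    ET.SubElement(root, "id").text = str(doc_id)
    return ET.tostring(root, encoding="unicode")

def delete_by_query(query):
    """Build a <delete><query>...</query></delete> message."""
    root = ET.Element("delete")
    ET.SubElement(root, "query").text = query
    return ET.tostring(root, encoding="unicode")

print(delete_by_id(345))
print(delete_by_query("tag:Brazil"))
print(delete_by_query("*:*"))  # matches every document: empties the index
```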
    58. Thai support
    59. fwdder.com
    60. Sharing forward mails
    61. Use a customized field type in schema.xml
    62. <fieldType name="html_th" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.HTMLStripStandardTokenizerFactory"/> <filter class="solr.ThaiWordFilterFactory" /> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> <filter class="solr.RemoveDuplicatesTokenFilterFactory"/> </analyzer> </fieldType>
    63. <field name="id" type="string" indexed="true" stored="true" /> <field name="title" type="html_th" indexed="true" stored="true" /> <field name="detail" type="html_th" indexed="true" stored="true" /> <field name="tag" type="stringi" indexed="true" stored="true" multiValued="true" /> <field name="userid" type="integer" indexed="false" stored="true" />
    64. Index analyzer
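
To give a feel for what an index analyzer chain does, here is a rough Python analogy of three of the stages in the `html_th` field type: strip HTML, drop stop words (ignoring case), and lowercase. It is not Solr code and leaves out Thai word segmentation, stemming, and duplicate removal. The stop-word set is a made-up stand-in for `stopwords.txt`.

```python
import re

# Hypothetical stand-in for the contents of stopwords.txt.
STOPWORDS = {"the", "a", "of"}

def analyze(html):
    """Toy analyzer: strip tags, tokenize, drop stop words, lowercase."""
    text = re.sub(r"<[^>]+>", " ", html)   # crude HTML stripping
    tokens = re.findall(r"\w+", text)      # crude tokenization on word characters
    kept = [t for t in tokens if t.lower() not in STOPWORDS]  # ignoreCase="true"
    return [t.lower() for t in kept]       # lowercase filter

tokens = analyze("<p>The Return of <b>Kaka</b></p>")
print(tokens)
```

Each stage consumes the token stream produced by the previous one, which is why the order of the `<filter>` elements in schema.xml matters.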
    65. Debugging
    66. &debugQuery=on
    67. Further reading • http://lucene.apache.org/solr/ • http://wiki.apache.org/solr • http://www.xml.com/pub/a/2006/08/09/solr-indexing-xml-with-lucene-andrest.html • http://lucene.apache.org/java/docs/scoring.html
    68. Q&A

    + pittayapittaya, 2 years ago

    intro to full text search solution, Apache Solr

    © All Rights Reserved
