Enterprise Search in Plone using Solr
Calvin Hendryx-Parker
Plone Conference 2010
Wednesday, October 27, 2010

  1. Enterprise Search in Plone using Solr
     Calvin Hendryx-Parker
     Plone Conference 2010
     Wednesday, October 27, 2010
  2. What is Solr?
     • Java Based
     • Full-Text Search
     • Web Services API
     • Standards Based Interfaces
     • Scalable
     • XML Configuration
     • Extensible
  3. Playing with Solr
     • Indexing
     • Query
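The two operations named on slide 3 can be sketched without a running server. This is a minimal illustration, not code from the talk: it builds the XML `<add>` message that Solr accepts on its `/update` endpoint and the URL for a query against `/select`. The base URL and the field names `id` and `title` are assumptions; real field names come from your Solr schema.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumption: a local Solr instance at its usual default address.
SOLR_URL = "http://localhost:8983/solr"


def build_add_message(doc):
    """Return Solr's XML <add> update message for one document.

    `doc` maps field names to values; the names must exist in the schema.
    """
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_select_url(query, rows=10):
    """Return the URL for a query against Solr's /select handler."""
    return "%s/select?%s" % (SOLR_URL, urlencode({"q": query, "rows": rows}))


# Hypothetical document and query, for illustration only.
add_xml = build_add_message({"id": "doc-1", "title": "Plone and Solr"})
select_url = build_select_url("title:plone")
print(add_xml)
print(select_url)
```

The message would be sent with an HTTP POST to `/update`, followed by a `<commit/>` message so the new document becomes searchable.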
  4. [No text on slide]
  5. [No text on slide]
  6. Solr Features
     • Data Schema
     • Faceted Search
     • Administrative Interface
     • Incremental Updates
     • Supports Sharding
     • Index Databases, Local Files and Web Pages
     • Supports Multiple Indexes
  7. Solr Features
     • Stopwords
     • Synonyms
     • Highlighted Context Snippets
     • Spelling Suggestions
     • More Like This Suggestions
     • Supports Rich Documents
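Two of the features listed on slides 6 and 7, faceted search and highlighted context snippets, are switched on through request parameters. The sketch below only assembles the query string; `facet`, `facet.field`, `hl` and `hl.fl` are standard Solr parameters, while the helper name and the field names `portal_type`, `Subject` and `Description` are assumptions chosen to look like Plone catalog fields.

```python
from urllib.parse import urlencode


def build_feature_params(query, facet_fields=(), highlight_fields=()):
    """Return a Solr query string with faceting and highlighting enabled.

    Hypothetical helper: each facet field becomes its own facet.field
    parameter, and the highlight fields are joined into a single hl.fl.
    """
    params = [("q", query)]
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", name) for name in facet_fields)
    if highlight_fields:
        params.append(("hl", "true"))
        params.append(("hl.fl", ",".join(highlight_fields)))
    return urlencode(params)


query_string = build_feature_params(
    "solr",
    facet_fields=["portal_type", "Subject"],
    highlight_fields=["Description"],
)
print(query_string)
```

Facet counts and highlight snippets come back in separate sections of the Solr response, alongside the matching documents.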
  8. [No text on slide]
  9. [No text on slide]
  10. [No text on slide]
  11. Solr Performance
      • Wiktionary Dataset
      • 49.5 Million Lines of XML
      • 1.3 GB of Data
      • 1.7 Million Pages Indexed in 5.5 Hours
      • ZODB Size after Import: 1.1 GB
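To put the figures on slide 11 in perspective, the indexing rate can be worked out directly. This is simple arithmetic on the slide's numbers, treating them as exact; the slide does not describe the hardware or configuration.

```python
# Figures taken from slide 11.
pages_indexed = 1_700_000
hours_elapsed = 5.5

# Average indexing throughput over the whole import.
pages_per_second = pages_indexed / (hours_elapsed * 3600)
print("%.1f pages per second" % pages_per_second)
```

That works out to roughly 86 pages per second averaged over the import.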
  12. Integration Options with Plone
      collective.solr
  13. collective.solr Issues
      • Monkey Patching
      • Relies on collective.indexing
      • Duplicates all indexes
      • Sub-Optimal Integration with Zope Transactions
      • Relies on Thread Locals
  14. What to do?
  15. Reevaluate
  16. Solr Integration as a Catalog Index
      • No Monkey Patching
      • Simpler Code
  17. Enter alm.solrindex
      • ZCatalog Index
      • Doesn't depend on Plone
      • Utilizes new foreign_connections Connection Method
      • Pass through Solr Queries
      • Direct access to the Solr Response
  18. [No text on slide]
  19. [No text on slide]
  20. Sorting
      • Still handled by the ZCatalog
      • Could change in the future
  21. alm.solrindex Field Handlers
      • Handle Parsing Attributes for Indexing
      • Translate field-specific queries to Solr
      • Registered as Zope Utilities
  22. Example Handler

          class TextFieldHandler(DefaultFieldHandler):

              def parse_query(self, field, field_query):
                  name = field.name
                  request = {name: field_query}
                  record = parseIndexRequest(request, name, ('query',))
                  if not record.keys:
                      return None

                  query_str = ' '.join(record.keys)
                  if not query_str:
                      return None

                  return {'q': u'+%s:%s' % (name, quote_query(query_str))}

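The handler on slide 22 depends on Zope and alm.solrindex, so it cannot be run on its own. The sketch below mimics its translation step with stand-ins: `Field`, `quote_query` and `parse_text_query` here are simplified, hypothetical replacements written for this example, not the library's implementations. It shows the shape of the result: a catalog query on one field becomes a required (`+`) Solr clause on that field.

```python
class Field:
    """Hypothetical stand-in for a Solr schema field; only `name` is used."""

    def __init__(self, name):
        self.name = name


def quote_query(text):
    """Simplified stand-in: wrap the text in double quotes, escaping
    backslashes and embedded quotes. The real helper may behave differently."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"' % escaped


def parse_text_query(field, field_query):
    """Translate a catalog-style query value into Solr parameters.

    Accepts a single string or a sequence of strings, roughly standing in
    for what parseIndexRequest extracts as record.keys.
    """
    if isinstance(field_query, str):
        keys = [field_query]
    else:
        keys = list(field_query)
    query_str = " ".join(key for key in keys if key)
    if not query_str:
        return None  # nothing to search for on this field
    return {"q": "+%s:%s" % (field.name, quote_query(query_str))}


result = parse_text_query(Field("SearchableText"), "plone solr")
print(result)
```

An empty query returns `None`, which mirrors how the slide's handler returns `None` when there is nothing to contribute for the field.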
  23. Other alm.solrindex Features
      • GenericSetup Profile
      • Tests
      • Uses solrpy instead of the unsupported solr.py
  24. Tips
      • Can replace several ZCatalog indexes
      • Remove any indexes you have replaced
      • Use it for all Text Indexes
      • Still Utilize the ZCatalog Indexes for Everything Else
  25. Demo
      Project Gutenberg Data
  26. Questions?
  27. Check out sixfeetup.com/demos