OrientDB & Lucene

Published in: Technology, Education
Transcript

  • 1. Enrico Risa The Dynamic Duo OrientDB & Lucene
  • 2. Outline
 ❖ Apache Lucene in a nutshell
 ❖ OrientDB Indexing
 ❖ OrientDB-Lucene
    - Full Text Index
    - Spatial Index
 ❖ Roadmap 2.0
  • 3. What Is Lucene?
 ❖ Free-text indexing library
 ❖ Implements standard IR/search functionality
    - Query models, ranking, indexing
 ❖ Written in Java
 ❖ Simple API
 ❖ Fast, mature and constantly evolving
 ❖ Many extension points
  • 4. Who uses Lucene?
 ❖ Twitter
 ❖ LinkedIn
 ❖ Apple
 ❖ Solr
 ❖ Elasticsearch
 ❖ Neo4j
 ❖ and now OrientDB
  • 5. Base Lucene workflow
  • 6. Documents
 ❖ Basic unit for indexing and searching
 ❖ Contains a list of Fields
 ❖ Schema-less
  • 7. Fields
 ❖ Basic component of a Document
 ❖ Fields
    - name
    - value
    - store
    - analyzed

  • 8. Field Types & Options
 ❖ Types
    - Field
    - StringField
    - TextField
    - StoredField
    - IntField
    - … more
 ❖ Options
    - Stored or not
    - Indexed or not
    - Analyzed or not
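
The effect of the "stored" and "analyzed" options can be sketched in plain Java. The classes `SketchField` and `SketchDocument` below are hypothetical stand-ins invented for this illustration; they are not Lucene's `Field` or `Document` classes, and the tokenizer is a crude substitute for a real analyzer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a field: name, value, and the two options from the slide.
final class SketchField {
    final String name;
    final String value;
    final boolean stored;   // keep the original value so it can be returned with a hit
    final boolean analyzed; // true: tokenize the value; false: index it as one exact term

    SketchField(String name, String value, boolean stored, boolean analyzed) {
        this.name = name;
        this.value = value;
        this.stored = stored;
        this.analyzed = analyzed;
    }

    // Terms this field would contribute to an index.
    List<String> indexTerms() {
        List<String> terms = new ArrayList<>();
        if (analyzed) {
            // Crude analyzer substitute: lowercase, split on non-word characters.
            for (String token : value.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!token.isEmpty()) {
                    terms.add(token);
                }
            }
        } else {
            terms.add(value); // the whole value is a single term
        }
        return terms;
    }
}

// Hypothetical stand-in for a schema-less document: just a list of fields.
final class SketchDocument {
    final List<SketchField> fields = new ArrayList<>();

    SketchDocument add(SketchField field) {
        fields.add(field);
        return this;
    }

    // Only stored fields can be read back from a search hit.
    Map<String, String> storedValues() {
        Map<String, String> out = new LinkedHashMap<>();
        for (SketchField f : fields) {
            if (f.stored) {
                out.put(f.name, f.value);
            }
        }
        return out;
    }
}
```

In this sketch an identifier such as `City-42` would typically be stored but not analyzed, so it matches only as an exact term, while a long body text would be analyzed but not stored, so it is searchable word by word but not returned with the hit.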
 

  • 9. Directory
 ❖ RAMDirectory
    RAM-based index
 ❖ FSDirectory
    File-based index
 ❖ NIOFSDirectory
    Same as FSDirectory, but using the NIO API

  • 10. Indexing Documents
  • 11. Searching Index
  • 12. Inverted Index
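
The idea behind an inverted index can be shown with a toy version in plain Java: each term maps to the set of document ids that contain it, so a lookup by term avoids scanning every document. `TinyInvertedIndex` is a hypothetical class written for this illustration only; it says nothing about Lucene's actual data structures or file format, and its `analyze` method is a crude stand-in for an analyzer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy inverted index: term -> sorted set of ids of the documents containing it.
final class TinyInvertedIndex {
    private final Map<String, TreeSet<Integer>> postings = new TreeMap<>();
    private int nextDocId = 0;

    // Analyzer stand-in: lowercase, split on anything that is not a letter or digit.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Indexes a text and returns the id assigned to it (0, 1, 2, ...).
    int add(String text) {
        int docId = nextDocId++;
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Ids of the documents containing the given term.
    Set<Integer> search(String term) {
        TreeSet<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        if (hits == null) {
            return Collections.<Integer>emptySet();
        }
        return Collections.unmodifiableSet(hits);
    }

    // Ids of the documents containing every given term (AND query).
    Set<Integer> searchAll(String... terms) {
        TreeSet<Integer> result = null;
        for (String term : terms) {
            Set<Integer> hits = search(term);
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits); // intersect the posting lists
            }
        }
        return result == null ? Collections.<Integer>emptySet() : result;
    }
}
```

An AND query here is just the intersection of posting sets, which is why adding more terms narrows the result.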
  • 13. Luke: a graphical user interface
 ❖ Open a Lucene index
 ❖ Browse documents
 ❖ Run queries
 ❖ …
  • 14. OrientDB Indexing
 ❖ SBTree
    (Unique, Not unique, Full Text, Dictionary)
 ❖ HashIndex
    (Unique, Not unique, Full Text, Dictionary)
 ❖ MVRB-Tree (deprecated since 1.6)
 ❖ Lucene (OrientDB-Lucene)
 ❖ …
    https://github.com/orientechnologies/orientdb/wiki/Custom-Index-Engine
  • 15. OrientDB Lucene
 ❖ Open source at
    https://github.com/orientechnologies/orientdb-lucene
 ❖ This project aims to bring the power of Lucene indexes into OrientDB.
 ❖ Supports only Spatial and Full Text indexes
  • 16. Installing OrientDB Lucene ❖ Embedded Mode
 
 
 
 ❖ Server Mode
 Grab a jar build and copy it into $ORIENTDB_HOME/plugins
  • 17. Spatial Index
 ❖ No native implementation.
 ❖ Built on top of the Lucene Spatial module.
 ❖ Currently only points are supported.
 ❖ Near and Within queries.
  • 18. Lucene Spatial
 ❖ Spatial4j
    - Handles shapes (Point, Circle, Rectangle, Polygon)
    - Distance and area math utilities
    - Reads WKT format
 ❖ Provides an indexing strategy
    - RecursivePrefixTree
 ❖ Spatial queries using shapes
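
A prefix-tree strategy indexes a location as a grid cell whose label is a string, with larger enclosing cells sharing a prefix of that label. Geohash is one well-known grid of this kind. The encoder below is a standalone sketch of the standard geohash algorithm in plain Java; `GeohashSketch` is a hypothetical class written for this illustration and is not taken from Lucene or Spatial4j.

```java
// Standalone geohash encoder, for illustration only.
final class GeohashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // Encodes a latitude/longitude pair as a geohash of the given length.
    static String encode(double lat, double lon, int precision) {
        double latMin = -90.0, latMax = 90.0;
        double lonMin = -180.0, lonMax = 180.0;
        StringBuilder hash = new StringBuilder();
        boolean useLon = true; // bits alternate, starting with longitude
        int bitCount = 0;
        int bits = 0;
        while (hash.length() < precision) {
            if (useLon) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) {
                    bits = (bits << 1) | 1; // upper half of the longitude range
                    lonMin = mid;
                } else {
                    bits = bits << 1;       // lower half
                    lonMax = mid;
                }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) {
                    bits = (bits << 1) | 1; // upper half of the latitude range
                    latMin = mid;
                } else {
                    bits = bits << 1;       // lower half
                    latMax = mid;
                }
            }
            useLon = !useLon;
            if (++bitCount == 5) {          // every 5 bits become one base-32 character
                hash.append(BASE32.charAt(bits));
                bitCount = 0;
                bits = 0;
            }
        }
        return hash.toString();
    }
}
```

Because each extra character only subdivides the current cell, a shorter hash of the same point is always a prefix of a longer one, which is the property a prefix-tree index relies on to answer queries at several resolutions.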
  • 19. Creating a Spatial Index ❖ SQL
 
 ❖ JAVA
  • 20. Spatial Operators
 ❖ NEAR
    Find all points near a given location (latitude, longitude)
 ❖ WITHIN
    Find all points within a given bounding box
  • 21. Near Operator
 ❖ Custom operator that relies on the Lucene index
 ❖ Special syntax to support spatial arguments ($spatial)
 ❖ Context variable $distance
 ❖ Result set sorted from nearest to farthest
  • 22. Within Operator
 ❖ Bounding box search
 ❖ Currently points within a box
 ❖ Result set not sorted
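
The semantics of the two operators can be sketched over an in-memory list in plain Java: NEAR keeps the points within a maximum distance and sorts them from nearest to farthest, while WITHIN keeps the points inside a bounding box and leaves their order alone. Everything in `SpatialSketch` is hypothetical and written for this illustration; it scans every point rather than using an index, and the haversine formula and kilometre unit are assumptions of the sketch, not statements about how OrientDB computes `$distance`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Brute-force illustration of NEAR and WITHIN; not OrientDB or Lucene code.
final class SpatialSketch {
    static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius (approximation)

    static final class Place {
        final String name;
        final double lat;
        final double lon;

        Place(String name, double lat, double lon) {
            this.name = name;
            this.lat = lat;
            this.lon = lon;
        }
    }

    // Great-circle distance between two points, using the haversine formula.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // NEAR: points within maxDistanceKm of (lat, lon), nearest first.
    static List<Place> near(List<Place> places, double lat, double lon, double maxDistanceKm) {
        List<Place> hits = new ArrayList<>();
        for (Place p : places) {
            if (haversineKm(lat, lon, p.lat, p.lon) <= maxDistanceKm) {
                hits.add(p);
            }
        }
        hits.sort(Comparator.comparingDouble((Place p) -> haversineKm(lat, lon, p.lat, p.lon)));
        return hits;
    }

    // WITHIN: points inside the bounding box, in input order (no sorting).
    static List<Place> within(List<Place> places,
                              double minLat, double minLon, double maxLat, double maxLon) {
        List<Place> hits = new ArrayList<>();
        for (Place p : places) {
            if (p.lat >= minLat && p.lat <= maxLat && p.lon >= minLon && p.lon <= maxLon) {
                hits.add(p);
            }
        }
        return hits;
    }
}
```

A spatial index exists precisely to avoid the full scan done here; the sketch only fixes what the result of each operator should look like.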
  • 23. Full Text Index
 ❖ Native full-text implementation.
 ❖ Supports multiple fields.
 ❖ Supports Lucene query syntax.
 ❖ Lucene analyzers
  • 24. Creating a Full Text Index ❖ SQL
 
 ❖ JAVA
  • 25. Full Text Operators ❖ LUCENE
 [<fields>] LUCENE <exp>
 
 - Query your index using Lucene query parser syntax
 - Supports multiple fields
 - Target all fields (MultiFieldQueryParser)
 - Target a specific field (QueryParser)

  • 26. Lucene Operator ❖ MultiFieldQueryParser
 Target all fields
 
 ❖ QueryParser
 Target specific field
  • 27. Indexing Performance
 ❖ Full Text
    - 9M records in ~300s with StandardAnalyzer and one field (roughly 30,000 records/s)
 ❖ Spatial
    - 9M records in ~500s with two fields (Point) (roughly 18,000 records/s)
  • 28. Roadmap 2.0
 ❖ Production ready
 ❖ Monitoring of Lucene indexes
 ❖ More configuration
 ❖ GUI tool integrated in Studio
  • 29. Roadmap 2.0 (Spatial Index)
 ❖ Index more shapes
 ❖ More operators (Intersect, ...)
 ❖ Not only bounding boxes
 ❖ Support for GeoJSON
    http://geojson.org
  • 30. Roadmap 2.0 (Full Text)
 ❖ Document & field boosting
 ❖ Score in result set
 ❖ Custom analyzers & filters
 ❖ Search engine
  • 31. Thank You Questions? ❖ Contact Me
 - Enrico Risa e.risa@orientechnologies.com
 - Twitter https://twitter.com/wolf4ood
