OrientDB & Lucene

Published in: Technology, Education

  1. The Dynamic Duo: OrientDB & Lucene (Enrico Risa)
  2. Outline
     ❖ Apache Lucene in a nutshell
     ❖ OrientDB Indexing
     ❖ OrientDB-Lucene
       - Full Text Index
       - Spatial Index
     ❖ Roadmap 2.0
  3. What Is Lucene?
     ❖ Free-text indexing library
     ❖ Implements standard IR/search functionality
       - Query models, ranking, indexing
     ❖ Written in Java
     ❖ Simple API
     ❖ Fast, mature, and constantly evolving
     ❖ Many extension points
  4. Who uses Lucene?
     ❖ Twitter
     ❖ LinkedIn
     ❖ Apple
     ❖ Solr
     ❖ Elasticsearch
     ❖ Neo4j
     ❖ and now OrientDB
  5. Basic Lucene workflow
  6. Documents
     ❖ Basic unit for indexing and searching
     ❖ Contains a list of fields
     ❖ Schema-less
  7. Fields
     ❖ Basic component of a Document
     ❖ Each field has
       - name
       - value
       - stored (or not)
       - analyzed (or not)

  8. Field Types & Options
     ❖ Types
       - Field
       - StringField
       - TextField
       - StoredField
       - IntField
       - … more
     ❖ Options
       - Stored or not
       - Indexed or not
       - Analyzed or not
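
The three options can be illustrated with a small plain-Java sketch. It does not use Lucene's own classes: `FieldSketch` and `indexTerms` are hypothetical names, and the lowercase-and-split step is only an assumed stand-in for a real analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy model of a document field; NOT the Lucene Field class. */
final class FieldSketch {
    final String name;
    final String value;
    final boolean stored;   // keep the original value so it can be returned with a hit
    final boolean indexed;  // make the value searchable
    final boolean analyzed; // break the value into terms before indexing

    FieldSketch(String name, String value, boolean stored, boolean indexed, boolean analyzed) {
        this.name = name;
        this.value = value;
        this.stored = stored;
        this.indexed = indexed;
        this.analyzed = analyzed;
    }

    /** Terms this field would contribute to the index under the chosen options. */
    List<String> indexTerms() {
        List<String> terms = new ArrayList<>();
        if (!indexed) {
            return terms;              // stored-only field: nothing to search on
        }
        if (!analyzed) {
            terms.add(value);          // the whole value becomes one exact term
            return terms;
        }
        // Assumed stand-in for an analyzer: lowercase, split on non-alphanumerics.
        for (String token : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }
}
```

In this sketch an analyzed field holding `The Dynamic Duo` contributes three terms, while the same text in an indexed but non-analyzed field is searchable only as one exact value.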
 

  9. Directory
     ❖ RAMDirectory
       RAM-based index
     ❖ FSDirectory
       File-based index
     ❖ NIOFSDirectory
       Same as FSDirectory, but uses the Java NIO API

  10. Indexing Documents
  11. Searching the Index
  12. Inverted Index
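
As a rough illustration of the inverted index idea, the sketch below maps each term to the set of documents that contain it, so a search becomes a lookup followed by a set intersection. It is a toy in plain Java, not Lucene's actual index structure; `InvertedIndexSketch`, its methods, and the simple tokenizer are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy in-memory inverted index; hypothetical names, not Lucene code. */
final class InvertedIndexSketch {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new TreeMap<>();
    private int nextDocId = 0;

    /** Indexes the text as a new document and returns the id assigned to it. */
    int add(String text) {
        int docId = nextDocId++;
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    /** Ids of the documents that contain every term of the query (AND semantics). */
    List<Integer> search(String query) {
        TreeSet<Integer> result = null;
        for (String term : tokenize(query)) {
            TreeSet<Integer> docs = postings.get(term);
            if (docs == null) {
                return new ArrayList<>();      // one missing term empties the result
            }
            if (result == null) {
                result = new TreeSet<>(docs);  // copy so the postings stay untouched
            } else {
                result.retainAll(docs);        // intersect with the next posting list
            }
        }
        List<Integer> ids = new ArrayList<>();
        if (result != null) {
            ids.addAll(result);
        }
        return ids;
    }

    // Assumed stand-in for an analyzer: lowercase, split on non-alphanumerics.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }
}
```

After adding "OrientDB is a graph database", "Lucene is a search library", and "OrientDB embeds Lucene" (ids 0, 1, 2), `search("orientdb lucene")` returns only id 2, the one document listed under both terms.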
  13. Luke: a graphical user interface
     ❖ Open a Lucene index
     ❖ Browse documents
     ❖ Run queries
     ❖ …
  14. OrientDB Indexing
     ❖ SBTree (Unique, Not unique, Full Text, Dictionary)
     ❖ HashIndex (Unique, Not unique, Full Text, Dictionary)
     ❖ MVRB-Tree (deprecated since 1.6)
     ❖ Lucene (OrientDB-Lucene)
     ❖ … https://github.com/orientechnologies/orientdb/wiki/Custom-Index-Engine
  15. OrientDB Lucene
     ❖ Open source at https://github.com/orientechnologies/orientdb-lucene
     ❖ The project aims to bring the power of Lucene indexing to OrientDB.
     ❖ Supports only spatial and full-text indexes
  16. Installing OrientDB Lucene
     ❖ Embedded Mode
     ❖ Server Mode
       Download a jar build and copy it into $ORIENTDB_HOME/plugins
  17. Spatial Index
     ❖ No native implementation.
     ❖ Built on top of the Lucene Spatial module.
     ❖ Currently only points are supported.
     ❖ NEAR and WITHIN queries.
  18. Lucene Spatial
     ❖ Spatial4j
       - Handles shapes (Point, Circle, Rectangle, Polygon)
       - Distance and area math utilities
       - Reads WKT format
     ❖ Provides an indexing strategy
       - RecursivePrefixTree
     ❖ Spatial queries using shapes
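
A prefix-tree strategy indexes a location as a string of nested grid cells, so that every longer string names a smaller cell inside the cell named by its prefix. Geohash is one such grid (Lucene Spatial ships a `GeohashPrefixTree`). The encoder below is an independent plain-Java sketch of the standard geohash scheme, not code taken from Lucene or Spatial4j; `GeohashSketch` is a hypothetical name.

```java
/** Minimal geohash encoder, illustrating how prefix-tree cells nest. */
final class GeohashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a point as a geohash with the requested number of characters. */
    static String encode(double lat, double lon, int length) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder();
        boolean splitLon = true; // bits alternate: longitude first, then latitude
        int bitCount = 0;
        int value = 0;
        while (hash.length() < length) {
            if (splitLon) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonMin = mid; }
                else            { value = value << 1;       lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latMin = mid; }
                else            { value = value << 1;       latMax = mid; }
            }
            splitLon = !splitLon;
            if (++bitCount == 5) { // five bits make one base-32 character
                hash.append(BASE32.charAt(value));
                bitCount = 0;
                value = 0;
            }
        }
        return hash.toString();
    }
}
```

Because each character only halves the current cell further, `encode(lat, lon, 5)` is always a prefix of `encode(lat, lon, 8)` for the same point, which is the property a prefix-tree index relies on when it matches a query shape against cell prefixes.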
  19. Creating a Spatial Index
     ❖ SQL
     ❖ JAVA
  20. Spatial Operators
     ❖ NEAR
       Finds all points near a given location (latitude, longitude)
     ❖ WITHIN
       Finds all points within a given bounding box
  21. Near Operator
     ❖ Custom operator that relies on the Lucene index
     ❖ Special syntax to support spatial arguments ($spatial)
     ❖ Context variable $distance
     ❖ Result set sorted from nearest to farthest
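
The semantics of a near query (keep the points within a maximum distance, then return them nearest first) can be sketched in plain Java. This is only an illustration of the behaviour described above, not how OrientDB or Lucene evaluate NEAR: it scans every point instead of using an index, and the names `NearSketch`, `near`, `distanceKm`, the haversine formula, and the 6371 km Earth radius are assumptions of the sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Brute-force illustration of "near" semantics; hypothetical names. */
final class NearSketch {
    static final double EARTH_RADIUS_KM = 6371.0; // assumed mean Earth radius

    /** Great-circle distance between two (lat, lon) points, haversine formula. */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Points (each {lat, lon}) within maxKm of the given location, nearest first. */
    static List<double[]> near(List<double[]> points, double lat, double lon, double maxKm) {
        List<double[]> hits = new ArrayList<>();
        for (double[] p : points) {
            if (distanceKm(lat, lon, p[0], p[1]) <= maxKm) {
                hits.add(p);
            }
        }
        // Sort ascending by distance from the query location.
        hits.sort(Comparator.comparingDouble((double[] p) -> distanceKm(lat, lon, p[0], p[1])));
        return hits;
    }
}
```

A real spatial index avoids the full scan by first narrowing the candidates to the grid cells that overlap the search circle; the distance sort at the end is the part that corresponds to the nearest-to-farthest ordering.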
  22. Within Operator
     ❖ Bounding box search
     ❖ Currently finds points within a box
     ❖ Result set is not sorted
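
A bounding box search reduces to two range checks per point. The plain-Java sketch below shows that test under simplifying assumptions: `WithinSketch` and `within` are hypothetical names, the box is given as minimum and maximum latitude and longitude, and boxes that cross the 180° meridian are not handled. It returns matches in input order, with no distance sorting.

```java
import java.util.ArrayList;
import java.util.List;

/** Brute-force illustration of a bounding box filter; hypothetical names. */
final class WithinSketch {
    /** Points (each {lat, lon}) that fall inside the box, in their input order. */
    static List<double[]> within(List<double[]> points,
                                 double minLat, double minLon,
                                 double maxLat, double maxLon) {
        List<double[]> hits = new ArrayList<>();
        for (double[] p : points) {
            boolean latOk = p[0] >= minLat && p[0] <= maxLat;
            boolean lonOk = p[1] >= minLon && p[1] <= maxLon;
            if (latOk && lonOk) {
                hits.add(p); // no sorting: matches keep the order they were scanned in
            }
        }
        return hits;
    }
}
```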
  23. Full Text Index
     ❖ Native full-text implementation.
     ❖ Supports multiple fields.
     ❖ Supports Lucene query syntax.
     ❖ Lucene analyzers
  24. Creating a Full Text Index
     ❖ SQL
     ❖ JAVA
  25. Full Text Operators
     ❖ LUCENE
       [<fields>] LUCENE <exp>
       - Query your index using Query Parser syntax
       - Supports multiple fields
       - Target all fields (MultiFieldQueryParser)
       - Target a specific field (QueryParser)

  26. Lucene Operator
     ❖ MultiFieldQueryParser
       Targets all fields
     ❖ QueryParser
       Targets a specific field
  27. Indexing Performance
     ❖ Full Text
       - 9M records in ~300 s with StandardAnalyzer and one field (roughly 30,000 records/s)
     ❖ Spatial
       - 9M records in ~500 s with two fields (Point) (roughly 18,000 records/s)
  28. Roadmap 2.0
     ❖ Production ready
     ❖ Monitoring of the Lucene index
     ❖ More configuration
     ❖ GUI tool integrated in Studio
  29. Roadmap 2.0 (Spatial Index)
     ❖ Index more shapes
     ❖ More operators (Intersect, …)
     ❖ Not only bounding boxes
     ❖ Support for GeoJSON (http://geojson.org)
  30. Roadmap 2.0 (Full Text)
     ❖ Document & field boosting
     ❖ Score in result set
     ❖ Custom analyzers & filters
     ❖ Search engine
  31. Thank You. Questions?
     ❖ Contact me
       - Enrico Risa, e.risa@orientechnologies.com
       - Twitter: https://twitter.com/wolf4ood
