SlideShare a Scribd company logo

CouchDB-Lucene

Introduction to full text search with CouchDB and Lucene, focussed on usage with Ruby (on Rails).

1 of 31
Download to read offline
Finding stuff under the Couch
   with CouchDB-Lucene



           Martin Rehfeld
        @ RUG-B 01-Apr-2010
CouchDB

•   JSON document store
•   all documents in a given database reside in
    one large pool and may be retrieved using
    their ID ...
•   ... or through Map & Reduce based indexes
So how do you do full
    text search?
You potentially could
 achieve this with just
Map & Reduce functions
But that would mean
implementing an actual
   search engine ...
... and this has been done
           before.
Ad

Recommended

More Related Content

What's hot

Jetpack Compose a nova forma de implementar UI no Android
Jetpack Compose a nova forma de implementar UI no AndroidJetpack Compose a nova forma de implementar UI no Android
Jetpack Compose a nova forma de implementar UI no AndroidNelson Glauber Leal
 
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...Amazon Web Services
 
Introduction to Struts 1.3
Introduction to Struts 1.3Introduction to Struts 1.3
Introduction to Struts 1.3Ilio Catallo
 
Shared preferences
Shared preferencesShared preferences
Shared preferencesSourabh Sahu
 
Java Hibernate Programming with Architecture Diagram and Example
Java Hibernate Programming with Architecture Diagram and ExampleJava Hibernate Programming with Architecture Diagram and Example
Java Hibernate Programming with Architecture Diagram and Examplekamal kotecha
 
Hibernate architecture
Hibernate architectureHibernate architecture
Hibernate architectureAnurag
 
Java Server Faces (JSF) - Basics
Java Server Faces (JSF) - BasicsJava Server Faces (JSF) - Basics
Java Server Faces (JSF) - BasicsBG Java EE Course
 
The Basics of MongoDB
The Basics of MongoDBThe Basics of MongoDB
The Basics of MongoDBvaluebound
 
MongoDB presentation
MongoDB presentationMongoDB presentation
MongoDB presentationHyphen Call
 
Hibernate Presentation
Hibernate  PresentationHibernate  Presentation
Hibernate Presentationguest11106b
 
Implicit objects advance Java
Implicit objects advance JavaImplicit objects advance Java
Implicit objects advance JavaDarshit Metaliya
 

What's hot (20)

Jetpack Compose a nova forma de implementar UI no Android
Jetpack Compose a nova forma de implementar UI no AndroidJetpack Compose a nova forma de implementar UI no Android
Jetpack Compose a nova forma de implementar UI no Android
 
web server
web serverweb server
web server
 
Soap Vs Rest
Soap Vs RestSoap Vs Rest
Soap Vs Rest
 
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon ...
 
Introduction to Struts 1.3
Introduction to Struts 1.3Introduction to Struts 1.3
Introduction to Struts 1.3
 
Shared preferences
Shared preferencesShared preferences
Shared preferences
 
Couch db
Couch dbCouch db
Couch db
 
MongoDB
MongoDBMongoDB
MongoDB
 
CouchDB
CouchDBCouchDB
CouchDB
 
Java Hibernate Programming with Architecture Diagram and Example
Java Hibernate Programming with Architecture Diagram and ExampleJava Hibernate Programming with Architecture Diagram and Example
Java Hibernate Programming with Architecture Diagram and Example
 
enterprise java bean
enterprise java beanenterprise java bean
enterprise java bean
 
Flyweight Pattern
Flyweight PatternFlyweight Pattern
Flyweight Pattern
 
Hibernate architecture
Hibernate architectureHibernate architecture
Hibernate architecture
 
Mongodb
MongodbMongodb
Mongodb
 
Android and REST
Android and RESTAndroid and REST
Android and REST
 
Java Server Faces (JSF) - Basics
Java Server Faces (JSF) - BasicsJava Server Faces (JSF) - Basics
Java Server Faces (JSF) - Basics
 
The Basics of MongoDB
The Basics of MongoDBThe Basics of MongoDB
The Basics of MongoDB
 
MongoDB presentation
MongoDB presentationMongoDB presentation
MongoDB presentation
 
Hibernate Presentation
Hibernate  PresentationHibernate  Presentation
Hibernate Presentation
 
Implicit objects advance Java
Implicit objects advance JavaImplicit objects advance Java
Implicit objects advance Java
 

Viewers also liked

Real World CouchDB
Real World CouchDBReal World CouchDB
Real World CouchDBJohn Wood
 
CouchDB – A Database for the Web
CouchDB – A Database for the WebCouchDB – A Database for the Web
CouchDB – A Database for the WebKarel Minarik
 
OSCON 2011 Learning CouchDB
OSCON 2011 Learning CouchDBOSCON 2011 Learning CouchDB
OSCON 2011 Learning CouchDBBradley Holt
 
(R)évolutions, l'innovation entre les lignes
(R)évolutions, l'innovation entre les lignes(R)évolutions, l'innovation entre les lignes
(R)évolutions, l'innovation entre les lignes366
 
Lucene - 10 ans d'usages plus ou moins classiques
Lucene - 10 ans d'usages plus ou moins classiquesLucene - 10 ans d'usages plus ou moins classiques
Lucene - 10 ans d'usages plus ou moins classiquesSylvain Wallez
 
CouchDB at its Core: Global Data Storage and Rich Incremental Indexing at Clo...
CouchDB at its Core: Global Data Storage and Rich Incremental Indexing at Clo...CouchDB at its Core: Global Data Storage and Rich Incremental Indexing at Clo...
CouchDB at its Core: Global Data Storage and Rich Incremental Indexing at Clo...StampedeCon
 
ZendCon 2011 Learning CouchDB
ZendCon 2011 Learning CouchDBZendCon 2011 Learning CouchDB
ZendCon 2011 Learning CouchDBBradley Holt
 
Couch Db In 60 Minutes
Couch Db In 60 MinutesCouch Db In 60 Minutes
Couch Db In 60 MinutesGeorge Ang
 
Couch db@nosql+taiwan
Couch db@nosql+taiwanCouch db@nosql+taiwan
Couch db@nosql+taiwanKenzou Yeh
 
CouchDB at New York PHP
CouchDB at New York PHPCouchDB at New York PHP
CouchDB at New York PHPBradley Holt
 
CouchApps: Requiem for Accidental Complexity
CouchApps: Requiem for Accidental ComplexityCouchApps: Requiem for Accidental Complexity
CouchApps: Requiem for Accidental ComplexityFederico Galassi
 
An introduction to CouchDB
An introduction to CouchDBAn introduction to CouchDB
An introduction to CouchDBDavid Coallier
 
Search engine-optimization-starter-guide-fr
Search engine-optimization-starter-guide-frSearch engine-optimization-starter-guide-fr
Search engine-optimization-starter-guide-frNeoSting
 

Viewers also liked (18)

Real World CouchDB
Real World CouchDBReal World CouchDB
Real World CouchDB
 
CouchDB – A Database for the Web
CouchDB – A Database for the WebCouchDB – A Database for the Web
CouchDB – A Database for the Web
 
OSCON 2011 Learning CouchDB
OSCON 2011 Learning CouchDBOSCON 2011 Learning CouchDB
OSCON 2011 Learning CouchDB
 
(R)évolutions, l'innovation entre les lignes
(R)évolutions, l'innovation entre les lignes(R)évolutions, l'innovation entre les lignes
(R)évolutions, l'innovation entre les lignes
 
MySQL Indexes
MySQL IndexesMySQL Indexes
MySQL Indexes
 
Lucene - 10 ans d'usages plus ou moins classiques
Lucene - 10 ans d'usages plus ou moins classiquesLucene - 10 ans d'usages plus ou moins classiques
Lucene - 10 ans d'usages plus ou moins classiques
 
CouchDB at its Core: Global Data Storage and Rich Incremental Indexing at Clo...
CouchDB at its Core: Global Data Storage and Rich Incremental Indexing at Clo...CouchDB at its Core: Global Data Storage and Rich Incremental Indexing at Clo...
CouchDB at its Core: Global Data Storage and Rich Incremental Indexing at Clo...
 
ZendCon 2011 Learning CouchDB
ZendCon 2011 Learning CouchDBZendCon 2011 Learning CouchDB
ZendCon 2011 Learning CouchDB
 
Couch Db In 60 Minutes
Couch Db In 60 MinutesCouch Db In 60 Minutes
Couch Db In 60 Minutes
 
Couch db@nosql+taiwan
Couch db@nosql+taiwanCouch db@nosql+taiwan
Couch db@nosql+taiwan
 
CouchDB at New York PHP
CouchDB at New York PHPCouchDB at New York PHP
CouchDB at New York PHP
 
Couch db
Couch dbCouch db
Couch db
 
CouchDB
CouchDBCouchDB
CouchDB
 
CouchApps: Requiem for Accidental Complexity
CouchApps: Requiem for Accidental ComplexityCouchApps: Requiem for Accidental Complexity
CouchApps: Requiem for Accidental Complexity
 
CouchDB Vs MongoDB
CouchDB Vs MongoDBCouchDB Vs MongoDB
CouchDB Vs MongoDB
 
An introduction to CouchDB
An introduction to CouchDBAn introduction to CouchDB
An introduction to CouchDB
 
CouchDB
CouchDBCouchDB
CouchDB
 
Search engine-optimization-starter-guide-fr
Search engine-optimization-starter-guide-frSearch engine-optimization-starter-guide-fr
Search engine-optimization-starter-guide-fr
 

Similar to CouchDB-Lucene

10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data ModelingDATAVERSITY
 
Schema Design with MongoDB
Schema Design with MongoDBSchema Design with MongoDB
Schema Design with MongoDBrogerbodamer
 
Elasticsearch an overview
Elasticsearch   an overviewElasticsearch   an overview
Elasticsearch an overviewAmit Juneja
 
Postman Collection Format v2.0 (pre-draft)
Postman Collection Format v2.0 (pre-draft)Postman Collection Format v2.0 (pre-draft)
Postman Collection Format v2.0 (pre-draft)Postman
 
03 form-data
03 form-data03 form-data
03 form-datasnopteck
 
d3sparql.js demo at SWAT4LS 2014 in Berlin
d3sparql.js demo at SWAT4LS 2014 in Berlind3sparql.js demo at SWAT4LS 2014 in Berlin
d3sparql.js demo at SWAT4LS 2014 in BerlinToshiaki Katayama
 
Examiness hints and tips from the trenches
Examiness hints and tips from the trenchesExaminess hints and tips from the trenches
Examiness hints and tips from the trenchesIsmail Mayat
 
Building Your First MongoDB App
Building Your First MongoDB AppBuilding Your First MongoDB App
Building Your First MongoDB AppHenrik Ingo
 
Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling rogerbodamer
 
MongoDB Aggregation
MongoDB Aggregation MongoDB Aggregation
MongoDB Aggregation Amit Ghosh
 
Hands On Spring Data
Hands On Spring DataHands On Spring Data
Hands On Spring DataEric Bottard
 
Building a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and JavaBuilding a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and Javaantoinegirbal
 
Peggy elasticsearch應用
Peggy elasticsearch應用Peggy elasticsearch應用
Peggy elasticsearch應用LearningTech
 
Academy PRO: Elasticsearch. Data management
Academy PRO: Elasticsearch. Data managementAcademy PRO: Elasticsearch. Data management
Academy PRO: Elasticsearch. Data managementBinary Studio
 
Elasticsearch in 15 Minutes
Elasticsearch in 15 MinutesElasticsearch in 15 Minutes
Elasticsearch in 15 MinutesKarel Minarik
 
Webinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev TeamsWebinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev TeamsMongoDB
 

Similar to CouchDB-Lucene (20)

10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling
 
Schema Design with MongoDB
Schema Design with MongoDBSchema Design with MongoDB
Schema Design with MongoDB
 
Elastic Search
Elastic SearchElastic Search
Elastic Search
 
Elasticsearch an overview
Elasticsearch   an overviewElasticsearch   an overview
Elasticsearch an overview
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Postman Collection Format v2.0 (pre-draft)
Postman Collection Format v2.0 (pre-draft)Postman Collection Format v2.0 (pre-draft)
Postman Collection Format v2.0 (pre-draft)
 
03 form-data
03 form-data03 form-data
03 form-data
 
d3sparql.js demo at SWAT4LS 2014 in Berlin
d3sparql.js demo at SWAT4LS 2014 in Berlind3sparql.js demo at SWAT4LS 2014 in Berlin
d3sparql.js demo at SWAT4LS 2014 in Berlin
 
Examiness hints and tips from the trenches
Examiness hints and tips from the trenchesExaminess hints and tips from the trenches
Examiness hints and tips from the trenches
 
Building Your First MongoDB App
Building Your First MongoDB AppBuilding Your First MongoDB App
Building Your First MongoDB App
 
Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling Intro to MongoDB and datamodeling
Intro to MongoDB and datamodeling
 
MongoDB Aggregation
MongoDB Aggregation MongoDB Aggregation
MongoDB Aggregation
 
Hands On Spring Data
Hands On Spring DataHands On Spring Data
Hands On Spring Data
 
Building a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and JavaBuilding a Scalable Inbox System with MongoDB and Java
Building a Scalable Inbox System with MongoDB and Java
 
Power tools in Java
Power tools in JavaPower tools in Java
Power tools in Java
 
Peggy elasticsearch應用
Peggy elasticsearch應用Peggy elasticsearch應用
Peggy elasticsearch應用
 
Academy PRO: Elasticsearch. Data management
Academy PRO: Elasticsearch. Data managementAcademy PRO: Elasticsearch. Data management
Academy PRO: Elasticsearch. Data management
 
Elasticsearch in 15 Minutes
Elasticsearch in 15 MinutesElasticsearch in 15 Minutes
Elasticsearch in 15 Minutes
 
Webinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev TeamsWebinar: General Technical Overview of MongoDB for Dev Teams
Webinar: General Technical Overview of MongoDB for Dev Teams
 
MongoDb and NoSQL
MongoDb and NoSQLMongoDb and NoSQL
MongoDb and NoSQL
 

Recently uploaded

Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24Umar Saif
 
Are Human-generated Demonstrations Necessary for In-context Learning?
Are Human-generated Demonstrations Necessary for In-context Learning?Are Human-generated Demonstrations Necessary for In-context Learning?
Are Human-generated Demonstrations Necessary for In-context Learning?MENGSAYLOEM1
 
Zi-Stick UBS Dongle ZIgbee from Aeotec manual
Zi-Stick UBS Dongle ZIgbee from  Aeotec manualZi-Stick UBS Dongle ZIgbee from  Aeotec manual
Zi-Stick UBS Dongle ZIgbee from Aeotec manualDomotica daVinci
 
H3 Platform CXL Solution_Memory Fabric Forum.pptx
H3 Platform CXL Solution_Memory Fabric Forum.pptxH3 Platform CXL Solution_Memory Fabric Forum.pptx
H3 Platform CXL Solution_Memory Fabric Forum.pptxMemory Fabric Forum
 
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IPQ1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IPMemory Fabric Forum
 
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions..."How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions...Fwdays
 
Leveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docx
Leveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docxLeveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docx
Leveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docxVotarikari Shravan
 
AI Act & Standardization: UNINFO involvement
AI Act & Standardization: UNINFO involvementAI Act & Standardization: UNINFO involvement
AI Act & Standardization: UNINFO involvementMimmo Squillace
 
Power of 2024 - WITforce Odyssey.pptx.pdf
Power of 2024 - WITforce Odyssey.pptx.pdfPower of 2024 - WITforce Odyssey.pptx.pdf
Power of 2024 - WITforce Odyssey.pptx.pdfkatalinjordans1
 
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...UiPathCommunity
 
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...DianaGray10
 
Campotel: Telecommunications Infra and Network Builder - Company Profile
Campotel: Telecommunications Infra and Network Builder - Company ProfileCampotel: Telecommunications Infra and Network Builder - Company Profile
Campotel: Telecommunications Infra and Network Builder - Company ProfileCampotelPhilippines
 
Enhancing Productivity and Insight A Tour of JDK Tools Progress Beyond Java 17
Enhancing Productivity and Insight  A Tour of JDK Tools Progress Beyond Java 17Enhancing Productivity and Insight  A Tour of JDK Tools Progress Beyond Java 17
Enhancing Productivity and Insight A Tour of JDK Tools Progress Beyond Java 17Ana-Maria Mihalceanu
 
Dynamical systems simulation in Python for science and engineering
Dynamical systems simulation in Python for science and engineeringDynamical systems simulation in Python for science and engineering
Dynamical systems simulation in Python for science and engineeringMassimo Talia
 
LF Energy Webinar: Introduction to TROLIE
LF Energy Webinar: Introduction to TROLIELF Energy Webinar: Introduction to TROLIE
LF Energy Webinar: Introduction to TROLIEDanBrown980551
 
My self introduction to know others abut me
My self  introduction to know others abut meMy self  introduction to know others abut me
My self introduction to know others abut meManoj Prabakar B
 
"Testing of Helm Charts or There and Back Again", Yura Rochniak
"Testing of Helm Charts or There and Back Again", Yura Rochniak"Testing of Helm Charts or There and Back Again", Yura Rochniak
"Testing of Helm Charts or There and Back Again", Yura RochniakFwdays
 
2024 February Patch Tuesday
2024 February Patch Tuesday2024 February Patch Tuesday
2024 February Patch TuesdayIvanti
 
Confoo 2024 Gettings started with OpenAI and data science
Confoo 2024 Gettings started with OpenAI and data scienceConfoo 2024 Gettings started with OpenAI and data science
Confoo 2024 Gettings started with OpenAI and data scienceSusan Ibach
 

Recently uploaded (20)

Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
Progress Report: Ministry of IT under Dr. Umar Saif Aug 23-Feb'24
 
Are Human-generated Demonstrations Necessary for In-context Learning?
Are Human-generated Demonstrations Necessary for In-context Learning?Are Human-generated Demonstrations Necessary for In-context Learning?
Are Human-generated Demonstrations Necessary for In-context Learning?
 
Zi-Stick UBS Dongle ZIgbee from Aeotec manual
Zi-Stick UBS Dongle ZIgbee from  Aeotec manualZi-Stick UBS Dongle ZIgbee from  Aeotec manual
Zi-Stick UBS Dongle ZIgbee from Aeotec manual
 
H3 Platform CXL Solution_Memory Fabric Forum.pptx
H3 Platform CXL Solution_Memory Fabric Forum.pptxH3 Platform CXL Solution_Memory Fabric Forum.pptx
H3 Platform CXL Solution_Memory Fabric Forum.pptx
 
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IPQ1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
Q1 Memory Fabric Forum: Building Fast and Secure Chips with CXL IP
 
5 Tech Trend to Notice in ESG Landscape- 47Billion
5 Tech Trend to Notice in ESG Landscape- 47Billion5 Tech Trend to Notice in ESG Landscape- 47Billion
5 Tech Trend to Notice in ESG Landscape- 47Billion
 
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions..."How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
 
Leveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docx
Leveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docxLeveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docx
Leveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docx
 
AI Act & Standardization: UNINFO involvement
AI Act & Standardization: UNINFO involvementAI Act & Standardization: UNINFO involvement
AI Act & Standardization: UNINFO involvement
 
Power of 2024 - WITforce Odyssey.pptx.pdf
Power of 2024 - WITforce Odyssey.pptx.pdfPower of 2024 - WITforce Odyssey.pptx.pdf
Power of 2024 - WITforce Odyssey.pptx.pdf
 
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
Dev Dives: Leverage APIs and Gen AI to power automations for RPA and software...
 
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
 
Campotel: Telecommunications Infra and Network Builder - Company Profile
Campotel: Telecommunications Infra and Network Builder - Company ProfileCampotel: Telecommunications Infra and Network Builder - Company Profile
Campotel: Telecommunications Infra and Network Builder - Company Profile
 
Enhancing Productivity and Insight A Tour of JDK Tools Progress Beyond Java 17
Enhancing Productivity and Insight  A Tour of JDK Tools Progress Beyond Java 17Enhancing Productivity and Insight  A Tour of JDK Tools Progress Beyond Java 17
Enhancing Productivity and Insight A Tour of JDK Tools Progress Beyond Java 17
 
Dynamical systems simulation in Python for science and engineering
Dynamical systems simulation in Python for science and engineeringDynamical systems simulation in Python for science and engineering
Dynamical systems simulation in Python for science and engineering
 
LF Energy Webinar: Introduction to TROLIE
LF Energy Webinar: Introduction to TROLIELF Energy Webinar: Introduction to TROLIE
LF Energy Webinar: Introduction to TROLIE
 
My self introduction to know others abut me
My self  introduction to know others abut meMy self  introduction to know others abut me
My self introduction to know others abut me
 
"Testing of Helm Charts or There and Back Again", Yura Rochniak
"Testing of Helm Charts or There and Back Again", Yura Rochniak"Testing of Helm Charts or There and Back Again", Yura Rochniak
"Testing of Helm Charts or There and Back Again", Yura Rochniak
 
2024 February Patch Tuesday
2024 February Patch Tuesday2024 February Patch Tuesday
2024 February Patch Tuesday
 
Confoo 2024 Gettings started with OpenAI and data science
Confoo 2024 Gettings started with OpenAI and data scienceConfoo 2024 Gettings started with OpenAI and data science
Confoo 2024 Gettings started with OpenAI and data science
 

CouchDB-Lucene

  • 1. Finding stuff under the Couch with CouchDB-Lucene Martin Rehfeld @ RUG-B 01-Apr-2010
  • 2. CouchDB • JSON document store • all documents in a given database reside in one large pool and may be retrieved using their ID ... • ... or through Map & Reduce based indexes
  • 3. So how do you do full text search?
  • 4. You potentially could achieve this with just Map & Reduce functions
  • 5. But that would mean implementing an actual search engine ...
  • 6. ... and this has been done before.
  • 7. Enter Lucene Apache Lucene is a high- performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform. Courtesy of The Apache Foundation
  • 8. Lucene Features • ranked searching • many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more • fielded searching (e.g., title, author, contents) • boolean operators • sorting by any field • allows simultaneous update and searching
  • 9. CouchDB Integration • couchdb-lucene (ready to run Lucene plus CouchDB interface) • Search interface via http_db_handlers, usually _fti • Indexer interface via CouchDB update_notification facility and fulltext design docs
  • 10. Sample design document, i.e., _id: „_design/search“ { "fulltext": { "by_name": { "defaults": { "store":"yes" }, "index":"function(doc) { var ret=new Document(); ret.add(doc.name); return ret }" } } }
  • 11. Sample design document, i.e., _id: „_design/search“ Name of the index { "fulltext": { "by_name": { "defaults": { "store":"yes" }, "index":"function(doc) { var ret=new Document(); ret.add(doc.name); return ret }" } } }
  • 12. Sample design document, i.e., _id: „_design/search“ Name of the index { "fulltext": { Default options "by_name": { (can be overridden per field) "defaults": { "store":"yes" }, "index":"function(doc) { var ret=new Document(); ret.add(doc.name); return ret }" } } }
  • 13. Sample design document, i.e., _id: „_design/search“ Name of the index { "fulltext": { Default options "by_name": { (can be overridden per field) "defaults": { "store":"yes" }, "index":"function(doc) { var ret=new Document(); ret.add(doc.name); return ret }" } } Index function }
  • 14. Sample design document, i.e., _id: „_design/search“ Name of the index { "fulltext": { Default options "by_name": { (can be overridden per field) "defaults": { "store":"yes" }, "index":"function(doc) { var ret=new Document(); ret.add(doc.name); return ret }" } } Index function Builds and returns documents to } be put into Lucene‘s index (may return an array of multiple documents)
  • 15. Querying the index http://localhost:5984/your-couch-db/_fti/ your-design-document-name/your-index-name? q= query string sort= comma-separated fields to sort on limit= max number of results to return skip= offset include_docs= include CouchDB documents in response
  • 16. A full stack example
  • 17. CouchDB Person Document { "_id": "9db68c69726e486b811859937fbb6b09", "_rev": "1-c890039865e37eb8b911ff762162772e", "name": "Martin Rehfeld", "email": "martin.rehfeld@glnetworks.de", "notes": "Talks about CouchDB Lucene" }
  • 18. Objectives • Search for people by name • Search for people by any field‘s content • Querying from Ruby • Paginating results
  • 19. Index Function function(doc) { // first check if doc is a person document! ... var ret=new Document(); ret.add(doc.name); ret.add(doc.email); ret.add(doc.notes); ret.add(doc.name, {field:“name“, store:“yes“}); ret.add(doc.email, {field:“email“, store:“yes“}); return ret; }
  • 20. Index Function function(doc) { // first check if doc is a person document! ... var ret=new Document(); } content added to ret.add(doc.name); ret.add(doc.email); ret.add(doc.notes); „default“ field ret.add(doc.name, {field:“name“, store:“yes“}); ret.add(doc.email, {field:“email“, store:“yes“}); return ret; }
  • 21. Index Function function(doc) { // first check if doc is a person document! ... var ret=new Document(); } content added to ret.add(doc.name); ret.add(doc.email); ret.add(doc.notes); „default“ field ret.add(doc.name, {field:“name“, store:“yes“}); ret.add(doc.email, {field:“email“, store:“yes“}); return ret; content added to } named fields
  • 22. Field Options name description available options the field name to index field user-defined under date, double, float, int, long, type the type of the field string whether the data is stored. store The value will be returned yes, no in the search result analyzed, whether (and how) the data analyzed_no_norms, no, index is indexed not_analyzed, not_analyzed_no_norms
  • 23. Querying the Index I http://localhost:5984/mydb/_fti/search/ global?q=couchdb { "q": "default:couchdb", "etag": "119e498956048ea8", "skip": 0, "limit": 25, "total_rows": 1, "search_duration": 0, "fetch_duration": 8, "rows": [ { "id": "9db68c69726e486b811859937fbb6b09", "score": 4.520571708679199, "fields": { "name": "Martin Rehfeld", "email": "martin.rehfeld@glnetworks.de", } } ] }
  • 24. Querying the Index I http://localhost:5984/mydb/_fti/search/ global?q=couchdb default field { "q": "default:couchdb", is queried "etag": "119e498956048ea8", "skip": 0, "limit": 25, "total_rows": 1, "search_duration": 0, "fetch_duration": 8, "rows": [ { "id": "9db68c69726e486b811859937fbb6b09", "score": 4.520571708679199, "fields": { "name": "Martin Rehfeld", "email": "martin.rehfeld@glnetworks.de", } } ] }
  • 25. Querying the Index I http://localhost:5984/mydb/_fti/search/ global?q=couchdb default field { "q": "default:couchdb", is queried Content of fields "etag": "119e498956048ea8", "skip": 0, "limit": 25, with store:“yes“ "total_rows": 1, "search_duration": 0, option are returned "fetch_duration": 8, "rows": [ with the query { results "id": "9db68c69726e486b811859937fbb6b09", "score": 4.520571708679199, "fields": { "name": "Martin Rehfeld", "email": "martin.rehfeld@glnetworks.de", } } ] }
  • 26. Querying the Index II http://localhost:5984/mydb/_fti/search/ global?q=name:rehfeld { "q": "name:rehfeld", "etag": "119e498956048ea8", "skip": 0, "limit": 25, "total_rows": 1, "search_duration": 0, "fetch_duration": 8, "rows": [ { "id": "9db68c69726e486b811859937fbb6b09", "score": 4.520571708679199, "fields": { "name": "Martin Rehfeld", "email": "martin.rehfeld@glnetworks.de", } } ] }
  • 27. Querying the Index II http://localhost:5984/mydb/_fti/search/ global?q=name:rehfeld { "q": "name:rehfeld", name field "etag": "119e498956048ea8", "skip": 0, "limit": 25, is queried "total_rows": 1, "search_duration": 0, "fetch_duration": 8, "rows": [ { "id": "9db68c69726e486b811859937fbb6b09", "score": 4.520571708679199, "fields": { "name": "Martin Rehfeld", "email": "martin.rehfeld@glnetworks.de", } } ] }
  • 28. Querying from Ruby class Search include HTTParty base_uri "localhost:5984/#{CouchPotato::Config.database_name}/_fti/search" format :json def self.query(options = {}) index = options.delete(:index) get("/#{index}", :query => options) end end
  • 29. Controller / Pagination class SearchController < ApplicationController HITS_PER_PAGE = 10 def index result = Search.query(params.merge(:skip => skip, :limit => HITS_PER_PAGE)) @hits = WillPaginate::Collection.create(params[:page] || 1, HITS_PER_PAGE, result['total_rows']) do |pager| pager.replace(result['rows']) end end private def skip params[:page] ? (params[:page].to_i - 1) * HITS_PER_PAGE : 0 end end
  • 30. Resources • http://couchdb.apache.org/ • http://lucene.apache.org/java/docs/index.html • http://github.com/rnewson/couchdb-lucene • http://lucene.apache.org/java/3_0_1/ queryparsersyntax.html
  • 31. Q &A ! Martin Rehfeld http://inside.glnetworks.de martin.rehfeld@glnetworks.de @klickmich

Editor's Notes

  1. short recap of what CouchDB is
  2. some (very) limited examples are actually floating around
  3. mapping all documents, split them into words, push through a stemmer, and cross-index them with the documents containing them
  4. ... multiple times, in fact
  5. add all searchable content to the default field, add fields for searching by individual field or using contents in view
  6. the stored field contents can be used to render search results without touching CouchDB
  7. the stored field contents can be used to render search results without touching CouchDB
  8. could be as simple as that (using the httparty gem &amp; Couch Potato) sans error handling
  9. using the Search class in an controller + pagination; utilizing the will_paginate gem