Search Evolution – von Lucene zu Solr und ElasticSearch (Search Evolution: from Lucene to Solr and ElasticSearch)
Florian Hopf, @fhopf, http://www.florian-hopf.de
Bedcon 2013 talk on the fundamentals of search, Lucene, Solr and ElasticSearch

Published in: Technology
Transcript of "Search Evolution - Von Lucene zu Solr und ElasticSearch"

  1. Search Evolution – von Lucene zu Solr und ElasticSearch (Search Evolution: from Lucene to Solr and ElasticSearch). Florian Hopf, @fhopf, http://www.florian-hopf.de
  2. Index (diagram labels: indexing, index, searching)
  3. Index (table columns: Term, Document Id)
  4. Analyzing (image: http://www.flickr.com/photos/quinnanya/5196951914/)
  5. Analyzing. Two example documents:
       Document 1: "Such Evolution - Von Lucene zu Solr und ElasticSearch"
       Document 2: "Verteiltes Suchen mit Elasticsearch"
  6. Analyzing. Step 1, tokenization, applied to both example documents:
       Term             Document Id
       Such             1
       Evolution        1
       Von              1
       Lucene           1
       zu               1
       Solr             1
       und              1
       ElasticSearch    1
       Verteiltes       2
       Suchen           2
       mit              2
       Elasticsearch    2
  7. Analyzing. Step 1, tokenization; step 2, lowercasing:
       Term             Document Id
       such             1
       evolution        1
       von              1
       lucene           1
       zu               1
       solr             1
       und              1
       elasticsearch    1,2
       verteiltes       2
       suchen           2
       mit              2
  8. Analyzing. Step 1, tokenization; step 2, lowercasing; step 3, stemming:
       Term             Document Id
       such             1,2
       evolution        1
       von              1
       luc              1
       zu               1
       solr             1
       und              1
       elasticsearch    1,2
       verteilt         2
       mit              2
  9. Inverted Index
  10. Analyzer
  11. Query Syntax
       datenbank OR DB
       title:elasticsearch
       "apache lucene"
       speaker:hopp~
       elastic* AND date:[20130101 TO 20130501]
  12. Relevance
  13. http://www.ibm.com/developerworks/java/library/os-apache-lucenesearch/
  14. Documents. A Document consists of Fields, each with a Name and a Value:
       Name       Value
       title      Verteiltes Suchen mit Elasticsearch
       date       20130404
       speaker    Dr. Halil-Cem Gürsoy
      (The overlaid diagram text also contains a second title value, apparently "Integration ganz einfach mit Apache Camel".)
  15. Attributes. Index: ANALYZED, NOT_ANALYZED, NO, ...
  16. Attributes. Store: YES, NO
  17. Indexing
       Document es = new Document();
       es.add(new Field("title",
           "Verteiltes Suchen mit Elasticsearch",
           Field.Store.YES, Field.Index.ANALYZED));
       es.add(new Field("date", "20130404",
           Field.Store.NO, Field.Index.ANALYZED));
       es.add(new Field("speaker", "Dr. Halil-Cem Gürsoy",
           Field.Store.YES, Field.Index.ANALYZED));
  18. Indexing
       Directory dir = FSDirectory.open(
           new File("/tmp/testindex"));
       IndexWriterConfig config = new IndexWriterConfig(
           Version.LUCENE_36,
           new GermanAnalyzer(Version.LUCENE_36));
       IndexWriter writer = new IndexWriter(dir, config);
       writer.addDocument(es);
       writer.commit();
  19. Searching
       IndexReader reader = IndexReader.open(dir);
       IndexSearcher searcher = new IndexSearcher(reader);
       QueryParser parser = new QueryParser(
           Version.LUCENE_36, "title",
           new GermanAnalyzer(Version.LUCENE_36));
       Query query = parser.parse("suche");
       TopDocs result = searcher.search(query, 10);
       assertEquals(1, result.totalHits);
       int id = result.scoreDocs[0].doc;
       Document doc = searcher.doc(id);
       String title = doc.get("title");
       assertEquals(
           "Verteiltes Suchen mit Elasticsearch", title);
  20. Webapp (diagram): a client talks over http (XML, JSON, JavaBin, Ruby, ...) to the webapp, which contains Solr on top of Lucene
  21. Config (diagram): Solr Home contains conf (apparently schema.xml and solrconfig.xml) and data (Lucene)
  22. Schema. schema.xml: Field Types, Fields
  23. Schema
       <fieldType name="text_de" class="solr.TextField">
         <analyzer>
           <tokenizer class="solr.StandardTokenizerFactory"/>
           <filter class="solr.LowerCaseFilterFactory"/>
           <filter class="solr.GermanLightStemFilterFactory"/>
         </analyzer>
       </fieldType>
  24. Schema
       <fields>
         <field name="title" type="text_de"
                indexed="true" stored="true"/>
         <field name="speaker" type="string"
                indexed="true" stored="true" multiValued="true"/>
         <field name="speaker_search" type="text_ws"
                indexed="true" stored="false" multiValued="true"/>
         [...]
       </fields>
       <copyField source="speaker" dest="speaker_search"/>
  25. Indexing
       SolrInputDocument document = new SolrInputDocument();
       document.addField("path", "/tmp/foo");
       document.addField("title",
           "Verteiltes Suchen mit Elasticsearch");
       document.addField("speaker", "Dr. Halil-Cem Gürsoy");
       SolrServer server =
           new HttpSolrServer("http://localhost:8080");
       server.add(document);
       server.commit();
  26. Solrconfig. solrconfig.xml: Lucene Config, Caches, Request Handler, Search Components
  27. Solrconfig
       <requestHandler name="/bedcon" class="solr.SearchHandler">
         <lst name="defaults">
           <int name="rows">10</int>
           <str name="q.op">AND</str>
           <str name="q.alt">*:*</str>
           <str name="defType">edismax</str>
           <str name="qf">
             content title^1.5 speaker_search
           </str>
         </lst>
       </requestHandler>
  28. Searching
       SolrQuery solrQuery = new SolrQuery("suche");
       solrQuery.setQueryType("/bedcon");
       QueryResponse response = server.query(solrQuery);
       assertEquals(1, response.getResults().size());
       SolrDocument result = response.getResults().get(0);
       assertEquals("Verteiltes Suchen mit Elasticsearch",
           result.get("title"));
       assertEquals("Dr. Halil-Cem Gürsoy",
           result.getFirstValue("speaker"));
  29. Faceting
       ...
       solrQuery.setFacet(true);
       solrQuery.addFacetField("speaker");
       QueryResponse response = server.query(solrQuery);
       List<FacetField.Count> speakerFacet =
           response.getFacetField("speaker").getValues();
       assertEquals(1, speakerFacet.get(0).getCount());
       assertEquals("Dr. Halil-Cem Gürsoy",
           speakerFacet.get(0).getName());
  30. Indexing
       curl -XPOST http://localhost:9200/bedcon/talk/ -d '{
         "speaker" : "Dr. Halil-Cem Gürsoy",
         "date" : "2013-04-04T16:00:00",
         "title" : "Verteiltes Suchen mit Elasticsearch"
       }'
      Response:
       {"ok":true,"_index":"bedcon","_type":"talk","_id":"CeltdivQRGSvLY_dBZv1jw","_version":1}
  31. Mapping
       curl -XPUT http://host/bedcon/talk/_mapping -d '{
         "talk" : {
           "properties" : {
             "title" : {
               "type" : "string",
               "analyzer" : "german"
             }
           }
         }
       }'
  32. Searching
       curl -XGET http://host/bedcon/talk/_search?q=elasticsearch
      Response:
       {...},"hits":{"total":1,"max_score":0.054244425,
         "hits":[{ ...,
           "_score":0.054244425,
           "_source" : {
             "speaker" : "Dr. Halil-Cem Gürsoy",
             "date" : "2013-04-04T16:00:00",
             "title": "Verteiltes Suchen mit Elasticsearch"
           }
         }]
       }}
  33. Searching
       curl -XGET http://localhost:9200/bedcon/talk/_search -d '{
         "query" : {
           "query_string" : {"query" : "elasticsearch"}
         },
         "facets" : {
           "tags" : {
             "terms" : {"field" : "speaker"}
           }
         }
       }'
  34. Searching
       SearchResponse response =
           esClient.prepareSearch("bedcon")
               .addFacet(
                   FacetBuilders.termsFacet("speaker")
                       .field("speaker"))
               .setQuery(
                   QueryBuilders.queryString("elasticsearch"))
               .execute().actionGet();
       assertEquals(1, response.getHits().getTotalHits());
  35. Distribution
  36. Distribution
  37. Links and contact
       http://lucene.apache.org
       http://lucene.apache.org/solr/
       http://elasticsearch.org
       https://github.com/fhopf/lucene-solr-talk
       @fhopf
       mail@florian-hopf.de
       http://blog.florian-hopf.de
  38. Images
       ● http://www.morguefile.com/archive/display/3470
       ● http://www.flickr.com/photos/quinnanya/5196951914/ (Quinn Dombrowski)
       ● http://www.morguefile.com/archive/display/695239
       ● http://www.morguefile.com/archive/display/93433
       ● http://www.morguefile.com/archive/display/811746
       ● http://www.morguefile.com/archive/display/12965
       ● http://www.morguefile.com/archive/display/181488
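
The Analyzing slides (5 to 8) walk through tokenization, lowercasing and stemming, and slide 9 names the resulting structure an inverted index. A minimal sketch of that pipeline in plain Java, without Lucene, might look like the following. The class and method names (InvertedIndexSketch, tokenize, stem, analyze, build, search) are hypothetical, and the stemmer is a toy suffix stripper assumed only for illustration. It is not the GermanAnalyzer or GermanLightStemFilterFactory used in the slides, so some terms come out differently (for instance "lucene" becomes "lucen" here, while slide 8 shows "luc").

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

final class InvertedIndexSketch {

    // Step 1, tokenization: split on every run of characters that are
    // neither letters nor digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Step 3, stemming: toy suffix stripping (an assumption for this sketch,
    // not a real German stemmer). At least three characters are kept.
    static String stem(String token) {
        for (String suffix : new String[] {"en", "es", "e"}) {
            boolean longEnough = token.length() >= suffix.length() + 3;
            if (longEnough && token.endsWith(suffix)) {
                return token.substring(0, token.length() - suffix.length());
            }
        }
        return token;
    }

    // The full analysis chain: tokenize, lowercase (step 2), then stem.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : tokenize(text)) {
            terms.add(stem(token.toLowerCase(Locale.ROOT)));
        }
        return terms;
    }

    // Builds the inverted index: term -> sorted ids of the documents that
    // contain it. Document ids start at 1, as on the slides.
    static Map<String, SortedSet<Integer>> build(List<String> documents) {
        Map<String, SortedSet<Integer>> index = new TreeMap<>();
        for (int i = 0; i < documents.size(); i++) {
            int docId = i + 1;
            for (String term : analyze(documents.get(i))) {
                index.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
        return index;
    }

    // Single-word lookup: the query goes through the same analysis chain
    // as the indexed text, then the term is looked up in the index.
    static SortedSet<Integer> search(Map<String, SortedSet<Integer>> index, String word) {
        List<String> terms = analyze(word);
        if (terms.isEmpty()) {
            return new TreeSet<>();
        }
        return index.getOrDefault(terms.get(0), new TreeSet<>());
    }

    public static void main(String[] args) {
        Map<String, SortedSet<Integer>> index = build(List.of(
                "Such Evolution - Von Lucene zu Solr und ElasticSearch",
                "Verteiltes Suchen mit Elasticsearch"));
        index.forEach((term, ids) -> System.out.println(term + " -> " + ids));
        System.out.println("query 'suche' -> " + search(index, "suche"));
    }
}
```

The point the sketch tries to make is the one behind slide 10: the query has to pass through the same analysis steps as the indexed text, otherwise a query such as "suche" would never meet the indexed term "such".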