Search Evolution - Von Lucene zu Solr und ElasticSearch (Majug 20.06.2013)

1,585 views

Published on

Slides zum Vortrag über Lucene, Solr und ElasticSearch bei der Java User Group Mannheim am 20.06.2013

Published in: Technology, Business
0 Comments
2 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,585
On SlideShare
0
From Embeds
0
Number of Embeds
38
Actions
Shares
0
Downloads
9
Comments
0
Likes
2
Embeds 0
No embeds

No notes for slide

Search Evolution - Von Lucene zu Solr und ElasticSearch (Majug 20.06.2013)

  1. 1. Search Evolution – von Lucenezu Solr und ElasticSearchFlorian Hopf@fhopfhttp://www.florian-hopf.de20.06.2013
  2. 2. IndexIndizierenSuchenIndex
  3. 3. Term Document IdIndex
  4. 4. http://www.flickr.com/photos/quinnanya/5196951914/Analyzing
  5. 5. SuchEvolution -Von Lucenezu Solr undElasticSearchVerteiltesSuchen mitElasticsearchAnalyzing
  6. 6. Term Document IdSuch 1Evolution 1Von 1Lucene 1zu 1Solr 1und 1ElasticSearch 1Verteiltes 2Suchen 2mit 2Elasticsearch 21. TokenizationSuchEvolution -Von Lucenezu Solr undElasticSearchVerteiltesSuchen mitElasticsearchAnalyzing
  7. 7. Term Document Idsuch 1evolution 1von 1lucene 1zu 1solr 1und 1elasticsearch 1,2verteiltes 2suchen 2mit 21. Tokenization2. LowercasingSuchEvolution -Von Lucenezu Solr undElasticSearchVerteiltesSuchen mitElasticsearchAnalyzing
  8. 8. Term Document Idsuch 1,2evolution 1von 1luc 1zu 1solr 1und 1elasticsearch 1,2verteilt 2mit 21. Tokenization2. Lowercasing3. StemmingSuchEvolution -Von Lucenezu Solr undElasticSearchVerteiltesSuchen mitElasticsearchAnalyzing
  9. 9. Inverted Index
  10. 10. Analyzer
  11. 11. datenbank OR DBtitle:elasticsearch"apache lucene"speaker:hopp~elastic* AND date:[20130101 TO 20130501]Query Syntax
  12. 12. Relevance
  13. 13. http://www.ibm.com/developerworks/java/library/os-apache-lucenesearch/
  14. 14. DocumentField Name 1 Value 1title ValueDocumentsDocumentField Name 1 Value 1title Integration ganz einfach mit Apache CamelField Name 1 Value 1title ValueField Name 1 Value 1speaker Florian HopfField Name 1 Value 1title ValueField Name 1 Value 1date 20130620Field Name 1 Value 1title ValueField Name 1 Value 1title Integration ganz einfach mit Apache CamelField Name 1 Value 1title ValueField Name 1 Value 1title Such-Evolution
  15. 15. AttributesStoreYES NO
  16. 16. AttributesIndexANALYZED YES NOTextField StoredFieldStringField
  17. 17. Document doc = new Document();doc.add(new TextField("title","Suchen und Finden mit Lucene und Solr",Field.Store.YES));doc.add(new StoredField("speaker","Florian Hopf"));doc.add(new StringField("date","20120704",Field.Store.YES));Indexing
  18. 18. Directory dir = FSDirectory.open(new File("/tmp/testindex"));IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43,new GermanAnalyzer(Version.LUCENE_43));IndexWriter writer = new IndexWriter(dir, config);writer.addDocument(doc);writer.commit();Indexing
  19. 19. IndexReader reader = IndexReader.open(dir);IndexSearcher searcher = new IndexSearcher(reader);QueryParser parser = new QueryParser(Version.LUCENE_43,"title",new GermanAnalyzer(Version.LUCENE_43));Query query = parser.parse("suche");TopDocs result = searcher.search(query, 10);assertEquals(1, result.totalHits);int id = result.scoreDocs[0].doc;Document doc = searcher.doc(id);String title = doc.get("title");assertEquals("Suchen und Finden mit Lucene und Solr",title);Searching
  20. 20. WebappWebappClient Solr LucenehttpXML, JSON, JavaBin, Ruby, ...
  21. 21. schema.xmlField TypesFieldsSchema
  22. 22. <fieldType name="text_de" class="solr.TextField"><analyzer><tokenizerclass="solr.StandardTokenizerFactory"/><filterclass="solr.LowerCaseFilterFactory"/><filterclass="solr.GermanLightStemFilterFactory"/></analyzer></fieldType>Schema
  23. 23. <fields><field name="title" type="text_de"indexed="true" stored="true"/><field name="speaker" type="string"indexed="true" stored="true"multiValued="true"/><field name="speaker_search" type="text_ws"indexed="true" stored="false"multiValued="true"/>[...]</fields><copyField source="speaker" dest="speaker_search"/>Schema
  24. 24. SolrInputDocument document =new SolrInputDocument();document.addField("path", "/tmp/foo");document.addField("title","Suchen und Finden mit Lucene und Solr");document.addField("speaker", "Florian Hopf");SolrServer server =new HttpSolrServer("http://localhost:8080");server.add(document);server.commit();Indexing
  25. 25. solrconfig.xmlLucene ConfigCachesRequest HandlerSearch ComponentsSolrconfig
  26. 26. <requestHandler name="/jug"class="solr.SearchHandler"><lst name="defaults"><int name="rows">10</int><str name="q.op">AND</str><str name="q.alt">*:*</str><str name="defType">edismax</str><str name="qf">contenttitle^1.5speaker_search</str></lst></requestHandler>Solrconfig
  27. 27. SolrQuery solrQuery = new SolrQuery("suche");solrQuery.setQueryType("/jug");QueryResponse response = server.query(solrQuery);assertEquals(1, response.getResults().size());SolrDocument result = response.getResults().get(0);assertEquals("Suchen und Finden mit Lucene und Solr",result.get("title"));assertEquals("Florian Hopf",result.getFirstValue("speaker"));Searching
  28. 28. ...solrQuery.setFacet(true);solrQuery.addFacetField("speaker");QueryResponse response = server.query(solrQuery);List<FacetField.Count> speakerFacet =response.getFacetField("speaker").getValues();assertEquals(1, speakerFacet.get(0).getCount());assertEquals("Florian Hopf",speakerFacet.get(0).getName());Faceting
  29. 29. curl -XPOST http://localhost:9200/jug/talk/ -d {"speaker" :"Florian Hopf","date" :"2012-07-04T19:15:00","title" :"Suchen und Finden mit Lucene und Solr"}{"ok":true,"_index":"jug","_type":"talk","_id":"CeltdivQRGSvLY_dBZv1jw","_version":1}Indexing
  30. 30. curl -XPUT http://host/jug/talk/_mapping -d {"talk" : {"properties" : {"title" : {"type" : "string","analyzer" : "german"}}}}Mapping
  31. 31. curl -XGET http://host/jug/talk/_search?q=title:suche{...},"hits":{"total":1,"max_score":0.054244425,"hits":[{...,"_score":0.054244425,"_source" : {"speaker" :"Florian Hopf","date" :"2012-07-04T19:15:00","title":"Suchen und Finden mit Lucene und Solr"}}}Searching
  32. 32. curl -XGEThttp://localhost:9200/jug/talk/_search -d {"query" : {"query_string" : {"query" : "suche"}},"facets" : {"tags" : {"terms" : {"field" : "speaker"}}}}Searching
  33. 33. QueryBuilder query = queryString("suche");TermsFacetBuilder facet =termsFacet("speaker").field("speaker");SearchResponse response =esClient.prepareSearch("jug").addFacet(facet).setQuery(query).execute().actionGet();assertEquals(1, response.getHits().getTotalHits());Searching
  34. 34. Verteilung
  35. 35. Verteilung
  36. 36. Suggestions
  37. 37. Geo-Suche
  38. 38. http://lucene.apache.orghttp://lucene.apache.org/solr/http://elasticsearch.orghttps://github.com/fhopf/lucene-solr-talk@fhopfmail@florian-hopf.dehttp://blog.florian-hopf.de
  39. 39. 26.06. Heiko W. Rupp„RHQ - aktuelle und kommende Entwicklungen“„Testen von Hypermedia-APIs mit RestAssured“10.07. Gerrit Grunwald„Wie Phoenix aus der Asche…JavaFX 2.0“25.09. Bernd Rücker„Open Source BPM mit BPMN 2.0 und Java“http://jug-ka.de
  40. 40. ● http://www.morguefile.com/archive/display/3470● http://www.flickr.com/photos/quinnanya/5196951914/Quinn Dombrowski● http://www.morguefile.com/archive/display/695239● http://www.morguefile.com/archive/display/93433● http://www.morguefile.com/archive/display/811746● http://www.morguefile.com/archive/display/12965● http://www.morguefile.com/archive/display/181488● http://www.morguefile.com/archive/display/826258Images

×