Search Evolution - From Lucene to Solr and ElasticSearch (Majug 20.06.2013)

Slides for the talk on Lucene, Solr and ElasticSearch at the Java User Group Mannheim on 20.06.2013

Published in: Technology, Business

  1. Search Evolution: From Lucene to Solr and ElasticSearch. Florian Hopf, @fhopf, http://www.florian-hopf.de, 20.06.2013
  2. Index: Indexing, Searching
  3. Index: Term, Document Id
  4. Analyzing (image: http://www.flickr.com/photos/quinnanya/5196951914/)
  5. Analyzing: two example documents, "Such-Evolution - Von Lucene zu Solr und ElasticSearch" (document 1, "Search Evolution - From Lucene to Solr and ElasticSearch") and "Verteiltes Suchen mit Elasticsearch" (document 2, "Distributed Search with Elasticsearch")
  6. Analyzing, step 1 (Tokenization). Term (Document Id): Such (1), Evolution (1), Von (1), Lucene (1), zu (1), Solr (1), und (1), ElasticSearch (1), Verteiltes (2), Suchen (2), mit (2), Elasticsearch (2)
  7. Analyzing, steps 1 (Tokenization) and 2 (Lowercasing). Term (Document Id): such (1), evolution (1), von (1), lucene (1), zu (1), solr (1), und (1), elasticsearch (1, 2), verteiltes (2), suchen (2), mit (2)
  8. Analyzing, steps 1 (Tokenization), 2 (Lowercasing) and 3 (Stemming). Term (Document Id): such (1, 2), evolution (1), von (1), luc (1), zu (1), solr (1), und (1), elasticsearch (1, 2), verteilt (2), mit (2)
  9. Inverted Index
  10. Analyzer
  11. Query Syntax:
        datenbank OR DB
        title:elasticsearch
        "apache lucene"
        speaker:hopp~
        elastic* AND date:[20130101 TO 20130501]
  12. Relevance
  13. http://www.ibm.com/developerworks/java/library/os-apache-lucenesearch/
  14. Documents: a Document holds Fields (Name, Value). Example fields: title = Integration ganz einfach mit Apache Camel; speaker = Florian Hopf; date = 20130620; title = Such-Evolution
  15. Attributes: Store (YES, NO)
  16. Attributes: Index (ANALYZED, YES, NO); field classes: TextField, StoredField, StringField
  17. Indexing:
        Document doc = new Document();
        doc.add(new TextField("title", "Suchen und Finden mit Lucene und Solr", Field.Store.YES));
        doc.add(new StoredField("speaker", "Florian Hopf"));
        doc.add(new StringField("date", "20120704", Field.Store.YES));
  18. Indexing:
        Directory dir = FSDirectory.open(new File("/tmp/testindex"));
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, new GermanAnalyzer(Version.LUCENE_43));
        IndexWriter writer = new IndexWriter(dir, config);
        writer.addDocument(doc);
        writer.commit();
  19. Searching:
        IndexReader reader = IndexReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);
        QueryParser parser = new QueryParser(Version.LUCENE_43, "title", new GermanAnalyzer(Version.LUCENE_43));
        Query query = parser.parse("suche");
        TopDocs result = searcher.search(query, 10);
        assertEquals(1, result.totalHits);
        int id = result.scoreDocs[0].doc;
        Document doc = searcher.doc(id);
        String title = doc.get("title");
        assertEquals("Suchen und Finden mit Lucene und Solr", title);
  20. Diagram: Client, http (XML, JSON, JavaBin, Ruby, ...), Webapp (Solr, Lucene)
  21. Schema: schema.xml, Field Types, Fields
  22. Schema:
        <fieldType name="text_de" class="solr.TextField">
          <analyzer>
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.GermanLightStemFilterFactory"/>
          </analyzer>
        </fieldType>
  23. Schema:
        <fields>
          <field name="title" type="text_de" indexed="true" stored="true"/>
          <field name="speaker" type="string" indexed="true" stored="true" multiValued="true"/>
          <field name="speaker_search" type="text_ws" indexed="true" stored="false" multiValued="true"/>
          [...]
        </fields>
        <copyField source="speaker" dest="speaker_search"/>
  24. Indexing:
        SolrInputDocument document = new SolrInputDocument();
        document.addField("path", "/tmp/foo");
        document.addField("title", "Suchen und Finden mit Lucene und Solr");
        document.addField("speaker", "Florian Hopf");
        SolrServer server = new HttpSolrServer("http://localhost:8080");
        server.add(document);
        server.commit();
  25. Solrconfig: solrconfig.xml, Lucene Config, Caches, Request Handler, Search Components
  26. Solrconfig:
        <requestHandler name="/jug" class="solr.SearchHandler">
          <lst name="defaults">
            <int name="rows">10</int>
            <str name="q.op">AND</str>
            <str name="q.alt">*:*</str>
            <str name="defType">edismax</str>
            <str name="qf">content title^1.5 speaker_search</str>
          </lst>
        </requestHandler>
  27. Searching:
        SolrQuery solrQuery = new SolrQuery("suche");
        solrQuery.setQueryType("/jug");
        QueryResponse response = server.query(solrQuery);
        assertEquals(1, response.getResults().size());
        SolrDocument result = response.getResults().get(0);
        assertEquals("Suchen und Finden mit Lucene und Solr", result.get("title"));
        assertEquals("Florian Hopf", result.getFirstValue("speaker"));
  28. Faceting:
        ...
        solrQuery.setFacet(true);
        solrQuery.addFacetField("speaker");
        QueryResponse response = server.query(solrQuery);
        List<FacetField.Count> speakerFacet = response.getFacetField("speaker").getValues();
        assertEquals(1, speakerFacet.get(0).getCount());
        assertEquals("Florian Hopf", speakerFacet.get(0).getName());
  29. Indexing:
        curl -XPOST http://localhost:9200/jug/talk/ -d '{
          "speaker" : "Florian Hopf",
          "date" : "2012-07-04T19:15:00",
          "title" : "Suchen und Finden mit Lucene und Solr"
        }'
        {"ok":true,"_index":"jug","_type":"talk","_id":"CeltdivQRGSvLY_dBZv1jw","_version":1}
  30. Mapping:
        curl -XPUT http://host/jug/talk/_mapping -d '{
          "talk" : {
            "properties" : {
              "title" : { "type" : "string", "analyzer" : "german" }
            }
          }
        }'
  31. Searching:
        curl -XGET http://host/jug/talk/_search?q=title:suche
        {...},"hits":{"total":1,"max_score":0.054244425,"hits":[{...,"_score":0.054244425,"_source" : {"speaker" :"Florian Hopf","date" :"2012-07-04T19:15:00","title":"Suchen und Finden mit Lucene und Solr"}}}
  32. Searching:
        curl -XGET http://localhost:9200/jug/talk/_search -d '{
          "query" : { "query_string" : { "query" : "suche" } },
          "facets" : { "tags" : { "terms" : { "field" : "speaker" } } }
        }'
  33. Searching:
        QueryBuilder query = queryString("suche");
        TermsFacetBuilder facet = termsFacet("speaker").field("speaker");
        SearchResponse response = esClient.prepareSearch("jug")
            .addFacet(facet)
            .setQuery(query)
            .execute().actionGet();
        assertEquals(1, response.getHits().getTotalHits());
  34. Distribution
  35. Distribution
  36. Suggestions
  37. Geo search
  38. Links: http://lucene.apache.org, http://lucene.apache.org/solr/, http://elasticsearch.org, https://github.com/fhopf/lucene-solr-talk. Contact: @fhopf, mail@florian-hopf.de, http://blog.florian-hopf.de
  39. Upcoming talks (http://jug-ka.de): 26.06. Heiko W. Rupp, „RHQ - aktuelle und kommende Entwicklungen“, „Testen von Hypermedia-APIs mit RestAssured“; 10.07. Gerrit Grunwald, „Wie Phoenix aus der Asche…JavaFX 2.0“; 25.09. Bernd Rücker, „Open Source BPM mit BPMN 2.0 und Java“
  40. Images: http://www.morguefile.com/archive/display/3470, http://www.flickr.com/photos/quinnanya/5196951914/ (Quinn Dombrowski), http://www.morguefile.com/archive/display/695239, http://www.morguefile.com/archive/display/93433, http://www.morguefile.com/archive/display/811746, http://www.morguefile.com/archive/display/12965, http://www.morguefile.com/archive/display/181488, http://www.morguefile.com/archive/display/826258
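
The analysis steps on slides 6 and 7 (tokenization, lowercasing) and the inverted index they produce can be sketched in plain Java without Lucene. This is a minimal illustration, not how Lucene implements it: the class `InvertedIndexSketch`, its methods and the regex-based tokenizer are assumptions made for this example. Stemming (slide 8) is left out, because a faithful German stemmer does not fit in a short sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical toy class for illustration; not part of Lucene.
class InvertedIndexSketch {

    // Maps each term to the sorted ids of the documents that contain it.
    private final Map<String, SortedSet<Integer>> postings = new TreeMap<>();

    // Toy analysis chain: split on anything that is not a letter or digit,
    // then lowercase. Real analyzers (e.g. GermanAnalyzer) also stem.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Indexing: analyze the text and record the document id for every term.
    void add(int docId, String text) {
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<Integer>()).add(docId);
        }
    }

    // Searching: analyze the query with the same chain and intersect the
    // posting lists, so every query term must match (AND semantics).
    SortedSet<Integer> search(String query) {
        SortedSet<Integer> result = null;
        for (String term : analyze(query)) {
            SortedSet<Integer> docs = postings.getOrDefault(term, new TreeSet<Integer>());
            if (result == null) {
                result = new TreeSet<Integer>(docs);
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new TreeSet<Integer>() : result;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Such-Evolution - Von Lucene zu Solr und ElasticSearch");
        index.add(2, "Verteiltes Suchen mit Elasticsearch");
        System.out.println(index.search("ElasticSearch")); // prints [1, 2]
        System.out.println(index.search("Lucene"));        // prints [1]
    }
}
```

Because the query goes through the same analysis as the documents, "ElasticSearch" and "Elasticsearch" end up as the same term, which is the effect slide 7 shows for document ids 1 and 2.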

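
Slide 12 names relevance without giving a formula. As a rough illustration of why a match on a rare term ranks higher than a match on a common one, here is a simplified TF-IDF sketch in plain Java. The class `TfIdfSketch` and its methods are hypothetical. The tf shape (square root of the term frequency) and the idf shape (1 + ln(numDocs / (docFreq + 1))) follow what is commonly documented for Lucene's classic TF-IDF similarity, but treat them as an assumption here; real scoring also involves length normalization and boosts.

```java
// Hypothetical toy class for illustration; not part of Lucene.
class TfIdfSketch {

    // Term frequency factor: grows with the number of occurrences, but sublinearly.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: terms that occur in few documents weigh more.
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Simplified score of one term in one document (no norms, no boosts).
    static double score(int freq, int docFreq, int numDocs) {
        return tf(freq) * idf(docFreq, numDocs);
    }

    public static void main(String[] args) {
        int numDocs = 100;
        // Same frequency inside the document, but the first term is much rarer in the index.
        double rare = score(1, 2, numDocs);
        double common = score(1, 80, numDocs);
        System.out.println(rare > common); // prints true
    }
}
```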