Apache Solr

Presentation on Apache Solr given at the second edition of DevShare, held at Bluesoft's headquarters.

Published in: Technology

Apache Solr

  1. Apache Solr. Rafael Valério, Gustavo Vinícius. Monday, February 4, 2013
  2. Ultra-fast search server based on Lucene*
     * A high-performance search engine library written in Java.
  3. How does it work?
  4. How does it work?
  5. Two processes: #1 Indexing, #2 Search
  6. #1 Indexing, #2 Search
  7. FIELD ANALYZERS: #1 Indexing, #2 Search
  8. FIELD ANALYZERS
  9. FIELD ANALYZERS
  10. Text → FIELD ANALYZERS
  11. Text → FIELD ANALYZERS → Tokens
  12. FIELD ANALYZERS
  13. FIELD ANALYZERS
  14. FIELD ANALYZERS
  15. FIELD ANALYZERS: TOKENIZERS, FILTERS
  16. TOKENIZERS: break a field's data into lexical units, or tokens.
  17. FILTERS: examine a stream of tokens and keep, transform, discard, or create tokens.
  18. TOKENIZER → FILTER → FILTER → FILTER → FILTER
  19. TOKENIZER → FILTER → FILTER → FILTER → FILTER
  20. TOKENIZER → FILTER → FILTER → FILTER → FILTER
  21. TOKENIZER → FILTER → FILTER → FILTER → FILTER = ANALYZER
  22. schema.xml

      Simple:

          <fieldType name="nametext" class="solr.TextField">
            <analyzer class="org.apache.lucene.analysis.WhitespaceAnalyzer"/>
          </fieldType>

      Composite:

          <fieldType name="nametext" class="solr.TextField">
            <analyzer>
              <tokenizer class="solr.StandardTokenizerFactory"/>
              <filter class="solr.StandardFilterFactory"/>
              <filter class="solr.LowerCaseFilterFactory"/>
              <filter class="solr.StopFilterFactory"/>
              <filter class="solr.EnglishPorterFilterFactory"/>
            </analyzer>
          </fieldType>
  23. Analysis phases
  24. At index time, a field is created and its tokens are stored in the index.
  25. At query time, the search term is analyzed and compared against the fields in the index.
  26. schema.xml

      Two phases:

          <fieldType name="nametext" class="solr.TextField">
            <analyzer type="index">
              <tokenizer class="solr.StandardTokenizerFactory"/>
              <filter class="solr.LowerCaseFilterFactory"/>
              <filter class="solr.KeepWordFilterFactory" words="keepwords.txt"/>
              <filter class="solr.SynonymFilterFactory" synonyms="syns.txt"/>
            </analyzer>
            <analyzer type="query">
              <tokenizer class="solr.StandardTokenizerFactory"/>
              <filter class="solr.LowerCaseFilterFactory"/>
            </analyzer>
          </fieldType>
  27. Tokenizer examples
  28. Standard Tokenizer

          <analyzer>
            <tokenizer class="solr.StandardTokenizerFactory"/>
          </analyzer>

      IN: Please, email john.doe@foo.com by 03-09, re: m37-xq.
      OUT: "Please", "email", "john.doe", "foo.com", "by", "03", "09", "re", "m37", "xq"
  29. Keyword Tokenizer

          <analyzer>
            <tokenizer class="solr.KeywordTokenizerFactory"/>
          </analyzer>

      IN: Please, email john.doe@foo.com by 03-09, re: m37-xq.
      OUT: "Please, email john.doe@foo.com by 03-09, re: m37-xq."
  30. Thank you! rafael@webgoal.com.br (@rafaelvalerio), gustavo@webgoal.com.br (@gustavovncius)
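
The tokenizer-plus-filters chain described in slides 15 to 21 can be modeled in a few lines of Python. This is only a conceptual sketch, not how Solr or Lucene implement analysis: the function names, the stop-word list, and the synonym map are all hypothetical and chosen for illustration. It shows one tokenizer feeding a sequence of filters, where each filter keeps, transforms, discards, or creates tokens.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical data for illustration only (not Solr defaults).
STOP_WORDS = {"a", "an", "the", "of", "by"}
SYNONYMS = {"fast": ["quick"]}


def whitespace_tokenizer(text: str) -> List[str]:
    """Tokenizer: split the field text on whitespace."""
    return text.split()


def keyword_tokenizer(text: str) -> List[str]:
    """Tokenizer: treat the whole field text as a single token."""
    return [text]


def lowercase_filter(tokens: Iterable[str]) -> Iterator[str]:
    """Filter that transforms every token."""
    return (t.lower() for t in tokens)


def stop_filter(tokens: Iterable[str]) -> Iterator[str]:
    """Filter that discards stop words and keeps everything else."""
    return (t for t in tokens if t not in STOP_WORDS)


def synonym_filter(tokens: Iterable[str]) -> Iterator[str]:
    """Filter that creates extra tokens for known synonyms."""
    for t in tokens:
        yield t
        yield from SYNONYMS.get(t, [])


def make_analyzer(tokenizer: Callable[[str], List[str]], *filters) -> Callable[[str], List[str]]:
    """Analyzer = one tokenizer followed by zero or more filters, applied in order."""
    def analyze(text: str) -> List[str]:
        tokens: Iterable[str] = tokenizer(text)
        for token_filter in filters:
            tokens = token_filter(tokens)
        return list(tokens)
    return analyze


analyzer = make_analyzer(whitespace_tokenizer, lowercase_filter, stop_filter, synonym_filter)
print(analyzer("The Fast Server"))                 # ['fast', 'quick', 'server']
print(make_analyzer(keyword_tokenizer)("A b, c"))  # ['A b, c']
```

Filter order matters in this sketch just as it does in a `schema.xml` chain: placing `stop_filter` before `lowercase_filter` would let the capitalized "The" slip through, because the hypothetical stop list is lowercase.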

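
The two analysis phases from slides 23 to 26 can be illustrated with a toy inverted index in Python. This is a simplified sketch under stated assumptions, not Solr's actual data structures: the `TinyIndex` class, the synonym map, and the AND-style matching are all hypothetical. As in slide 26, synonyms are expanded only at index time, while the query analyzer only tokenizes and lowercases.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical synonym map, standing in for a synonyms file.
SYNONYMS = {"tv": ["television"]}


def index_analyzer(text: str) -> List[str]:
    """Index-time chain: whitespace split, lowercase, synonym expansion."""
    tokens: List[str] = []
    for token in text.split():
        token = token.lower()
        tokens.append(token)
        tokens.extend(SYNONYMS.get(token, []))
    return tokens


def query_analyzer(text: str) -> List[str]:
    """Query-time chain: whitespace split and lowercase only."""
    return [token.lower() for token in text.split()]


class TinyIndex:
    """Toy inverted index mapping each token to the ids of documents containing it."""

    def __init__(self) -> None:
        self.postings: Dict[str, Set[int]] = defaultdict(set)

    def add(self, doc_id: int, text: str) -> None:
        # Index time: analyze the field and store its tokens.
        for token in index_analyzer(text):
            self.postings[token].add(doc_id)

    def search(self, query: str) -> List[int]:
        # Query time: analyze the search term and compare it with stored tokens.
        # Every query token must match (AND semantics, an assumption of this sketch).
        matches = [self.postings.get(token, set()) for token in query_analyzer(query)]
        if not matches:
            return []
        return sorted(set.intersection(*matches))


index = TinyIndex()
index.add(1, "Samsung TV")
index.add(2, "Sony Television")

print(index.search("Television"))  # [1, 2]
print(index.search("TV"))          # [1]
print(index.search("sony tv"))     # []
```

Because the expansion happens only when documents are indexed, a query for "Television" finds the document that says "TV", but a query for "TV" does not find the document that only says "Television". Which phase receives the synonym filter is a design choice that trades index size against query-time work.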