Growing with ElasticSearch

To be presented @ RootConf, 2018

  1. Growing with ElasticSearch: Devi A S L @ RootConf, 11th May, 2018
  2. About me
       ● Over a decade of experience in building software
       ● Lead developer/Architect at PowerToFly
  3. Our journey with ElasticSearch
       ● 2014: Launched with Postgres full-text search
       ● 2015: Faceted search with ES v1.4
       ● 2016: Log monitoring system with ELK 2.3
       ● 2017: Analytics pipeline with ELK 5.5
  4. Search for a search engine

       | Feature                             | Postgres v9.3 | Sphinx v2.1 | Solr v4.x | ElasticSearch v1.4 |
       | Full text search                    | ✓             | ✓           | ✓         | ✓                  |
       | Support for facets                  | ❌            | ✓           | ✓         | ✓                  |
       | Cluster ready                       | ❌            | ❌          | Limited   | ✓                  |
       | Search in PDFs                      | ❌            | ❌          | ✓         | ✓                  |
       | REST APIs                           | ❌            | ❌          | ❌        | ✓                  |
       | Nested docs, parent-child relations | ❌            | NA          | Limited   | ✓                  |
       | Powerful and flexible Query DSL     | ❌            | NA          | ❌        | ✓                  |
  5. The goodness of ElasticSearch: a distributed, multitenant-capable, full-text search engine
       ● Built upon battle-tested Lucene
       ● Powerful and flexible Query DSL
       ● Powerful aggregations
       ● REST APIs for everything
       ● Ease with nested documents and parent-child relationships
       ● Suitable ecosystem for data pipelines
  6. What sits where? (architecture diagram; components shown)
       ● Internet
       ● Search Service
       ● ES cluster
       ● Periodic indexing job
       ● Postgres DB: primary datastore for core data (jobs, candidates data)
  7. Log monitoring with ELK
  8. Log monitoring: from a third-party solution to an ELK-based one (diagram; components shown)
       ● Web & worker nodes with filebeat
       ● Logstash
       ● ElasticSearch cluster, with daily indices
       ● Dashboards on Kibana
       ● AWS S3
       ● Logs
  9. Analytics pipeline with ELK stack
  10. Analytics pipeline (diagram; components shown)
       ● Web application
       ● Web nodes with filebeat
       ● Logstash
       ● ElasticSearch cluster, with daily indices
       ● User activity
       ● Kibana dashboards
       ● Recommendation engine
  11. Handling growth
  12. Search performance tuning
       ● Enable the slow query log, which is customizable per index
  13. Document modelling
       ● Avoid nested documents, if you can
  14. Use the scroll API where applicable
       ● Deep pagination is costly with the search API
  15. Manage your indexes
       ● POST /unused_index/_close
       ● POST /index_with_more_segments/_forcemerge
       ● Use the _rollover API to let hot/recent indexes use the best servers
  16. Index performance tuning
       ● Disable indexing, storing, norms, and _source for fields where you don't need them
       ● Use the smallest numeric data type that fits, or make the field a keyword
       ● Optimize the number of primary shards
       ● Use bulk requests, and optimize their size
  17. Summary
       ● The Elastic stack is growing and improving; see if it fits your needs
       ● Defaults are good only to start; know what they are and tune them
       ● Different indexes for different data
       ● Understand your needs and model your documents well
  18. Thank You! @asldevi
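Slide 12 recommends enabling the slow query log per index. A minimal sketch of what such a settings update could look like follows, assuming the standard `index.search.slowlog.threshold.*` index settings. The index name `jobs` and the threshold values are illustrative assumptions, not values from the talk. The helper only builds the request; sending it is left to whatever HTTP client is in use.

```python
import json


def slowlog_settings(warn="10s", info="5s", fetch_warn="1s"):
    """Body for PUT /<index>/_settings that turns on the search slow log.

    The thresholds are per index and can be changed on a live index.
    The default values here are placeholders, not recommendations.
    """
    return {
        "index.search.slowlog.threshold.query.warn": warn,
        "index.search.slowlog.threshold.query.info": info,
        "index.search.slowlog.threshold.fetch.warn": fetch_warn,
    }


def slowlog_request(index, **thresholds):
    """Return (method, path, body) for the settings update on one index."""
    return ("PUT", f"/{index}/_settings", slowlog_settings(**thresholds))


if __name__ == "__main__":
    # "jobs" is a hypothetical index name used only for illustration.
    method, path, body = slowlog_request("jobs", warn="2s", info="500ms")
    print(method, path)
    print(json.dumps(body, indent=2))
```

Because the thresholds live in index settings, a busy search index and a rarely queried log index can carry different limits.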
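Slide 13 advises avoiding nested documents where possible. One way to see the trade-off is to simulate what happens to an array of plain (non-nested) objects at index time: the object hierarchy is flattened into multi-valued dotted field names, which is cheap but loses the link between values that came from the same inner object. The function below is only an illustrative simulation of that flattening, not ElasticSearch code, and the sample document is made up.

```python
def flatten(doc, prefix=""):
    """Flatten a document into {dotted_field_name: [values]}.

    Mimics how an array of plain objects ends up as independent
    multi-valued fields, so per-object associations disappear.
    """
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                for sub_name, sub_values in flatten(item, name + ".").items():
                    flat.setdefault(sub_name, []).extend(sub_values)
            else:
                flat.setdefault(name, []).append(item)
    return flat


if __name__ == "__main__":
    # Hypothetical candidate document, for illustration only.
    candidate = {
        "name": "x",
        "skills": [
            {"name": "python", "years": 5},
            {"name": "go", "years": 1},
        ],
    }
    print(flatten(candidate))
```

After flattening, a query for `skills.name = go` and `skills.years = 5` would match this document even though no single skill has both values. If that kind of cross-object match is acceptable, or the data can be modelled so it does not arise, the nested type and its extra cost can be avoided.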
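Slide 14 suggests the scroll API as an alternative to deep pagination. The loop below is a rough sketch of that pattern: open a scroll with the first search, keep asking for the next page with the returned `_scroll_id`, and clear the scroll at the end. The `send` callable is a hypothetical transport supplied by the caller (it takes a method, path, and JSON body and returns the decoded response); it is not part of any ElasticSearch client library.

```python
from typing import Callable, Iterator, Optional

# Hypothetical transport signature: send(method, path, body) -> decoded JSON.
Send = Callable[[str, str, Optional[dict]], dict]


def scroll_hits(send: Send, index: str, query: dict,
                page_size: int = 500, keep_alive: str = "1m") -> Iterator[dict]:
    """Yield every hit for `query`, one scroll page at a time."""
    resp = send("POST", f"/{index}/_search?scroll={keep_alive}",
                {"size": page_size, "query": query})
    scroll_id = resp.get("_scroll_id")
    try:
        while resp["hits"]["hits"]:
            yield from resp["hits"]["hits"]
            resp = send("POST", "/_search/scroll",
                        {"scroll": keep_alive, "scroll_id": scroll_id})
            # The scroll id may change between pages; always use the latest.
            scroll_id = resp.get("_scroll_id", scroll_id)
    finally:
        # Release the search context instead of waiting for it to expire.
        if scroll_id:
            send("DELETE", "/_search/scroll", {"scroll_id": [scroll_id]})
```

Usage would look like `for hit in scroll_hits(send, "jobs", {"match_all": {}}): ...`, where `jobs` is an assumed index name. Scrolling suits batch jobs such as exports or reindexing; it is not meant for serving page N of interactive search results.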
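The index management calls on slide 15 can be sketched as small request builders. The `_close`, `_forcemerge`, and `_rollover` endpoints are the ones named on the slide; the `max_num_segments` parameter and the rollover `conditions` body are standard options for those endpoints. The `logs-YYYY.MM.dd` daily naming scheme, the `logs-write` alias, and the retention window are assumptions made for illustration.

```python
from datetime import date, datetime, timedelta


def close_index(index):
    """Request that closes an index that is no longer queried."""
    return ("POST", f"/{index}/_close", None)


def force_merge(index, max_num_segments=1):
    """Request that merges segments; best used on indices no longer written to."""
    return ("POST", f"/{index}/_forcemerge?max_num_segments={max_num_segments}", None)


def rollover(alias, max_age="1d", max_docs=None):
    """Request that rolls the write alias over once a condition is met."""
    conditions = {"max_age": max_age}
    if max_docs is not None:
        conditions["max_docs"] = max_docs
    return ("POST", f"/{alias}/_rollover", {"conditions": conditions})


def stale_daily_indices(names, today, keep_days, prefix="logs-"):
    """Pick daily indices (assumed '<prefix>YYYY.MM.dd') older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    stale = []
    for name in names:
        if not name.startswith(prefix):
            continue
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # not a daily index under this naming scheme
        if day < cutoff:
            stale.append(name)
    return sorted(stale)


if __name__ == "__main__":
    names = ["logs-2018.05.01", "logs-2018.05.10", "logs-2018.05.11", "jobs"]
    for name in stale_daily_indices(names, date(2018, 5, 11), keep_days=7):
        print(close_index(name))
    print(rollover("logs-write", max_age="1d", max_docs=1000000))
```

Rollover on its own only creates a new index behind the alias. Steering the hot index onto the best servers additionally depends on shard allocation settings, which this sketch does not cover.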
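The first two bullets of slide 16 (disable what you don't need, use small numeric types or keywords) map onto field mapping options. The fragment below is a sketch under assumptions: every field name is invented for illustration, and where the mapping sits in the create-index body depends on the ElasticSearch version (older versions place it under a mapping type name). Disabling `_source` in particular has side effects, such as losing the ability to reindex from the stored document, so treat it as an option to evaluate rather than a default.

```python
import json


def lean_mapping():
    """Mapping fragment that switches off features the index does not need."""
    return {
        # Skip storing the original JSON; only safe if it is never needed back.
        "_source": {"enabled": False},
        "properties": {
            # Searchable text without length normalization in scoring.
            "title": {"type": "text", "norms": False},
            # Kept in the document but never searched, so not indexed.
            "internal_note": {"type": "text", "index": False},
            # Smallest numeric type that fits the expected range of values.
            "status_code": {"type": "short"},
            # Identifier-like values used for exact match, not range queries.
            "zip_code": {"type": "keyword"},
        },
    }


if __name__ == "__main__":
    print(json.dumps(lean_mapping(), indent=2))
```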
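Slide 16 also recommends bulk requests with a tuned size. A minimal sketch of size-based batching follows: documents are serialized into the newline-delimited format the `_bulk` endpoint expects (an action line followed by a source line) and grouped into bodies that stay under a byte budget. The 5 MB default is an arbitrary starting point, not a figure from the talk, and the index name is an assumption. Older ElasticSearch versions also expect a `_type` in the action line or the URL, which is omitted here.

```python
import json
from typing import Iterable, Iterator, Tuple


def bulk_lines(index: str, doc_id: str, doc: dict) -> str:
    """One action line plus one source line, each newline-terminated."""
    action = {"index": {"_index": index, "_id": doc_id}}
    return json.dumps(action) + "\n" + json.dumps(doc) + "\n"


def bulk_bodies(index: str, docs: Iterable[Tuple[str, dict]],
                max_bytes: int = 5 * 1024 * 1024) -> Iterator[str]:
    """Group (id, document) pairs into _bulk bodies of at most max_bytes.

    A single document larger than max_bytes still gets its own body.
    """
    buffer, size = [], 0
    for doc_id, doc in docs:
        chunk = bulk_lines(index, doc_id, doc)
        chunk_size = len(chunk.encode("utf-8"))
        if buffer and size + chunk_size > max_bytes:
            yield "".join(buffer)
            buffer, size = [], 0
        buffer.append(chunk)
        size += chunk_size
    if buffer:
        yield "".join(buffer)


if __name__ == "__main__":
    # Hypothetical documents; each body would be POSTed to /_bulk
    # with the Content-Type header set to application/x-ndjson.
    docs = [(str(i), {"n": i}) for i in range(5)]
    for body in bulk_bodies("jobs", docs, max_bytes=100):
        print(repr(body))
```

Batching by bytes rather than by document count keeps request sizes predictable when documents vary widely in size; the right budget is best found by measuring indexing throughput on the actual cluster.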
