Index Activiti Data on Elasticsearch

Activiti User Day 2015 presentation by Mike Dias (@mike_dias) and Silvio Neto (@silvioneto) about integrating Activiti with Elasticsearch.

Published in: Data & Analytics

  1. Index Activiti data on Elasticsearch. Activiti User Day Paris 2015
  2. Silvio dos Passos Neto, CTO at iColabora, @silvioneto
  3. "Don't bridge the business-IT divide. Obliterate it!" (2003) Smith & Fingar
  4. ?
  5. ?
  6. @mike_dias
  7. The big table problem
  8. ACT_HI_VARINST columns: ID_, NAME_, VALUE_*, …
  9. One user form, one process instance, four rows in ACT_HI_VARINST:
       ID_  NAME_        VALUE_*            …
       1    client_name  John               …
       2    client_tel   123456             …
       3    due_date     01/06/2015         …
       4    demand_desc  I have a problem…  …
  10. Two user forms, two process instances, eight rows in ACT_HI_VARINST:
       ID_  NAME_        VALUE_*            …
       1    client_name  John               …
       2    client_tel   123456             …
       3    due_date     01/06/2015         …
       4    demand_desc  I have a problem…  …
       5    client_name  Bob                …
       6    client_tel   654321             …
       7    due_date     10/06/2015         …
       8    demand_desc  My internet conn…  …
  11. 85 fields × ~1,000 processes per day = ~85,000 variables per day
  12. ~15 million variables in 9 months
  13. The Tool
  14. Built on top of
  15. Analytics
  16. Distributed
  17. Indexing the data
  18. Historic Data
  19. Process Lake (diagram: a pool of process instances, each marked P)
  20. Process Lake (diagram: the process instances, P, split across CPU 1, CPU 2, CPU 3 and CPU 4)
  21. (diagram: on each CPU, a process P is combined with its variables V and tasks T into one JSON document { } and passed to the REST API)
  22. Real-Time Data
  23. Engine Events (diagram: a stream of engine events, each marked E)
  24. Engine Events (diagram: Listeners pick up the engine events, E, turn them into JSON { } and pass them to the REST API)
  25. Playing with the data
  26. Search
  27. { "query":{ "nested":{ "path":"variables", "query":{ "match":{ "text":"João Silva" } } } } }
  28. Search results
  29. Compare
  30. SELECT * FROM ACT_HI_VARINST WHERE NAME_ = 'passport' AND TEXT_ = '1234'
  31. { "filter":{ "nested":{ "path":"variables", "filter":{ "bool":{ "must":[ { "term": { "name":"passport" }}, { "term": { "text":"1234" }} ] } } } } }
  32. Response Time (bar chart, axis from 0 to 180 secs; bars: MySQL, Elasticsearch; data labels: 0.08 secs, 161 secs)
  33. Response Time (the same chart, stamped "CENSORED")
  34. Lessons learned
  35. Full text search is a helpful feature
  36. Reduce MySQL workload
  37. ES is great for analytics
  38. Next steps
  39. Apache Spark: Lightning-Fast Cluster Computing
  40. Java EE dependency
  41. Open source
  42. Thank you! @mike_dias @silvioneto
  43. Questions? @mike_dias @silvioneto
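
Slides 19 to 21 describe turning each historic process instance, together with its variables, into a single JSON document before it reaches the Elasticsearch REST API. The sketch below illustrates that denormalization step in Python, using the sample rows from slides 9 and 10. It is not the presenters' code: the `PROC_INST_ID_` values, the document field names (`processInstanceId`, `variables`, `name`, `text`), the index name `activiti-history` and both helper functions are assumptions made for illustration. The output follows the newline-delimited layout of the Elasticsearch bulk API (one action line, then one document line); nothing is sent over the network.

```python
import json

# Rows shaped like the ACT_HI_VARINST sample on slides 9 and 10.
# The PROC_INST_ID_ values are invented to tie rows to two process instances.
ROWS = [
    {"ID_": 1, "PROC_INST_ID_": "p1", "NAME_": "client_name", "TEXT_": "John"},
    {"ID_": 2, "PROC_INST_ID_": "p1", "NAME_": "client_tel", "TEXT_": "123456"},
    {"ID_": 3, "PROC_INST_ID_": "p1", "NAME_": "due_date", "TEXT_": "01/06/2015"},
    {"ID_": 4, "PROC_INST_ID_": "p1", "NAME_": "demand_desc", "TEXT_": "I have a problem…"},
    {"ID_": 5, "PROC_INST_ID_": "p2", "NAME_": "client_name", "TEXT_": "Bob"},
    {"ID_": 6, "PROC_INST_ID_": "p2", "NAME_": "client_tel", "TEXT_": "654321"},
    {"ID_": 7, "PROC_INST_ID_": "p2", "NAME_": "due_date", "TEXT_": "10/06/2015"},
    {"ID_": 8, "PROC_INST_ID_": "p2", "NAME_": "demand_desc", "TEXT_": "My internet conn…"},
]


def build_documents(rows):
    """Group variable rows into one document per process instance."""
    docs = {}
    for row in rows:
        proc_id = row["PROC_INST_ID_"]
        doc = docs.setdefault(
            proc_id, {"processInstanceId": proc_id, "variables": []}
        )
        # Each variable becomes one object in the "variables" array.
        doc["variables"].append({"name": row["NAME_"], "text": row["TEXT_"]})
    return docs


def to_bulk_body(docs, index="activiti-history"):
    """Serialize documents as bulk-API lines: an action line, then the source."""
    lines = []
    for doc_id, doc in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_bulk_body(build_documents(ROWS)), end="")
```

Grouping by process instance is what lets the eight table rows collapse into two documents, so a search hits one document per process rather than one row per variable.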

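Slides 30 and 31 compare a SQL lookup on ACT_HI_VARINST with the matching Elasticsearch filter. As a rough illustration, the Python sketch below builds both from the same two inputs. The function names are hypothetical, the `%s` placeholder style is an assumption that depends on the database driver, and the filter body simply mirrors the shape shown on slide 31, which reflects the Elasticsearch syntax of the time and may need adapting for later releases.

```python
import json


def sql_lookup(name, value):
    """Parameterized form of the SQL statement on slide 30."""
    sql = "SELECT * FROM ACT_HI_VARINST WHERE NAME_ = %s AND TEXT_ = %s"
    return sql, (name, value)


def nested_term_filter(name, value, path="variables"):
    """Build the nested filter from slide 31 for one variable name and value."""
    return {
        "filter": {
            "nested": {
                "path": path,
                "filter": {
                    "bool": {
                        "must": [
                            {"term": {"name": name}},
                            {"term": {"text": value}},
                        ]
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(sql_lookup("passport", "1234"))
    print(json.dumps(nested_term_filter("passport", "1234"), indent=2))
```

The nested wrapper matters because both term clauses must match inside the same variable object; without it, a process with a variable named `passport` and a different variable whose text is `1234` could match as well.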