Lessons learned building a big data analytics engine, from proprietary to open source

Lessons learned building a big data analytics engine, from proprietary to open source by Álvaro Santamaría & Joel Brunger

After four years spent building a proprietary all-in-one streaming analytics engine for financial services, it became clear that open source was starting to pull ahead. Álvaro will talk about the challenges of creating an IT operations solution for financial services: what to build, what not to build, and how to use open source tools to get past the infrastructure and focus on the business problems that matter.



Published in: Technology
Lessons learned building a big data analytics engine, from proprietary to open source

  1. 1. Álvaro Santamaría Data Scientist – ITRS @dofideas Joel Brunger System Engineer - MapR @joelbrunger Lessons Learned with Visualisation and Machine Learning for Big Data
  2. 2. Make real-time decisions based on historical wisdom
  3. 3. Jay Kreps, 2013: “The Log: What every software engineer should know about real-time data's unifying abstraction”. The uber-system; Lego-like, OS-based
  4. 4. 1. Big-data viz is not a matter of “front-end”
  5. 5. 2. 𝜅arao𝜅e visualisations
  6. 6. 3. Information extraction
  7. 7. 3. Information extraction
  8. 8. 3. Information extraction
  9. 9. 3. Information extraction
  10. 10. 3. Information extraction GROUP BY country, timestamp window of 10 minutes SELECT count(), average(temperature), median(temperature), max(temperature), ... tdigest(temperature)
  11. 11. 3. Information extraction
  12. 12. 4. Hold state…
  13. 13. 4. Hold state…
  14. 14. 5. … and deliver (state).
  15. 15. 6. Build your pipeline
  16. 16. J on the Beach All Data, One Platform, Every Cloud Limitless Possibilities
  17. 17. What is MapR? MapR is the industry’s leading data platform for AI and Analytics.
  18. 18. Rendezvous Architecture
  19. 19. Rendezvous Architecture
  20. 20. Rendezvous Architecture
  21. 21. Rendezvous Architecture
  22. 22. Rendezvous Architecture: the Decoy is designed just to collect data inputs
  23. 23. Rendezvous Architecture
  24. 24. Rendezvous Architecture Introducing the Canary
  25. 25. Rendezvous Architecture
  26. 26. Why did ITRS choose MapR for ‘Gateway Hub’? Simplicity (integrated platform); real-time; processing must be performed in the cluster; enterprise features
  27. 27. MapR enables ITRS ‘Gateway Hub’ to provide the following benefits: smarter monitoring; additional features, applications and services; Global Data Fabric; support for ML in real time
  28. 28. Thank you JOnTheBeach 2018
  29. 29. Álvaro Santamaría Data Scientist – ITRS @dofideas Joel Brunger System Engineer - MapR @joelbrunger Lessons Learned with Visualisation and Machine Learning for Big Data
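
The query fragment on slide 10 (group by country and a 10-minute timestamp window, then compute count, average, median and max of temperature) can be sketched in plain Python as below. This is only an illustration under stated assumptions, not the engine described in the talk: the event layout, the function names `window_start` and `aggregate`, and the sample readings are all hypothetical. The median here is exact; the `tdigest(temperature)` on the slide refers to an approximate quantile sketch, which a streaming system would likely use instead so that it does not have to keep every value in memory.

```python
from collections import defaultdict
from statistics import mean, median

WINDOW_SECONDS = 10 * 60  # 10-minute tumbling windows, as on slide 10


def window_start(timestamp: int) -> int:
    """Round a Unix timestamp down to the start of its 10-minute window."""
    return timestamp - (timestamp % WINDOW_SECONDS)


def aggregate(events):
    """Group (country, timestamp, temperature) events by country and window,
    then summarise the temperatures in each group."""
    groups = defaultdict(list)
    for country, timestamp, temperature in events:
        groups[(country, window_start(timestamp))].append(temperature)

    return {
        key: {
            "count": len(values),
            "average": mean(values),
            "median": median(values),  # exact; a t-digest would approximate this
            "max": max(values),
        }
        for key, values in groups.items()
    }


# Hypothetical sample readings: (country, unix_timestamp, temperature)
events = [
    ("ES", 0, 20.0),
    ("ES", 300, 22.0),
    ("ES", 599, 27.0),
    ("ES", 600, 30.0),  # falls into the next 10-minute window
    ("UK", 10, 12.0),
]

summary = aggregate(events)
print(summary[("ES", 0)])
```

Each key of `summary` is a `(country, window_start)` pair, so the reading at second 600 lands in a separate group from the three earlier "ES" readings.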
