Related Resources
----------------------
Video (starts at 25 minutes): https://skillsmatter.com/skillscasts/7038-lightning-talks-2
From Scala Exchange 2014: https://skillsmatter.com/conferences/1948-scala-exchange-2014
Reactive Machine Learning: http://www.reactivemachinelearning.com/
Data Engineering blogging: https://medium.com/data-engineering
Talk Summary
-----------------
Before you can build large-scale data analytics systems, you need one crucial element: data. Collecting data, especially lots of it, is harder than it seems. Data ingested with the wrong data model can be worse than no data at all. A data collection system that is too slow can bring an entire platform grinding to a halt.
Don't panic! Scalable, non-destructive data collection is possible. This talk will focus on strategies for data collection based on real-world experience building large-scale machine learning systems. It will introduce ideas from the emerging paradigm of reactive machine learning that are based on older ideas about immutable facts and pervasive, intrinsic uncertainty.