1. Putting the science back in Data Science
Daniel Whitenack, @dwhitena
Data Scientist and Advocate, @pachydermIO
Chicago Data Science Conference 2017
2. Outline
1. Why do we care about Reproducibility?
2. How can we achieve Reproducibility?
3. Demo with R and Pachyderm
4. Resources
@dwhitena, @datascienceassn, #DSAChicago2017
3. Why do we care about Reproducibility?
4. How can we achieve Reproducibility? (at scale)
12. … enter Pachyderm
An open-source framework for distributed processing and data versioning, built on containers.
20. Conclusion/Resources
● Run the code/pipeline
● Join the Pachyderm Slack channel
● Check out the Pachyderm docs
● Slack/tweet me @dwhitena
● Read a related article