One of the earliest challenges new data science practitioners face is scaling their work from something that runs on a laptop to larger jobs. Tools like Spark and Hadoop can have a steep learning curve and often require explicit management of a compute cluster. Here we talk about pyWren, a Python library that lets users run their workloads on hundreds of cloud machines with no distributed computing knowledge, for a few dollars at a time. We will walk the audience from writing simple data analysis functions on their laptop to running on 1000 cores on Amazon Web Services, in 30 minutes.
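To make the "laptop function to many machines" idea concrete, here is a minimal local sketch of the map-style pattern the abstract describes. It does not call pyWren or AWS: a standard-library thread pool stands in for the cloud workers, and the names `summarize` and `run_everywhere` are hypothetical, chosen only for illustration. The point is that the analysis function stays an ordinary Python function and only the executor behind `map` changes.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize(chunk):
    """Ordinary single-machine analysis function: mean of one chunk."""
    return sum(chunk) / len(chunk)


def run_everywhere(func, inputs, max_workers=4):
    """Apply func to every input in parallel and return results in order.

    Assumption: a local thread pool stands in for remote cloud workers.
    A serverless library would ship func and each input to a separate
    machine instead, but the calling code would look much the same.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, inputs))


# Three small chunks of data, each processed independently.
chunks = [[1, 2, 3], [10, 20, 30], [5, 5]]
means = run_everywhere(summarize, chunks)
print(means)
```

With pyWren itself, the README-style usage is, as far as we understand it, to obtain an executor with `pywren.default_executor()`, call its `map(func, data)` method, and collect the outputs with `pywren.get_all_results(futures)`. Treat those names as something to check against the project's documentation, not as a guarantee of the current API.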
Presented at AnacondaCON 2017 by Eric Jonas, UC Berkeley.