Part 1 of 3 of series focusing on the infrastructure aspect of getting started with Big Data, specifically Hadoop. This presentation starts small, installing a pre-packaged virtual machine from Hadoop vendor Cloudera on your local machine.
We then install R, copy some sample data into HDFS and test everything by running Jonathan Seidman's a sample streaming job.
Presented at the Boston Predictive Analytics Big Data Workshop, March 10, 2012
Clipping is a handy way to collect important slides you want to go back to later.