Intro to Hadoop
by Eric Wendelin
We give an Overview of Hadoop, HDFS, and MapReduce. We then move on to present scenarios for Hadoop usage with Java code, and touch on some of the more useful features of and projects under the Hadoop umbrella.

We are looking to scale up to 50-100 nodes in the next year.
Hadoop can support other filesystems, like Amazon S3, but we are going to focus on the filesystem that is part of Hadoop Common.
A block is the minimum amount of data that can be read or written.
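To make the block idea concrete, here is a small sketch of the arithmetic for splitting a file into blocks. The 64 MB block size and the 200 MB file are assumptions picked for illustration (the real block size is a cluster setting), and the class and method names are hypothetical, not part of any Hadoop API.

```java
public class BlockMath {
    // Assumed block size for illustration only: 64 MB. The real value is configurable.
    static final long BLOCK_SIZE_BYTES = 64L * 1024 * 1024;

    /** Number of blocks needed to hold a file of the given length (ceiling division). */
    public static long blockCount(long fileLengthBytes, long blockSizeBytes) {
        if (fileLengthBytes < 0 || blockSizeBytes <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and block size > 0");
        }
        return (fileLengthBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    /** Bytes of file data that land in the final block; 0 for an empty file. */
    public static long lastBlockBytes(long fileLengthBytes, long blockSizeBytes) {
        if (fileLengthBytes < 0 || blockSizeBytes <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and block size > 0");
        }
        if (fileLengthBytes == 0) {
            return 0;
        }
        long remainder = fileLengthBytes % blockSizeBytes;
        // An exact multiple fills the final block completely.
        return remainder == 0 ? blockSizeBytes : remainder;
    }

    public static void main(String[] args) {
        long fileLength = 200L * 1024 * 1024; // a hypothetical 200 MB file
        System.out.println("blocks: " + blockCount(fileLength, BLOCK_SIZE_BYTES));
        System.out.println("bytes in last block: " + lastBlockBytes(fileLength, BLOCK_SIZE_BYTES));
    }
}
```

Under these assumptions the 200 MB file spans four blocks, with 8 MB of data in the last one.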
All writes to HDFS are made by a single writer and go to the end of the file.
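A toy model can help show what "single writer, append only" means in practice. This is a plain in-memory sketch, not HDFS code: the class, its methods, and the writer IDs are all hypothetical, and real HDFS enforces this on the cluster rather than in a local object.

```java
import java.io.ByteArrayOutputStream;

public class AppendOnlyFile {
    private final ByteArrayOutputStream data = new ByteArrayOutputStream();
    private String currentWriter; // null when nobody holds the file open for writing

    /** Grants write access only if no other writer currently holds the file. */
    public synchronized boolean openForWrite(String writerId) {
        if (currentWriter != null && !currentWriter.equals(writerId)) {
            return false; // someone else is already writing
        }
        currentWriter = writerId;
        return true;
    }

    /** Adds bytes at the end of the file; there is no way to write at an offset. */
    public synchronized void append(String writerId, byte[] bytes) {
        if (!writerId.equals(currentWriter)) {
            throw new IllegalStateException(writerId + " does not hold the file open for writing");
        }
        data.write(bytes, 0, bytes.length);
    }

    /** Releases write access so another writer can open the file. */
    public synchronized void close(String writerId) {
        if (writerId.equals(currentWriter)) {
            currentWriter = null;
        }
    }

    public synchronized long length() {
        return data.size();
    }

    public static void main(String[] args) {
        AppendOnlyFile file = new AppendOnlyFile();
        System.out.println("writer-a opens: " + file.openForWrite("writer-a"));
        System.out.println("writer-b opens: " + file.openForWrite("writer-b"));
        file.append("writer-a", new byte[] {1, 2, 3});
        file.close("writer-a");
        System.out.println("writer-b opens after close: " + file.openForWrite("writer-b"));
        System.out.println("length: " + file.length());
    }
}
```

The second writer is turned away until the first one closes the file, and the only write operation on offer is an append.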
Let’s do this live. And create a directory for all of Fred’s baby pictures. That should give us a nice multi-terabyte dataset, eh?
You can download a VMware image, among other options, to get you started.
Pig has an interactive shell called Grunt.
Use this when you need real-time random read/write capabilities to your Big Data. We currently use HBase to serve up data to web pages much faster than a traditional RDBMS could with the amount of data we have.