Playing with Hadoop 2013-10-31

    Presentation Transcript

    • Visma Consulting, 2013-10-30: Playing with Hadoop, Søren Lund (slu), slu@369.dk
    • Needed to run Hadoop: you need a Java JDK, a Linux server, and a Hadoop tarball. I'm using JDK 1.6.24 (64 bit), Ubuntu 12.04 LTS (64 bit), and Hadoop 1.0.4. I could not get JDK 7 + Hadoop 2.2 to work.
    • Installing Hadoop (the installation commands are sketched after the transcript)
    • Install Java
    • Setup Java home and path
    • Add hadoop user
    • Install Hadoop and add to path
    • Create SSH key for hadoop user
    • Accept SSH key
    • Disable IPv6
    • Reboot and check installation
    • Running an example job (a sample command follows the transcript)
    • Calculate Pi
    • Estimated value of Pi
    • Three modes of operation: Pi was calculated in local standalone mode, the default mode (i.e. no configuration needed), where all components of Hadoop run in a single JVM. In pseudo-distributed mode, components communicate using sockets and a separate JVM is spawned for each component; it is a mini-cluster on a single host. In fully distributed mode, components are spread across multiple machines.
    • Configuring for pseudo-distributed mode (example configuration is sketched after the transcript)
    • Create base directory for HDFS
    • Set JAVA_HOME
    • Edit core-site.xml
    • Edit hdfs-site.xml
    • Edit mapred-site.xml
    • Log out and log on as hadoop
    • Format HDFS
    • Start HDFS
    • Start Map Reduce
    • Create home directory & test data
    • Running Word Count
    • First let's try the example jar (see the sketch after the transcript)
    • Inspect the result
    • Compile and run our own jar https://gist.github.com/soren/7213273 (build and run commands are sketched after the transcript)
    • Inspect result
    • Run improved version https://gist.github.com/soren/7213453
    • Inspect (improved) result
    • The Web User Interface: HDFS at http://localhost:8070/, MapReduce at http://localhost:8030/, and the file browser at http://localhost:8075/browseDirectory.jsp?namenodeInfoPort. Note: these URLs use port forwarding in VirtualBox (50030 → 8030, 50070 → 8070, 50075 → 8075).
    • Now you can go play with Hadoop...
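
The installation slides above list only the step titles. Below is a minimal command sketch for those steps on Ubuntu 12.04 with Hadoop 1.0.4; the package name, JAVA_HOME path, tarball location, and install directory are assumptions, not values taken from the slides.

    # Install Java (the slides use a 64-bit JDK 6; openjdk-6-jdk is an assumed substitute)
    sudo apt-get install openjdk-6-jdk

    # Set JAVA_HOME and PATH (the path below is the usual Ubuntu 12.04 location, assumed)
    echo 'export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64' >> ~/.bashrc
    echo 'export PATH=$JAVA_HOME/bin:$PATH' >> ~/.bashrc

    # Add a dedicated hadoop user
    sudo adduser hadoop

    # Install Hadoop from the tarball and add it to the PATH
    sudo tar -xzf /tmp/hadoop-1.0.4.tar.gz -C /usr/local
    sudo chown -R hadoop:hadoop /usr/local/hadoop-1.0.4
    sudo ln -s /usr/local/hadoop-1.0.4 /usr/local/hadoop
    echo 'export PATH=/usr/local/hadoop/bin:$PATH' >> ~/.bashrc

    # Create an SSH key for the hadoop user and accept it (Hadoop's scripts ssh to localhost)
    sudo -u hadoop ssh-keygen -t rsa -P '' -f /home/hadoop/.ssh/id_rsa
    sudo -u hadoop sh -c 'cat /home/hadoop/.ssh/id_rsa.pub >> /home/hadoop/.ssh/authorized_keys'
    sudo -u hadoop ssh localhost exit   # answer "yes" to accept the host key

    # Disable IPv6 (Hadoop 1.x misbehaves when IPv6 is enabled)
    echo 'net.ipv6.conf.all.disable_ipv6 = 1' | sudo tee -a /etc/sysctl.conf

    # Reboot, then check the installation
    sudo reboot
    hadoop version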
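
The Pi example runs straight from the bundled examples jar. The jar name below matches the Hadoop 1.0.4 tarball; the map and sample counts are illustrative.

    # Estimate Pi with 10 map tasks and 1000 samples per map
    hadoop jar /usr/local/hadoop/hadoop-examples-1.0.4.jar pi 10 1000
    # The job ends with a line such as "Estimated value of Pi is 3.14..."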
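
For the pseudo-distributed configuration slides, here is a sketch of the edits, assuming the HDFS base directory is /var/hadoop, the config files live in /usr/local/hadoop/conf, and the usual localhost ports 9000/9001 are used; the actual values shown on the slides are not in this transcript.

    # Create a base directory for HDFS
    sudo mkdir -p /var/hadoop
    sudo chown hadoop:hadoop /var/hadoop

    # Point Hadoop at the JDK
    echo 'export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64' >> /usr/local/hadoop/conf/hadoop-env.sh

    # conf/core-site.xml -- default filesystem and temp dir, e.g.:
    #   <configuration>
    #     <property><name>fs.default.name</name><value>hdfs://localhost:9000</value></property>
    #     <property><name>hadoop.tmp.dir</name><value>/var/hadoop</value></property>
    #   </configuration>

    # conf/hdfs-site.xml -- single node, so one replica:
    #   <configuration>
    #     <property><name>dfs.replication</name><value>1</value></property>
    #   </configuration>

    # conf/mapred-site.xml -- where the JobTracker runs:
    #   <configuration>
    #     <property><name>mapred.job.tracker</name><value>localhost:9001</value></property>
    #   </configuration>

    # Log out, log on as the hadoop user, then format HDFS and start the daemons
    hadoop namenode -format
    start-dfs.sh
    start-mapred.sh

    # Create a home directory on HDFS and upload some test data (file name is an example)
    hadoop fs -mkdir /user/hadoop
    hadoop fs -put /etc/passwd input.txt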
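
Word Count can first be run from the examples jar, continuing from the test data uploaded above; the input and output names are assumptions.

    # Run the bundled WordCount on the test data (the output directory must not exist yet)
    hadoop jar /usr/local/hadoop/hadoop-examples-1.0.4.jar wordcount input.txt example-out

    # Inspect the result
    hadoop fs -ls example-out
    hadoop fs -cat example-out/part-r-00000 | head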
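
Finally, a sketch of compiling and running the WordCount from the gists against the Hadoop 1.0.4 jars. The class name, source file name, and output paths are assumptions; adjust them to match the gist (and expect part-00000 instead of part-r-00000 if the code uses the old mapred API).

    # Compile the gist source against the Hadoop core jar and package it
    mkdir -p classes
    javac -classpath /usr/local/hadoop/hadoop-core-1.0.4.jar -d classes WordCount.java
    jar cf wordcount.jar -C classes .

    # Run our own jar and inspect the result; the improved version from the
    # second gist is built and run the same way
    hadoop jar wordcount.jar WordCount input.txt my-out
    hadoop fs -cat my-out/part-r-00000 | head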