Playing with Hadoop 2013-10-31
Usage Rights

© All Rights Reserved

Presentation Transcript

  • Visma Consulting, 2013-10-30: Playing with Hadoop, by Søren Lund (slu), slu@369.dk
  • Needed to run Hadoop
    You need the following to run Hadoop: a Java JDK, a Linux server, and the Hadoop tarball.
    I'm using the following: JDK 1.6.24 64 bit, Ubuntu 12.04 LTS 64 bit, and Hadoop 1.0.4.
    Could not get JDK7 + Hadoop 2.2 to work.
  • Installing Hadoop
  • Install Java
  • Setup Java home and path
  • Add hadoop user
  • Install Hadoop and add to path
  • Create SSH key for hadoop user
  • Accept SSH key
  • Disable IPv6
  • Reboot and check installation
  • Running an example job
  • Calculate Pi
  • Estimated value of Pi
  • Three modes of operation
    Pi was calculated in local standalone mode: it is the default mode (i.e. no configuration needed); all components of Hadoop run in a single JVM.
    Pseudo-distributed mode: components communicate using sockets; a separate JVM is spawned for each component; it is a minicluster on a single host.
    Fully distributed mode: components are spread across multiple machines.
  • Configuring for pseudo-distributed mode
  • Create base directory for HDFS
  • Set JAVA_HOME
  • Edit core-site.xml
  • Edit hdfs-site.xml
  • Edit mapred-site.xml
  • Log out and log on as hadoop
  • Format HDFS
  • Start HDFS
  • Start MapReduce
  • Create home directory & test data
  • Running Word Count
  • First let's try the example jar
  • Inspect the result
  • Compile and run our own jar https://gist.github.com/soren/7213273
  • Inspect result
  • Run improved version https://gist.github.com/soren/7213453
  • Inspect (improved) result
  • The Web User Interface
    HDFS: http://localhost:8070/
    MapReduce: http://localhost:8030/
    File Browser: http://localhost:8075/browseDirectory.jsp?namenodeInfoPort
    Note: this is with port forwarding in VirtualBox (50030 → 8030, 50070 → 8070, 50075 → 8075).
  • Now you can go play with Hadoop...
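
As a starting point for playing, the idea behind the Word Count job from the slides above can be sketched in plain Java. This is a minimal illustration only: it is not the code from the linked gists and it does not use the Hadoop API. The class name `WordCountSketch` and the methods `map`, `reduce`, and `run` are hypothetical names chosen for this example, and splitting on whitespace is an assumption about how words are delimited.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java illustration of the word count idea; no Hadoop classes are used.
public class WordCountSketch {

    // "Map" step: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<Map.Entry<String, Integer>>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(new SimpleEntry<String, Integer>(tokens.nextToken(), 1));
        }
        return pairs;
    }

    // "Shuffle" and "reduce" steps: group the pairs by word and sum the counts.
    // A TreeMap keeps the words sorted by key.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Run both steps over a list of input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<Map.Entry<String, Integer>>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        return reduce(pairs);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or not to be", "to play with Hadoop");
        // Print one "word<TAB>count" line per distinct word.
        for (Map.Entry<String, Integer> entry : run(lines).entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

In a real Hadoop job the map and reduce steps run as separate tasks, possibly on different machines, and the framework does the grouping by key between them. Here everything runs in one JVM, much like the local standalone mode described earlier.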