Playing with Hadoop 2013-10-31

Transcript

  • 1. Visma Consulting, 2013-10-30. Playing with Hadoop. Søren Lund (slu), slu@369.dk
  • 2. Needed to run Hadoop
    You need the following to run Hadoop:
    - Java JDK
    - Linux server
    - Hadoop tarball
    I'm using the following:
    - JDK 1.6.24, 64 bit
    - Ubuntu 12.04 LTS, 64 bit
    - Hadoop 1.0.4
    Note: could not get JDK 7 + Hadoop 2.2 to work.
  • 3. Installing Hadoop
  • 4. Install Java
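    Only the slide title is transcribed; a minimal sketch of this step on Ubuntu 12.04, using the packaged OpenJDK 6 as a stand-in for the author's JDK 1.6.24 (the package choice is an assumption):
      # Install a Java 6 JDK from the Ubuntu repositories (OpenJDK is an assumption)
      sudo apt-get update
      sudo apt-get install -y openjdk-6-jdk
      java -version    # verify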
  • 5. Setup Java home and path
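    A sketch of the environment setup, assuming the OpenJDK 6 path from the previous step:
      # Append to ~/.bashrc (the JDK path is an assumption)
      export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64
      export PATH=$JAVA_HOME/bin:$PATH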
  • 6. Add hadoop user
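    A sketch of creating the dedicated user (the group name is an assumption):
      sudo addgroup hadoop
      sudo adduser --ingroup hadoop hadoop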
  • 7. Install Hadoop and add to path
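    A sketch of unpacking the Hadoop 1.0.4 tarball; the install location under /usr/local is an assumption:
      sudo tar -xzf hadoop-1.0.4.tar.gz -C /usr/local
      sudo chown -R hadoop:hadoop /usr/local/hadoop-1.0.4
      # Append to ~/.bashrc for the hadoop user
      export HADOOP_HOME=/usr/local/hadoop-1.0.4
      export PATH=$HADOOP_HOME/bin:$PATH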
  • 8. Create SSH key for hadoop user
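    Hadoop's start scripts log in to localhost over SSH, so the hadoop user needs a passphrase-less key; a typical sketch:
      # As the hadoop user
      ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
      cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys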
  • 9. Accept SSH key
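    Presumably this means connecting once so the host key ends up in known_hosts:
      ssh localhost    # answer "yes" to trust the host key
      exit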
  • 10. Disable IPv6
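    The exact settings used are not transcribed; a common way to disable IPv6 on Ubuntu 12.04 is via sysctl:
      # Append to /etc/sysctl.conf, then reboot (or run: sudo sysctl -p)
      net.ipv6.conf.all.disable_ipv6 = 1
      net.ipv6.conf.default.disable_ipv6 = 1
      net.ipv6.conf.lo.disable_ipv6 = 1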
  • 11. Reboot and check installation
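    A sketch of the checks after rebooting:
      sudo reboot
      # after logging back in:
      java -version
      hadoop version                                  # should report 1.0.4
      cat /proc/sys/net/ipv6/conf/all/disable_ipv6    # should print 1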
  • 12. Running an example job
  • 13. Calculate Pi
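    The Pi estimator ships with the bundled examples jar; a typical invocation (the map and sample counts are illustrative, not taken from the slide):
      # 10 maps, 1,000,000 samples per map
      hadoop jar $HADOOP_HOME/hadoop-examples-1.0.4.jar pi 10 1000000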
  • 14. Estimated value of Pi
  • 15. Three modes of operation
    Pi was calculated in local standalone mode:
    - it is the default mode (i.e. no configuration needed)
    - all components of Hadoop run in a single JVM
    Pseudo-distributed mode:
    - components communicate using sockets
    - a separate JVM is spawned for each component
    - it is a mini-cluster on a single host
    Fully distributed mode:
    - components are spread across multiple machines
  • 16. Configuring for pseudo distributed mode
  • 17. Create base directory for HDFS
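    A sketch; the directory name is an assumption and must match the paths used in the configuration files below:
      sudo mkdir -p /var/hadoop
      sudo chown hadoop:hadoop /var/hadoop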
  • 18. Set JAVA_HOME
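    In Hadoop 1.x this goes in conf/hadoop-env.sh; the JDK path is the same assumption as earlier:
      # $HADOOP_HOME/conf/hadoop-env.sh
      export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64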
  • 19. Edit core-site.xml
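    A minimal pseudo-distributed core-site.xml for Hadoop 1.x; the port and the temp directory are assumptions (the temp directory reuses the base directory created above):
      <!-- $HADOOP_HOME/conf/core-site.xml -->
      <configuration>
        <property>
          <name>fs.default.name</name>
          <value>hdfs://localhost:9000</value>
        </property>
        <property>
          <name>hadoop.tmp.dir</name>
          <value>/var/hadoop</value>
        </property>
      </configuration>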
  • 20. Edit hdfs-site.xml
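    For a single-node setup the replication factor is lowered to 1; a sketch, not the slide's exact file:
      <!-- $HADOOP_HOME/conf/hdfs-site.xml -->
      <configuration>
        <property>
          <name>dfs.replication</name>
          <value>1</value>
        </property>
      </configuration>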
  • 21. Edit mapred-site.xml
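    A sketch pointing the JobTracker at localhost; the port is an assumption:
      <!-- $HADOOP_HOME/conf/mapred-site.xml -->
      <configuration>
        <property>
          <name>mapred.job.tracker</name>
          <value>localhost:9001</value>
        </property>
      </configuration>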
  • 22. Log out and log on as hadoop
  • 23. Format HDFS
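    In Hadoop 1.x the namenode is formatted once, as the hadoop user, before the first start:
      hadoop namenode -format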
  • 24. Start HDFS
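    HDFS is started with the bundled script; jps is a quick way to check that the daemons came up:
      start-dfs.sh
      jps    # should list NameNode, DataNode and SecondaryNameNode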
  • 25. Start Map Reduce
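    Same pattern for the MapReduce daemons:
      start-mapred.sh
      jps    # should now also list JobTracker and TaskTracker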
  • 26. Create home directory & test data
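    A sketch; the actual test data on the slide is not transcribed, so the file below is an illustrative stand-in:
      hadoop fs -mkdir /user/hadoop                   # HDFS home directory
      hadoop fs -mkdir input
      echo "hello hadoop hello world" > test.txt      # stand-in test data
      hadoop fs -put test.txt input/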
  • 27. Running Word Count
  • 28. First let's try the example jar
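    The bundled examples jar includes a word count; a typical invocation (the HDFS paths are assumptions):
      hadoop jar $HADOOP_HOME/hadoop-examples-1.0.4.jar wordcount input output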
  • 29. Inspect the result
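    A sketch of inspecting the job output on HDFS:
      hadoop fs -ls output
      hadoop fs -cat output/part-r-00000    # reducer output (file name may vary)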
  • 30. Compile and run our own jar https://gist.github.com/soren/7213273
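    The gist itself is not reproduced here; assuming it contains a single WordCount class, compiling and running it against Hadoop 1.0.4 could look like this (class, jar and path names are assumptions):
      mkdir classes
      javac -classpath $HADOOP_HOME/hadoop-core-1.0.4.jar -d classes WordCount.java
      jar cf wordcount.jar -C classes .
      hadoop jar wordcount.jar WordCount input output2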
  • 31. Inspect result
  • 32. Run improved version https://gist.github.com/soren/7213453
  • 33. Inspect (improved) result
  • 34. The Web User Interface
    - HDFS: http://localhost:8070/
    - MapReduce: http://localhost:8030/
    - File Browser: http://localhost:8075/browseDirectory.jsp?namenodeInfoPort
    Note: this is with port forwarding in VirtualBox (50030 → 8030, 50070 → 8070, 50075 → 8075).
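    The port forwarding mentioned above could be configured on the host along these lines (the VM name is a placeholder, and the VM must be powered off when modifyvm is used):
      VBoxManage modifyvm "hadoop-vm" --natpf1 "jobtracker,tcp,,8030,,50030"
      VBoxManage modifyvm "hadoop-vm" --natpf1 "namenode,tcp,,8070,,50070"
      VBoxManage modifyvm "hadoop-vm" --natpf1 "datanode,tcp,,8075,,50075"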
  • 35. Now you can go play with Hadoop...