Playing with Hadoop 2013-10-31

Published in: Education, Technology

Transcript

  • 1. Visma Consulting 2013-10-30 Playing with Hadoop Søren Lund (slu) slu@369.dk
  • 2. Needed to run Hadoop. You need the following to run Hadoop: Java JDK, Linux server, Hadoop tarball. I'm using the following: JDK 1.6.24 64 bit, Ubuntu 12.04 LTS 64 bit, Hadoop 1.0.4. Could not get JDK7 + Hadoop 2.2 to work.
  • 3. Installing Hadoop
  • 4. Install Java
  • 5. Setup Java home and path
  • 6. Add hadoop user
  • 7. Install Hadoop and add to path
  • 8. Create SSH key for hadoop user
  • 9. Accept SSH key
  • 10. Disable IPv6
  • 11. Reboot and check installation
  • 12. Running an example job
  • 13. Calculate Pi
  • 14. Estimated value of Pi
  • 15. Three modes of operation. Pi was calculated in local standalone mode: it is the default mode (i.e. no configuration needed); all components of Hadoop run in a single JVM. Pseudo-distributed mode: components communicate using sockets; a separate JVM is spawned for each component; it is a minicluster on a single host. Fully distributed mode: components are spread across multiple machines.
  • 16. Configuring for pseudo distributed mode
  • 17. Create base directory for HDFS
  • 18. Set JAVA_HOME
  • 19. Edit core-site.xml
  • 20. Edit hdfs-site.xml
  • 21. Edit mapred-site.xml
  • 22. Log out and log on as hadoop
  • 23. Format HDFS
  • 24. Start HDFS
  • 25. Start Map Reduce
  • 26. Create home directory & test data
  • 27. Running Word Count
  • 28. First let's try the example jar
  • 29. Inspect the result
  • 30. Compile and run our own jar https://gist.github.com/soren/7213273
  • 31. Inspect result
  • 32. Run improved version https://gist.github.com/soren/7213453
  • 33. Inspect (improved) result
  • 34. The Web User Interface. HDFS (http://localhost:8070/), MapReduce (http://localhost:8030/), File Browser (http://localhost:8075/browseDirectory.jsp?namenodeInfoPort). Note: this is with port forwarding in VirtualBox: 50030 → 8030, 50070 → 8070, 50075 → 8075.
  • 35. Now you can go play with Hadoop...
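
The transcript does not include the contents of the three configuration files from slides 19 to 21, so here is a minimal sketch of what they might look like for Hadoop 1.x. The property names (`fs.default.name`, `hadoop.tmp.dir`, `dfs.replication`, `mapred.job.tracker`) are standard Hadoop 1.x settings, but the ports, the base directory path and the `write_site` helper are assumptions, not taken from the slides. The script writes into a scratch directory by default, so it can be tried without touching a real installation.

```shell
#!/bin/sh
# Sketch: write minimal pseudo-distributed config files for Hadoop 1.x.
# Assumptions (not from the slides): ports 9000/9001, the base directory path,
# and writing into a scratch directory instead of the real conf/ directory.
set -eu

CONF_DIR="${CONF_DIR:-$(mktemp -d)}"        # in a real install: $HADOOP_HOME/conf
HDFS_BASE="${HDFS_BASE:-/app/hadoop/tmp}"   # hypothetical base directory for HDFS

# write_site FILE NAME VALUE [NAME VALUE ...]
# Hypothetical helper: emits one <property> element per NAME/VALUE pair.
write_site() {
    file="$1"; shift
    {
        echo '<?xml version="1.0"?>'
        echo '<configuration>'
        while [ "$#" -ge 2 ]; do
            printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
            shift 2
        done
        echo '</configuration>'
    } > "$CONF_DIR/$file"
}

write_site core-site.xml fs.default.name hdfs://localhost:9000 hadoop.tmp.dir "$HDFS_BASE"
write_site hdfs-site.xml dfs.replication 1
write_site mapred-site.xml mapred.job.tracker localhost:9001

echo "Wrote config files to $CONF_DIR"
```

A replication factor of 1 is the usual choice on a single host, since there is only one DataNode to hold each block.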

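Slides 23 to 29 showed terminal sessions that did not survive in the transcript. The sequence below is a sketch of the Hadoop 1.0.4 commands those steps typically correspond to, not a copy of what was on the slides. The install path, the input file name `words.txt` and the `run` wrapper are assumptions. By default the script only prints the commands; set `DRY_RUN=0` to execute them as the hadoop user.

```shell
#!/bin/sh
# Sketch of the command sequence for Hadoop 1.0.4 in pseudo-distributed mode.
# Prints the commands by default; set DRY_RUN=0 to actually execute them.
# Assumptions: HADOOP_HOME location and the name of the test data file.
set -eu

DRY_RUN="${DRY_RUN:-1}"
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"   # hypothetical install path
EXAMPLES_JAR="$HADOOP_HOME/hadoop-examples-1.0.4.jar"

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

hadoop_steps() {
    run hadoop namenode -format                   # format HDFS (first time only)
    run start-dfs.sh                              # start the HDFS daemons
    run start-mapred.sh                           # start the MapReduce daemons
    run hadoop fs -mkdir input                    # relative paths live under /user/<name>
    run hadoop fs -put words.txt input            # words.txt: hypothetical test data
    run hadoop jar "$EXAMPLES_JAR" wordcount input output
    run hadoop fs -cat output/part-r-00000        # inspect the result
}

hadoop_steps
```

Note that formatting the NameNode wipes the HDFS metadata, so it should only be done once on a fresh installation.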
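
When inspecting the word count result for a small input, it can help to compare it against a count done with ordinary Unix tools. This is a rough cross-check, not the code from the gists: it assumes words are split on whitespace and counted case-sensitively, which is how the stock example behaves as far as I know. The function name `local_wordcount` is made up for this sketch.

```shell
#!/bin/sh
# Local word count using standard Unix tools, for cross-checking small inputs.
# Assumption: words are split on whitespace and counted case-sensitively.
set -eu

# local_wordcount: read text on stdin, print "word<TAB>count" sorted by word
local_wordcount() {
    tr -s '[:space:]' '\n' |       # one word per line
        grep -v '^$' |             # drop empty lines from leading whitespace
        LC_ALL=C sort |            # byte-order sort
        uniq -c |                  # count adjacent duplicates
        awk '{ print $2 "\t" $1 }' # reorder to word, tab, count
}

printf 'to be or not to be\n' | local_wordcount
# prints:
# be	2
# not	1
# or	1
# to	2
```

To compare against a job's output, the same function can be fed the original input file and the result diffed against the output of `hadoop fs -cat`.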