Playing with Hadoop
Søren Lund (slu), slu@369.dk
Visma Consulting, 2013-10-30

Transcript of "Playing with Hadoop 2013-10-31"

  1. Visma Consulting, 2013-10-30. Playing with Hadoop. Søren Lund (slu), slu@369.dk
  2. Needed to run Hadoop
     You need the following to run Hadoop:
       - Java JDK
       - Linux server
       - Hadoop tarball
     I'm using the following:
       - JDK 1.6.24, 64 bit
       - Ubuntu 12.04 LTS, 64 bit
       - Hadoop 1.0.4
     Could not get JDK 7 + Hadoop 2.2 to work.
  3. Installing Hadoop
  4. Install Java
  5. Setup Java home and path
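     The exact install commands are not preserved in this transcript. A minimal
     sketch for steps 4-5, assuming Ubuntu's OpenJDK 6 package as a stand-in for
     the Sun JDK 1.6.24 used in the deck (package name and install path are
     assumptions):

       # Install a Java 6 JDK (OpenJDK 6 assumed; the deck used Sun JDK 1.6.24)
       sudo apt-get update
       sudo apt-get install -y openjdk-6-jdk

       # Point JAVA_HOME at the JDK and put its bin directory on the PATH
       # (typical OpenJDK 6 location on 64-bit Ubuntu 12.04; adjust if different)
       echo 'export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64' >> ~/.bashrc
       echo 'export PATH=$JAVA_HOME/bin:$PATH' >> ~/.bashrc
       source ~/.bashrc
       java -version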
  6. Add hadoop user
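     Step 6 typically boils down to creating a dedicated user and group; the
     commands below are the usual Ubuntu way and are assumed, not taken from the
     slide:

       # Create a dedicated group and user to run the Hadoop daemons
       sudo addgroup hadoop
       sudo adduser --ingroup hadoop hadoop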
  7. Install Hadoop and add to path
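     A sketch of step 7, assuming the Hadoop 1.0.4 tarball is unpacked under
     /usr/local (the actual location used in the deck is not shown here):

       # Unpack the tarball and hand it to the hadoop user
       sudo tar -xzf hadoop-1.0.4.tar.gz -C /usr/local
       sudo chown -R hadoop:hadoop /usr/local/hadoop-1.0.4
       sudo ln -s /usr/local/hadoop-1.0.4 /usr/local/hadoop

       # Put the Hadoop scripts on the PATH (in the hadoop user's ~/.bashrc)
       echo 'export HADOOP_HOME=/usr/local/hadoop' >> ~/.bashrc
       echo 'export PATH=$HADOOP_HOME/bin:$PATH' >> ~/.bashrc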
  8. Create SSH key for hadoop user
  9. Accept SSH key
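     Hadoop's start scripts log in to localhost over SSH, so steps 8-9 amount to
     giving the hadoop user a passwordless key and accepting the host key once
     (a standard recipe, assumed to match the slides):

       # As the hadoop user: create a passwordless key and authorize it
       ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
       cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

       # Connect once so the host key gets accepted
       ssh localhost exit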
  10. Disable IPv6
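     One common way to disable IPv6 on Ubuntu is via sysctl; the slide may have
     used a different method (e.g. java.net.preferIPv4Stack in hadoop-env.sh),
     so treat this as an assumed equivalent:

       # Disable IPv6 system-wide (effective after reboot or `sudo sysctl -p`)
       printf '%s\n' \
         'net.ipv6.conf.all.disable_ipv6 = 1' \
         'net.ipv6.conf.default.disable_ipv6 = 1' \
         'net.ipv6.conf.lo.disable_ipv6 = 1' | sudo tee -a /etc/sysctl.conf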
  11. Reboot and check installation
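     A quick sanity check after the reboot (commands are standard, not from the
     slide):

       sudo reboot
       # ...after logging back in:
       java -version
       hadoop version
       cat /proc/sys/net/ipv6/conf/all/disable_ipv6    # should print 1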
  12. Running an example job
  13. Calculate Pi
  14. Estimated value of Pi
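     Slides 13-14 run the Monte Carlo Pi estimator that ships with Hadoop. The
     usual invocation for 1.0.4 looks like this (10 maps and 100 samples per map
     are assumed example values):

       # Run the Pi estimator from the bundled examples jar
       hadoop jar $HADOOP_HOME/hadoop-examples-1.0.4.jar pi 10 100
       # The job finishes by printing a line like "Estimated value of Pi is 3.1..."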
  15. Three modes of operation
      Pi was calculated in local standalone mode:
        - it is the default mode (i.e. no configuration needed)
        - all components of Hadoop run in a single JVM
      Pseudo-distributed mode:
        - components communicate using sockets
        - a separate JVM is spawned for each component
        - it is a mini-cluster on a single host
      Fully distributed mode:
        - components are spread across multiple machines
  16. Configuring for pseudo distributed mode
  17. Create base directory for HDFS
  18. Set JAVA_HOME
  19. Edit core-site.xml
  20. Edit hdfs-site.xml
  21. Edit mapred-site.xml
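     Steps 17-21 are the standard Hadoop 1.x pseudo-distributed configuration.
     The property names below are the stock Hadoop 1.0.4 ones; the base
     directory /var/hadoop and the port numbers are typical values assumed for
     illustration, not taken from the slides:

       # Base directory for HDFS data (location is an assumption)
       sudo mkdir -p /var/hadoop
       sudo chown hadoop:hadoop /var/hadoop

       # conf/hadoop-env.sh:
       #   export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64

       # conf/core-site.xml (inside <configuration>):
       #   <property><name>hadoop.tmp.dir</name><value>/var/hadoop</value></property>
       #   <property><name>fs.default.name</name><value>hdfs://localhost:9000</value></property>

       # conf/hdfs-site.xml:
       #   <property><name>dfs.replication</name><value>1</value></property>

       # conf/mapred-site.xml:
       #   <property><name>mapred.job.tracker</name><value>localhost:9001</value></property>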
  22. Log out and log on as hadoop
  23. Format HDFS
  24. Start HDFS
  25. Start Map Reduce
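     Formatting HDFS and starting the daemons in Hadoop 1.x (run as the hadoop
     user):

       # One-time format of the HDFS namenode
       hadoop namenode -format

       # Start HDFS (NameNode, DataNode, SecondaryNameNode)
       start-dfs.sh

       # Start MapReduce (JobTracker, TaskTracker)
       start-mapred.sh

       # jps should now list all five daemons
       jps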
  26. Create home directory & test data
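     Step 26 creates an HDFS home directory and uploads some text to count; the
     input file and directory names below are assumptions for illustration:

       # Create a home directory and an input directory in HDFS
       hadoop fs -mkdir /user/hadoop/input

       # Upload a local text file as test data
       hadoop fs -put /etc/services /user/hadoop/input/services.txt
       hadoop fs -ls input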
  27. Running Word Count
  28. First let's try the example jar
  29. Inspect the result
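     Steps 28-29 with the bundled examples jar, reusing the assumed input
     directory from the previous step:

       # Run the stock WordCount example
       hadoop jar $HADOOP_HOME/hadoop-examples-1.0.4.jar wordcount input output

       # Inspect the result (one part file per reducer)
       hadoop fs -ls output
       hadoop fs -cat output/part-* | head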
  30. Compile and run our own jar: https://gist.github.com/soren/7213273
  31. Inspect result
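     The custom WordCount source is in the gist linked above. A sketch of how
     such a job is typically compiled and run against Hadoop 1.0.4; the class
     name WordCount and the output path are assumptions, check the gist for the
     actual names:

       # Compile against the Hadoop core jar and package a job jar
       mkdir -p wordcount_classes
       javac -classpath $HADOOP_HOME/hadoop-core-1.0.4.jar -d wordcount_classes WordCount.java
       jar -cvf wordcount.jar -C wordcount_classes/ .

       # Run it and inspect the output
       hadoop jar wordcount.jar WordCount input output2
       hadoop fs -cat output2/part-* | head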
  32. Run improved version: https://gist.github.com/soren/7213453
  33. Inspect (improved) result
  34. The Web User Interface
      - HDFS (NameNode): http://localhost:8070/
      - MapReduce (JobTracker): http://localhost:8030/
      - File Browser (DataNode): http://localhost:8075/browseDirectory.jsp?namenodeInfoPort
      Note: this is with port forwarding in VirtualBox
      (50030 → 8030, 50070 → 8070, 50075 → 8075)
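     The forwarded ports map to the standard Hadoop 1.x web UIs (50070 NameNode,
     50030 JobTracker, 50075 DataNode). With a VirtualBox NAT adapter, rules
     along these lines reproduce the mapping (the VM name and rule names are
     assumptions; run them while the VM is powered off):

       # Forward the Hadoop web UI ports from the guest to the host
       VBoxManage modifyvm "hadoop-vm" --natpf1 "namenode,tcp,,8070,,50070"
       VBoxManage modifyvm "hadoop-vm" --natpf1 "jobtracker,tcp,,8030,,50030"
       VBoxManage modifyvm "hadoop-vm" --natpf1 "datanode,tcp,,8075,,50075"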
  35. Now you can go play with Hadoop...