How to Use Hadoop with Gfarm
Published in: Technology, Education
Transcript

  • 1. Distributed Parallel Data Processing with Hadoop
    2010/07/02
    Shunsuke Mikami (University of Tsukuba)
  • 2. Overview of the Hadoop-Gfarm plugin
    A plugin that lets Hadoop access files stored on Gfarm.
    Gfarm can be used in place of HDFS.
    HDFS and Gfarm can also be run side by side.
    Hadoop decides which file system to use from the URI:
      HDFS: hdfs://hostname:port
      S3: s3://ID:SECRET@BUCKET
    With this plugin, Gfarm becomes accessible as gfarm:///
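Slide 2 says that Hadoop picks a file system from the URI. The shell sketch below only illustrates that idea by printing the scheme of each example URI from the slide. The helper name `uri_scheme` is hypothetical, and the path suffixes appended to the URIs are made up for illustration; this is not how Hadoop implements the lookup internally.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the scheme of a URI, or "(default)" when
# the argument is a plain path with no scheme.
uri_scheme() {
  case "$1" in
    *:/*) printf '%s\n' "${1%%:*}" ;;   # keep everything before the first ':'
    *)    printf '%s\n' "(default)" ;;
  esac
}

# Example URIs from slide 2; the trailing paths are illustrative only.
for uri in hdfs://hostname:port/path s3://ID:SECRET@BUCKET/key \
           gfarm:///home/mikami /home/mikami; do
  printf '%-32s -> %s\n' "$uri" "$(uri_scheme "$uri")"
done
```

A path without a scheme falls back to whatever file system is configured as the default, which is why the last entry is reported as `(default)`.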
  • 3. Hadoop-Gfarm software stack
    Hadoop MapReduce applications | Hadoop File System Shell
    File System API
    HDFS client library | Gfarm JNI shim layer, Gfarm client library
    HDFS servers | Gfarm servers
  • 4. Installing Hadoop
    GNU/Linux
    Install JDK 1.6.x
    http://hadoop.apache.org/
    The Apache distribution is supported
    Edit the configuration files under hadoop-0.20.2/conf
    $ wget [url]
    $ tar zxf hadoop-0.20.2.tar.gz
    $ cd hadoop-0.20.2
  • 5. Setting up Hadoop-Gfarm
    Check out the source from the SourceForge repository
    Edit build.sh
    Running build.sh copies the build products into Hadoop's library directories automatically
    Specifically, it generates hadoop-gfarm.jar and libGfarmFSNative.so; hadoop-gfarm.jar goes to {HADOOP_HOME}/lib and libGfarmFSNative.so goes to {HADOOP_HOME}/lib/native/Linux-amd64-64 or Linux-i386-32
    svn co https://gfarm.svn.sourceforge.net/svnroot/gfarm/gfarm_hadoop/trunk gfarm_hadoop
    export JAVA_HOME=/usr/java/default
    export HADOOP_HOME=/home/mikami/hadoop-0.20.2
    export GFARM_HOME=/usr/local/gfarm_v2
    ./build.sh
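To check where the native library from slide 5 should end up on a given machine, something like the following sketch may help. The helper `native_dir` is hypothetical, and the mapping from `uname -m` values to the two directory names is an assumption based on Hadoop's usual platform naming (Linux-amd64-64 and Linux-i386-32).

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a machine type (as printed by `uname -m`) to the
# native library directory, relative to HADOOP_HOME.
native_dir() {
  case "$1" in
    x86_64|amd64) echo "lib/native/Linux-amd64-64" ;;
    i?86)         echo "lib/native/Linux-i386-32" ;;
    *)            return 1 ;;   # other platforms are not covered by the slide
  esac
}

HADOOP_HOME=${HADOOP_HOME:-/home/mikami/hadoop-0.20.2}  # path taken from slide 5
if dir=$(native_dir "$(uname -m)"); then
  echo "jar:    $HADOOP_HOME/lib/hadoop-gfarm.jar"
  echo "native: $HADOOP_HOME/$dir/libGfarmFSNative.so"
else
  echo "no known native directory for $(uname -m)" >&2
fi
```

The script only prints the expected locations; it does not copy or verify any files.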
  • 6. Hadoop-side configuration
    hadoop-0.20/conf/core-site.xml
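The contents of core-site.xml did not survive in the transcript of slide 6. As a rough sketch only, the snippet below writes a sample file to a temporary directory. The property `fs.gfarm.impl` and the class name `org.apache.hadoop.fs.gfarmfs.GfarmFileSystem` are assumptions, not taken from the slide, and should be checked against the plugin's README. The `fs.default.name` value follows the note on slide 9.

```shell
#!/usr/bin/env bash
# Write a sample core-site.xml into a temporary directory, so that a real
# Hadoop installation is never touched.
conf_dir=$(mktemp -d)
cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Assumption: binds the gfarm scheme to the plugin's FileSystem class. -->
  <property>
    <name>fs.gfarm.impl</name>
    <value>org.apache.hadoop.fs.gfarmfs.GfarmFileSystem</value>
  </property>
  <!-- Optional: make Gfarm the default file system (see slide 9). -->
  <property>
    <name>fs.default.name</name>
    <value>gfarm:///</value>
  </property>
</configuration>
EOF
echo "Wrote $conf_dir/core-site.xml"
```

Leaving out the second property keeps HDFS (or whatever is already configured) as the default while still allowing explicit gfarm:/// URIs.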
  • 7. Hadoop configuration
    conf/hadoop-env.sh
    conf/masters
    conf/slaves
    export JAVA_HOME=/usr/java/default
    les00
    les02
    les03
    …
    $ ./bin/start-all.sh
  • 8. Settings
    Append to .bashrc:
    export GFARM_HOME=/usr/local/gfarm_v2
    export LD_LIBRARY_PATH=$GFARM_HOME/lib
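The assignment on slide 8 replaces any existing LD_LIBRARY_PATH. If other libraries are already on that path, a variant that prepends instead may be preferable. This is a minimal sketch; the helper `prepend_path` is hypothetical and not part of Gfarm or Hadoop.

```shell
#!/usr/bin/env bash
# Hypothetical helper: put directory $1 in front of the path list $2.
# If $2 is empty, no stray ':' is added.
prepend_path() {
  printf '%s\n' "$1${2:+:$2}"
}

GFARM_HOME=${GFARM_HOME:-/usr/local/gfarm_v2}   # default taken from slide 8
export GFARM_HOME
LD_LIBRARY_PATH=$(prepend_path "$GFARM_HOME/lib" "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```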
  • 9. Usage (Hadoop file system shell)
    Files are accessible as gfarm:///path/name
    % hadoop fs -ls gfarm:///home/mikami/
    Found 1 items
    drwxrwxrwx - 0 2010-06-30 15:09 /home/mikami/system
    % hadoop fs -mkdir gfarm:///home/mikami/dir
    % hadoop fs -ls gfarm:///home/mikami/
    Found 2 items
    drwxrwxrwx - 0 2010-06-30 16:05 /home/mikami/dir
    drwxrwxrwx - 0 2010-06-30 15:09 /home/mikami/system
    If fs.default.name is gfarm:///, the part highlighted in gray on the original slide can be omitted
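The last note on slide 9 refers to gray highlighting that is lost in the transcript; presumably it marks the gfarm:// prefix. Under that assumption, the sketch below models how an absolute path without a scheme could be completed from the default file system. The helper `qualify` is hypothetical and greatly simplified: it ignores relative paths, which Hadoop resolves against a working directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: complete a path using a default file system URI.
# $1: default file system (the fs.default.name value), $2: path or full URI.
qualify() {
  local default_fs=$1 path=$2
  case "$path" in
    *://*) printf '%s\n' "$path" ;;                   # already has a scheme
    /*)    printf '%s\n' "${default_fs%/}$path" ;;    # absolute path: add the default
    *)     echo "relative paths are not handled in this sketch" >&2
           return 1 ;;
  esac
}

qualify gfarm:/// /home/mikami/             # bare path, completed from the default
qualify gfarm:/// hdfs://hostname:port/tmp  # explicit URI, left unchanged
```

With gfarm:/// as the default, `hadoop fs -ls /home/mikami/` and `hadoop fs -ls gfarm:///home/mikami/` would then refer to the same directory.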
  • 10. Usage (running the sample programs)
    % hadoop jar hadoop-0.20.2-examples.jar teragen 10000 gfarm:///home/mikami/input
    …
    % hadoop jar hadoop-0.20.2-examples.jar grep gfarm:///home/mikami/input gfarm:///home/mikami/output AAA
    …
    % hadoop fs -ls gfarm:///home/mikami/
    Found 2 items
    drwxrwxrwx - 0 2010-06-30 16:11 /home/mikami/input
    drwxrwxrwx - 0 2010-06-30 16:11 /home/mikami/output
  • 11. Closing
    https://gfarm.svn.sourceforge.net/svnroot/gfarm/gfarm_hadoop/trunk/README
    [#HADOOP-5635] distributed cache doesn't work with other distributed file systems
    https://issues.apache.org/jira/browse/HADOOP-5635
    mikami@hpcs.cs.tsukuba.ac.jp
