Silicon Valley x Japan / Tech x Business Meetup #12 (2015/04/17)
"An Introduction to Hadoop, a Parallel Distributed Processing Platform, and a Developer's Take on Where to Use It"
NTT DATA, Platform Systems Business Unit
Systems Engineering Division, OSS Professional Services
Akira Ajisaka
12. HDFS Architecture
[Diagram: a file is split into blocks (①②③) that are stored, with replicas, across the DataNodes (SLAVE). The NameNode (MASTER) manages the block metadata and monitors DataNode status via Heartbeats. The Hadoop client reads and writes files through this layout, which is held on HDFS]
OSC 2010 Tokyo/Spring
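The layout above can be sketched in a few lines of plain Java. This is a toy model, not Hadoop code: the class and method names (`HdfsPlacementSketch`, `numBlocks`, `placeBlocks`) are illustrative, and the round-robin placement is a simplification of the NameNode's real placement policy.

```java
import java.util.*;

// Toy model of NameNode-style metadata: split a file into fixed-size blocks
// and record which DataNodes hold each block's replicas.
public class HdfsPlacementSketch {

    // Number of blocks needed for a file of fileSize bytes (ceiling division).
    static int numBlocks(long fileSize, long blockSize) {
        return (int) ((fileSize + blockSize - 1) / blockSize);
    }

    // Assign each block to `replication` distinct DataNodes, round-robin.
    static Map<Integer, List<String>> placeBlocks(int blocks, List<String> dataNodes,
                                                  int replication) {
        Map<Integer, List<String>> meta = new LinkedHashMap<>();
        for (int b = 0; b < blocks; b++) {
            List<String> locations = new ArrayList<>();
            for (int r = 0; r < replication; r++) {
                locations.add(dataNodes.get((b + r) % dataNodes.size()));
            }
            meta.put(b, locations);
        }
        return meta;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("dn1", "dn2", "dn3", "dn4");
        // A 200 MB file with 64 MB blocks needs 4 blocks.
        int blocks = numBlocks(200L * 1024 * 1024, 64L * 1024 * 1024);
        System.out.println(placeBlocks(blocks, nodes, 3));
    }
}
```

The key idea the diagram conveys survives even in this sketch: the master holds only the block-to-node map, while the blocks themselves live on the slaves.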
13. HDFS Under Failure
[Diagram: the same layout with failed components crossed out. If the NameNode fails, HDFS stops or its data is destroyed: the NameNode is a single point of failure. If a DataNode fails, re-replication kicks in from the surviving replicas, so a single DataNode failure is safe]
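The "replication kicks in" step can be sketched as a toy repair routine, again plain Java rather than Hadoop code (names like `ReReplicationSketch` and `repair` are illustrative): when a DataNode dies, any block that fell below the target replication factor gets a new copy on a surviving node.

```java
import java.util.*;

// Toy model of NameNode-driven re-replication after a DataNode failure.
public class ReReplicationSketch {

    // blockLocations: block ID -> set of DataNodes currently holding a replica.
    static Map<Integer, Set<String>> repair(Map<Integer, Set<String>> blockLocations,
                                            String deadNode,
                                            List<String> liveNodes,
                                            int replication) {
        for (Set<String> locs : blockLocations.values()) {
            locs.remove(deadNode);                  // replicas on the dead node are lost
            for (String candidate : liveNodes) {    // copy until the target factor is restored
                if (locs.size() >= replication) break;
                locs.add(candidate);                // a Set ignores nodes that already hold it
            }
        }
        return blockLocations;
    }

    public static void main(String[] args) {
        Map<Integer, Set<String>> meta = new LinkedHashMap<>();
        meta.put(0, new LinkedHashSet<>(Arrays.asList("dn1", "dn2", "dn3")));
        meta.put(1, new LinkedHashSet<>(Arrays.asList("dn2", "dn3", "dn4")));
        repair(meta, "dn2", Arrays.asList("dn1", "dn3", "dn4"), 3);
        System.out.println(meta); // every block is back to 3 replicas, none on dn2
    }
}
```

Note what has no counterpart here: there is no `repair` path for the NameNode's own metadata, which is exactly why the slide calls it a single point of failure.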
14. MapReduce Data Flow
[Diagram: data is handled as <Key, Value> pairs throughout. Map: each input line emits pairs, e.g. "This is a pen." → <This,1> <Pen,1>, "I played tennis." → <Tennis,1>. Shuffle: the pairs from all mappers are grouped by key, e.g. <Pen: [1, 1, 1]>, <This: [1, 1]>. Reduce: each group is aggregated, e.g. <Pen, 3>, <This, 2>]
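The three phases on this slide can be reproduced in a single-process Java sketch. This is not Hadoop code; it just makes the Map → Shuffle → Reduce stages of the slide's word-count example explicit (the class name `MapShuffleReduceSketch` is illustrative).

```java
import java.util.*;

// Toy, single-process word count: Map emits <word, 1> pairs, Shuffle groups
// the pairs by key, Reduce sums each group.
public class MapShuffleReduceSketch {

    static Map<String, Integer> wordCount(List<String> lines) {
        // Map phase: each line becomes a list of <word, 1> pairs.
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\W+")) {
                if (!word.isEmpty()) pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        // Shuffle phase: group values by key, e.g. <pen: [1, 1, 1]>.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        // Reduce phase: sum each group, e.g. <pen, 3>.
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> g : grouped.entrySet()) {
            int sum = 0;
            for (int v : g.getValue()) sum += v;
            counts.put(g.getKey(), sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("This is a pen.", "I played tennis.",
                                           "This pen is my pen.");
        System.out.println(wordCount(lines)); // pen=3 and this=2, as on the slide
    }
}
```

In real MapReduce the three loops run on different machines, and Shuffle moves data over the network; the data model, though, is exactly this.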
16. MapReduce Architecture
[Diagram: the Hadoop client submits a MapReduce job. The JobTracker (MASTER), shown alongside the NameNode, splits the job into Map (M) and Reduce (R) tasks, assigns them to TaskTrackers (SLAVE), and monitors TaskTracker status via Heartbeats. Legend: tasks waiting to run, tasks running, and tasks running speculatively (in competition)]
OSC 2010 Tokyo/Spring
17. MapReduce Under Failure (TaskTracker)
[Diagram: when a TaskTracker fails, the JobTracker (MASTER) notices the missing Heartbeats and reassigns that tracker's tasks to the surviving TaskTrackers, so the MapReduce job still completes]
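The reassignment step can be sketched as a toy scheduler routine, plain Java rather than Hadoop code (`TaskReassignSketch` and `reassign` are illustrative names): tasks recorded against the dead tracker are handed round-robin to the live ones.

```java
import java.util.*;

// Toy model of JobTracker failure handling: move every task that was running
// on a dead TaskTracker onto a surviving tracker.
public class TaskReassignSketch {

    // assignments: task name -> tracker name; modified in place and returned.
    static Map<String, String> reassign(Map<String, String> assignments,
                                        String deadTracker,
                                        List<String> liveTrackers) {
        int i = 0;
        for (Map.Entry<String, String> e : assignments.entrySet()) {
            if (e.getValue().equals(deadTracker)) {
                // round-robin the orphaned task onto a surviving tracker
                e.setValue(liveTrackers.get(i++ % liveTrackers.size()));
            }
        }
        return assignments;
    }

    public static void main(String[] args) {
        Map<String, String> tasks = new LinkedHashMap<>();
        tasks.put("map-1", "tt1");
        tasks.put("map-2", "tt2");
        tasks.put("reduce-1", "tt2");
        reassign(tasks, "tt2", Arrays.asList("tt1", "tt3"));
        System.out.println(tasks); // map-2 and reduce-1 moved off the dead tt2
    }
}
```

This is the TaskTracker-failure case only; as the next slide shows, there is no equivalent recovery path when the JobTracker itself dies.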
18. MapReduce Under Failure (JobTracker)
[Diagram: when the JobTracker fails, task management and status monitoring stop and the MapReduce job fails. The JobTracker is a single point of failure]
26. WordCountのソースコード
public c la s s WordCount {
public s tatic c la s s TokenizerMapper
e x te nds Mapper<Object, Text, Text, IntWritable>{ public s tatic void main(String[] args) throw s Exception {
Configuration conf = ne w Configuration();
private fina l s ta tic IntWritable one = ne w IntWritable(1); String[] otherArgs =
private Text word = ne w Text(); ne w GenericOptionsParser(conf,
args).getRemainingArgs();
public void map(Object key, Text value, Context context if (otherArgs.length != 2) {
) throw s IOException, InterruptedException { System.err.println("Usage: wordcount <in> <out>");
StringTokenizer itr = ne w System.exit(2);
StringTokenizer(value.toString()); }
w hile (itr.hasMoreTokens()) { Job job = ne w Job(conf, "word count");
word.set(itr.nextToken()); job.setJarByClass(WordCount.c la s s );
job.setMapperClass(TokenizerMapper.c la s s );
Map
context.write(word, one);
} job.setCombinerClass(IntSumReducer.c la s s );
} job.setReducerClass(IntSumReducer.c la s s );
}
job.setOutputKeyClass(Text.c la s s );
public s tatic c la s s IntSumReducer job.setOutputValueClass(IntWritable.c la s s );
e x te nds Reducer<Text,IntWritable,Text,IntWritable> { FileInputFormat.addInputP ath(job, ne w Path(otherArgs[0]));
private IntWritable result = ne w IntWritable(); FileOutputFormat.setO utputP ath(job, ne w Path(otherArgs[1]));
System.exit(job.waitForCompletion(true ) ? 0 : 1);
public void reduce(Text key, Iterable<IntWritable> values, }
Context context }
int sum = 0;
) throw s IOException, InterruptedException {
Job
for (IntWritable val : values) {
sum += val.get();
}
result.set(sum);
Reduce 約50行
context.write(key, result); 26
}
} OSC 2010 Tokyo/Spring
31. WordCount Written in Pig

Rawdata = LOAD '/tmp/' USING PigStorage(',') AS (row:chararray);
Words = FOREACH Rawdata GENERATE FLATTEN (TOKENIZE((chararray)$0));
Grouped = GROUP Words BY $0;
Counts = FOREACH Grouped GENERATE COUNT(Words), group;
Ordered = ORDER Counts by $0 DESC;
STORE Ordered INTO 'pig-wordcount';

WordCount in just 6 lines!!!