MapReduce
아키텍트를 꿈꾸는 사람들 (People Who Dream of Becoming Architects): cafe.naver.com/architect1
현수명 (soomong.net, #soomong)
- MapReduce?
- Map + Reduce
- Execution Overview
MapReduce?
- A mechanism for processing large data sets
- Developed within Google
- Borrowed from functional languages
- Works on <key,value> pairs
Execution Overview: Google (figure)
Execution Overview: Hadoop (figure)
Map <Only value>
Map takes as input a function and a sequence of values. It then applies the function to each value in the sequence.
  values -> Map(func, values) -> applied values
Map <Only value>
  f(x) = x*x
  (1,2,3) -> Map(func, values) -> (1,4,9)
  Clojure: (map (fn [x] (* x x)) [1 2 3])
Reduce <Only value>
A reduce combines all the elements of a sequence using a binary operation.
  values -> Reduce(func, values) -> applied value
Reduce <Only value>
  func: +
  (1,4,9) -> Reduce(func, values) -> (14)
  Clojure: (reduce + [1 4 9])
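The value-only Map and Reduce slides can also be tried in Python. This is a minimal sketch using the built-in `map` and `functools.reduce`; the function name `square` is chosen here for illustration and does not come from the slides.

```python
from functools import reduce
from operator import add


def square(x):
    """f(x) = x*x from the Map slide."""
    return x * x


# Map: apply the function to each value in the sequence.
squares = list(map(square, [1, 2, 3]))

# Reduce: combine all elements with a binary operation (+).
total = reduce(add, squares)

print(squares)  # [1, 4, 9]
print(total)    # 14
```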
Map <key,value>
  f(x) = x*x
  (1,2,3) -> Map(func, values) -> (keyA,1) (keyB,4) (keyA,9)
Reduce <key,value>
  func: +
  (keyA,1) (keyB,4) (keyA,9) -> Reduce(func, (key,value)) -> (keyA,10) (keyB,4)
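The <key,value> variant can be sketched the same way. The slides do not say how keyA and keyB are assigned, so the rule below (alternate keys by position in the input) is an assumption made only so that the output matches the slide; `map_with_keys` and `reduce_by_key` are hypothetical names.

```python
from collections import defaultdict


def map_with_keys(values):
    """Square each value and tag it with a key.

    Assumed keying rule (not from the slides): even positions get
    "keyA", odd positions get "keyB".
    """
    return [("keyA" if i % 2 == 0 else "keyB", v * v)
            for i, v in enumerate(values)]


def reduce_by_key(pairs):
    """Sum the values that share a key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


pairs = map_with_keys([1, 2, 3])
result = reduce_by_key(pairs)

print(pairs)   # [('keyA', 1), ('keyB', 4), ('keyA', 9)]
print(result)  # {'keyA': 10, 'keyB': 4}
```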
Map + Reduce
  Input: (1,2,3 … 99998,99999,100000)
  Shards: (1,2,3) (4,5,6) … (99998,99999,…)
  Map(func, values) with f(x) = x*x, one per shard:
    (1,2,3) -> (keyA,1) (keyB,4) (keyA,9)
    (4,5,6) -> (keyA,16) (keyB,25) (keyA,36)
    (99998,99999,…) -> (…,…,…)
  Reduce(func, values) with +:
    first two shards -> (keyA,62) (keyB,29)
    remaining shards -> (…)
    final Reduce -> (result)
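A single-process sketch of the combined picture: split the input into shards, map each shard independently, reduce each shard's pairs, then merge the partial results. The shard size of 3 and the position-based keying rule are assumptions taken from how the diagram appears to be drawn; nothing here runs in parallel, it only shows that the work can be split.

```python
from collections import Counter


def shard(values, size):
    """Split the input into consecutive chunks of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def map_shard(values):
    """Square each value; assumed keying: alternate keyA/keyB by position."""
    return [("keyA" if i % 2 == 0 else "keyB", v * v)
            for i, v in enumerate(values)]


def reduce_pairs(pairs):
    """Sum values per key for one shard."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def map_reduce(values, size=3):
    """Map and reduce every shard, then merge the partial sums."""
    partials = [reduce_pairs(map_shard(s)) for s in shard(values, size)]
    final = Counter()
    for partial in partials:
        final.update(partial)  # Counter.update adds counts per key
    return dict(final)


print(map_reduce([1, 2, 3, 4, 5, 6]))  # {'keyA': 62, 'keyB': 29}
```

Because each shard is mapped and reduced without looking at the others, the per-shard calls could be handed to separate workers; only the final merge needs all partial results.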
Execution Example: Word Counter
  Input: “Stay Hungry Stay Foolish Don’t settle”
  Output:
    Stay 2
    Hungry 1
    Foolish 1
    Don’t 1
    settle 1
Map
map(String key, String value):
  // key: document name
  // value: document contents
  for each word w in value:
    EmitIntermediate(w, "1");
Reduce
reduce(String key, Iterator values):
  // key: a word
  // values: a list of counts
  int result = 0;
  for each v in values:
    result += ParseInt(v);
  Emit(AsString(result));
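The word-count pseudocode translates almost line for line into Python. This is a sketch rather than any framework's API: `map_fn` yields pairs instead of calling EmitIntermediate, `reduce_fn` returns its result instead of calling Emit, and the document name "doc1" is made up.

```python
def map_fn(key, value):
    """key: document name, value: document contents."""
    for word in value.split():
        # Counts are emitted as strings, as in the pseudocode.
        yield (word, "1")


def reduce_fn(key, values):
    """key: a word, values: an iterable of counts (as strings)."""
    result = 0
    for v in values:
        result += int(v)
    return str(result)


intermediate = list(map_fn("doc1", "Stay Hungry Stay Foolish"))
stay_counts = [count for word, count in intermediate if word == "Stay"]

print(intermediate)                   # [('Stay', '1'), ('Hungry', '1'), ('Stay', '1'), ('Foolish', '1')]
print(reduce_fn("Stay", stay_counts)) # 2
```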
(1) Shards the input files
  Input:   Stay Hungry Stay Foolish Don’t settle
  Shard 1: Stay Hungry Stay Foolish
  Shard 2: Don’t settle
(2) Master picks idle workers and assigns each one a map or reduce task
  Shard 1: Stay Hungry Stay Foolish
  Shard 2: Don’t settle
(3) Map worker reads an input shard and runs the Map function -> intermediate <key,value> pairs
  Stay Hungry Stay Foolish -> <Stay,1> <Hungry,1> <Stay,1> <Foolish,1>
  Don’t settle -> <Don’t,1> <settle,1>
(4) Map worker writes the intermediate <key,value> pairs to local disk
  <Stay,1> <Hungry,1> <Stay,1> <Foolish,1>
  <Don’t,1> <settle,1>
(5) Reduce worker reads the intermediate data and sorts it by key
  <Stay,1> <Hungry,1> <Stay,1> <Foolish,1> -> <Foolish,1> <Hungry,1> <Stay,(1,1)>
  <Don’t,1> <settle,1> -> <Don’t,1> <settle,1>
(6) Reduce worker runs the Reduce function -> writes output
  <Foolish,1> <Hungry,1> <Stay,(1,1)> -> <Foolish,1> <Hungry,1> <Stay,2>
  <Don’t,1> <settle,1> -> <Don’t,1> <settle,1>
(7) When all map tasks and reduce tasks have completed, the master wakes up the user program -> return result
  <Foolish,1> <Hungry,1> <Stay,2> <Don’t,1> <settle,1>
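Steps (1) through (7) can be imitated in one process. In this sketch an in-memory list stands in for the map workers' local disk, a single loop plays the role of every worker, and the master's scheduling in step (2) is not modelled at all. `run_word_count` is a hypothetical name.

```python
from itertools import groupby
from operator import itemgetter


def map_fn(shard):
    """(3) Turn one input shard into intermediate <word, 1> pairs."""
    return [(word, 1) for word in shard.split()]


def run_word_count(shards):
    # (3)-(4) Run the map task for each shard; the list stands in
    # for the intermediate files written to local disk.
    intermediate = []
    for shard in shards:
        intermediate.extend(map_fn(shard))

    # (5) Sort by key so that equal keys sit next to each other.
    intermediate.sort(key=itemgetter(0))

    # (6) Reduce: sum the counts of each group of equal keys.
    output = {}
    for word, group in groupby(intermediate, key=itemgetter(0)):
        output[word] = sum(count for _, count in group)

    # (7) Hand the result back to the caller.
    return output


# (1) The input is already split into the two shards from the slides.
shards = ["Stay Hungry Stay Foolish", "Don’t settle"]
result = run_word_count(shards)
print(result["Stay"])  # 2
```

Sorting before grouping matters: `itertools.groupby` only merges adjacent equal keys, which is the same reason the reduce worker sorts its input in step (5).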
Map
map = function () {
    for (var key in this) {
        emit(key, {count: 1});
    }
};
Reduce
reduce = function (key, emits) {
    var total = 0;
    for (var i in emits) {
        total += emits[i].count;
    }
    return {count: total};
};
Execution
mr = db.foo.mapReduce(map, reduce, {out: "mongoDBmapReduce"})
{
        "result" : "mongoDBmapReduce",
        "timeMillis" : 11,
        "counts" : {
                "input" : 4,
                "emit" : 8,
                "output" : 5
        },
        "ok" : 1,
}
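To see where numbers like "input", "emit" and "output" come from, the same map and reduce functions can be imitated in plain Python without a database. The four documents below are invented for illustration; they are not the contents of the presenter's `foo` collection, and this is not the MongoDB API. They were picked so that the counts have the same shape as the shell output: each document emits once per field, including `_id`.

```python
from collections import defaultdict


def map_doc(doc):
    """Emit (field name, {"count": 1}) for every field in the document."""
    return [(key, {"count": 1}) for key in doc]


def reduce_counts(key, emits):
    """Sum the count of every emitted value for one key."""
    return {"count": sum(e["count"] for e in emits)}


def map_reduce(docs):
    grouped = defaultdict(list)
    emitted = 0
    for doc in docs:
        for key, value in map_doc(doc):
            grouped[key].append(value)
            emitted += 1
    output = {key: reduce_counts(key, emits) for key, emits in grouped.items()}
    counts = {"input": len(docs), "emit": emitted, "output": len(output)}
    return output, counts


# Hypothetical documents: four inputs, two fields each, five distinct keys.
docs = [
    {"_id": 1, "a": 1},
    {"_id": 2, "b": 1},
    {"_id": 3, "c": 1},
    {"_id": 4, "d": 1},
]
output, counts = map_reduce(docs)
print(counts)         # {'input': 4, 'emit': 8, 'output': 5}
print(output["_id"])  # {'count': 4}
```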
Reference
  Google MapReduce: http://code.google.com/intl/ko-KR/edu/parallel/mapreduce-tutorial.html
  MongoDB MapReduce: http://kylebanker.com/blog/2009/12/mongodb-map-reduce-basics/
  Hadoop MapReduce: http://hadoop.apache.org/common/docs/current/mapred_tutorial.html
Thank you
