MapReduce Code as Seen Through the Samples
Shinpei Ohtani
I only got as far as the Mapper, but here it is for now.
1.
MapReduce @shot6
2.
Cloudera
Stack diagram: Avro, Sqoop, Desktop, Pig, Hive, HBase, Chukwa, MapReduce, ZooKeeper, HDFS, Core
3.
Cloudera
Stack diagram: Avro, Sqoop, Desktop, Pig, Hive, HBase, Chukwa, MapReduce, ZooKeeper, HDFS, Core
4.
• MapReduce
  – Mapper/Reducer
5.
MapReduce
• WordCount
  – Mapper/Reducer and running a Job
  – How InputFormat/OutputFormat are used
  – HDFS (FileSystem)
  – How Writable is used
6.
WordCount
• The "Hello World" of Hadoop
• API (org.apache.hadoop.mapreduce)
• API
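As a rough illustration of what WordCount computes, the sketch below walks through the map, shuffle, and reduce steps in memory using plain Java. It does not use the Hadoop API; `WordCountSketch` and its methods are names made up for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for WordCount; hypothetical names, no Hadoop classes involved.
class WordCountSketch {

    // Map step: emit a (word, 1) pair for every whitespace-separated token in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(new SimpleEntry<>(token, 1));
            }
        }
        return out;
    }

    // Reduce step: sum all counts collected for one word.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Driver: map every line, group values by key (the "shuffle"), then reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("hello hadoop", "hello world")));
    }
}
```

In a real job the grouping step is done by the framework between the Mapper and the Reducer; here it is a single `TreeMap` so the whole flow fits in one method.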
7.
Grep
• grep
  – Runs two jobs: grepJob and sortJob
  – How JobConf/Mapper/Reducer are used
  – The Mapper is RegexMapper; <Text, Long> pairs are written in SequenceFile format
  – sortJob
  – Output
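The two-job flow on this slide can be imitated in memory: a first pass counts every regex match, and a second pass orders the counts. This is a plain-Java sketch under that assumption, not the Hadoop example itself; `GrepSketch`, `grep`, and `sortByCount` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// In-memory stand-in for the two-job Grep flow; hypothetical names, no Hadoop classes.
class GrepSketch {

    // Step 1 (the grepJob role): count each substring that matches the pattern.
    static Map<String, Long> grep(List<String> lines, String regex) {
        Pattern pattern = Pattern.compile(regex);
        Map<String, Long> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = pattern.matcher(line);
            while (m.find()) {
                counts.merge(m.group(), 1L, Long::sum);
            }
        }
        return counts;
    }

    // Step 2 (the sortJob role): order the (match, count) pairs by descending count.
    static List<Map.Entry<String, Long>> sortByCount(Map<String, Long> counts) {
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(counts.entrySet());
        sorted.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return sorted;
    }
}
```

Splitting counting and sorting into two steps mirrors why the sample needs two jobs: the sort key (the count) is only known once the first pass has finished.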
8.
Grep
• JobConf
• Mapper
• Reducer
9.
o.a.hadoop.mapred.JobConf
• mapred-default.xml
• conf/mapred-site.xml
• The XML itself is handled as DOM
• Child configuration
  – JobConf child = new JobConf(conf, jar);
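The lookup precedence described here (defaults first, site settings on top, then a child configuration) can be imitated with `java.util.Properties`, which supports a chain of defaults. This is only an analogy for the precedence: JobConf loads XML resources and is not built on Properties chaining. The class name and the property values are illustrative assumptions.

```java
import java.util.Properties;

// Analogy only: java.util.Properties stands in for JobConf; the values are illustrative.
class LayeredConfSketch {

    // Build a three-level lookup chain: defaults <- site overrides <- per-job child.
    static Properties buildChild() {
        Properties defaults = new Properties();        // plays the role of mapred-default.xml
        defaults.setProperty("mapred.job.tracker", "local");
        defaults.setProperty("mapred.reduce.parallel.copies", "5");

        Properties site = new Properties(defaults);    // plays the role of conf/mapred-site.xml
        site.setProperty("mapred.job.tracker", "your-site:9001");

        Properties child = new Properties(site);       // per-job settings layered on top
        child.setProperty("mapred.job.name", "grep-search");
        return child;
    }

    public static void main(String[] args) {
        Properties child = buildChild();
        // The site value wins over the default; untouched keys fall through to the defaults.
        System.out.println(child.getProperty("mapred.job.tracker"));
        System.out.println(child.getProperty("mapred.reduce.parallel.copies"));
    }
}
```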
10.
mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>your-site:9001</value>
  </property>
</configuration>
11.
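o.a.hadoop.mapred.Mapper
• The Mapper interface
• One Mapper per InputSplit
• Invoked through MapTask/MapRunner
• map(KEY, VALUE, COLLECTOR, REPORTER)
  – KEY: the Map input key; VALUE: the Map input value
  – COLLECTOR: collects the output
  – REPORTER: API for reporting progress
• MapReduceBase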
12.
o.a.hadoop.mapred.MapTask
• Runs the Map
• initialize (in Task, shared with the Reducer side)
  – Task state (o.a.h.mapred.TaskStatus.State)
    • RUNNING, SUCCEEDED, FAILED, UNASSIGNED, KILLED, COMMIT_PENDING, FAILED_UNCLEAN, KILLED_UNCLEAN
  – Creates the OutputCommitter
    • Handles the Task's output
    • Output location
      – mapred.work.output.dir
13.
o.a.h.mapred.MapTask (cont.)
• run calls runOldMapper
• InputSplit from JobClient
• RecordReader
14.
o.a.h.mapred.MapTask (cont. 2)
• Preparing output for Reduce
  – spill
    • $mapred.local.dir/taskTracker/jobcache/${taskid}/output/spill${spillNumber}.out
  – Output for the Reducer
    • Combiner: min.num.spills.for.combine controls whether the combiner runs
  – Output through RecordWriter
• MapRunner
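The spill step can be pictured as a bounded buffer that is sorted and written out whenever it fills up. The sketch below is a simplification under stated assumptions: the threshold is a record count rather than bytes, each spill is an in-memory sorted map rather than a spill file, and a summing combiner is applied at every spill. `SpillSketch` and its members are hypothetical names.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified spill buffer; hypothetical names. Real spills are written to local files,
// here each spill is just a sorted in-memory map.
class SpillSketch {
    private final int capacity;                                            // records held before spilling
    private final List<Map.Entry<String, Integer>> buffer = new ArrayList<>();
    final List<Map<String, Integer>> spills = new ArrayList<>();           // one sorted map per spill

    SpillSketch(int capacity) {
        this.capacity = capacity;
    }

    // Collect one map output record; spill as soon as the buffer is full.
    void collect(String key, int value) {
        buffer.add(new SimpleEntry<>(key, value));
        if (buffer.size() >= capacity) {
            spill();
        }
    }

    // Sort the buffered records by key and combine (sum) the values that share a key.
    void spill() {
        if (buffer.isEmpty()) {
            return;
        }
        Map<String, Integer> sorted = new TreeMap<>();
        for (Map.Entry<String, Integer> kv : buffer) {
            sorted.merge(kv.getKey(), kv.getValue(), Integer::sum);
        }
        spills.add(sorted);
        buffer.clear();
    }
}
```

A caller would invoke `spill()` once more after the last record so that a partially filled buffer is not lost.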
15.
o.a.h.mapred.MapRunner
• The default MapRunnable implementation
  – Set with mapred.map.runner.class
  – Hadoop also has PipeMapRunner
  – MultithreadedMapRunner runs the Map with multiple threads
16.
o.a.h.mapred.MapRunner (cont.)
• run(RecordReader, OutputCollector, Reporter)
  – RecordReader: the reader for the Split, obtained from the InputFormat (InputFormat/RecordReader)
17.
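MapTask (sequence diagram)
Participants: MapTask, MapRunner, Mapper, RecordReader, OutputCollector
• MapTask creates the InputSplit and the SpillThread, then calls run
• MapRunner calls createKey() and createValue(), then next(key, value) until EOF
• Each record is passed to map(key, value, outputCollector, reporter), and the output is spilled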
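The call sequence in this diagram boils down to a small loop. The sketch below reproduces it with minimal stand-in types written for this example; they are not the Hadoop interfaces, the Reporter argument is left out, and `linesReader` is a hypothetical reader that serves lines from a list.

```java
import java.util.Iterator;
import java.util.List;

// Minimal stand-ins for the old-API types; these are NOT the Hadoop interfaces.
class MapRunnerSketch {

    // Reusable holders, mimicking how key/value objects are reused between records.
    static class LongHolder { long value; }
    static class TextHolder { String value; }

    interface RecordReader {
        LongHolder createKey();
        TextHolder createValue();
        boolean next(LongHolder key, TextHolder value);   // returns false at end of input
    }

    interface OutputCollector {
        void collect(String key, int value);
    }

    interface Mapper {
        void map(LongHolder key, TextHolder value, OutputCollector output);
    }

    // The loop itself: create the key/value objects once, then feed each record to map().
    static void run(RecordReader input, Mapper mapper, OutputCollector output) {
        LongHolder key = input.createKey();
        TextHolder value = input.createValue();
        while (input.next(key, value)) {
            mapper.map(key, value, output);
        }
    }

    // Hypothetical reader that serves lines from a list, keyed by line index.
    static RecordReader linesReader(List<String> lines) {
        Iterator<String> it = lines.iterator();
        return new RecordReader() {
            private long index = 0;
            public LongHolder createKey() { return new LongHolder(); }
            public TextHolder createValue() { return new TextHolder(); }
            public boolean next(LongHolder key, TextHolder value) {
                if (!it.hasNext()) {
                    return false;
                }
                key.value = index++;
                value.value = it.next();
                return true;
            }
        };
    }
}
```

Because the same key and value objects are refilled on every `next` call, a mapper that wants to keep a record beyond the current call has to copy it.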
18.
m(_ _)m
19.
• Mapper
  – JobConf
  – Mapper/MapRunner/MapTask
• Reducer
  – Running the Reducer
  – InputFormat/RecordReader
20.
o.a.h.mapred.Reducer
• The Reducer interface
• InputSplit, Mapper
• ReduceTask/ReduceRunner
• reduce(KEY, Iterator<VALUE>, COLLECTOR, REPORTER)
  – KEY: the key; Iterator<VALUE>: the values for that key
  – COLLECTOR: collects the output
  – REPORTER: API for reporting progress
• MapReduceBase
21.
o.a.h.mapred.ReduceTask
• SHUFFLE
• ReduceTask.ReduceCopier
  – fetchOutputs (Merger.MergeQueue)
    • Map outputs are copied in parallel, up to mapred.reduce.parallel.copies at a time
  – MapOutputCopier
    • LocalFSMerger merges Map outputs on disk
    • InMemFSMergeThread merges in memory
    • GetMapEventsThread
      – Map events
      – < , MapOutputLocation(taskId, host, httpUrl)>
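The idea of copying map outputs with a bounded number of copier threads can be sketched with a fixed-size thread pool. This is an illustration only: `ParallelCopySketch` is a hypothetical name, the `fetch` function stands in for the copy from a TaskTracker, and the pool size plays the role of mapred.reduce.parallel.copies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Sketch of fetching map outputs with a bounded number of copier threads.
// Hypothetical names; fetch stands in for the real copy over the network.
class ParallelCopySketch {

    static List<String> fetchAll(List<String> locations,
                                 int parallelCopies,
                                 Function<String, String> fetch) {
        ExecutorService copiers = Executors.newFixedThreadPool(parallelCopies);
        try {
            // Submit one copy task per map output location.
            List<Future<String>> pending = new ArrayList<>();
            for (String location : locations) {
                Callable<String> task = () -> fetch.apply(location);
                pending.add(copiers.submit(task));
            }
            // Collect the results in submission order.
            List<String> fetched = new ArrayList<>();
            for (Future<String> f : pending) {
                fetched.add(f.get());
            }
            return fetched;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while copying", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("copy failed", e.getCause());
        } finally {
            copiers.shutdown();
        }
    }
}
```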
22.
o.a.h.mapred.ReduceTask
• run(RecordReader, OutputCollector, Reporter)
• SORT
  – Built from the data in memory and on disk
    • RawKeyValueIterator
  – Creates the Reducer
  – Creates the RecordWriter
  – Runs over ReduceValuesIterator
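The sort step merges already-sorted segments into one key-ordered stream, and the reducer then sees each key once with all of its values. The sketch below shows that idea with a priority queue and a summing reduce. It assumes in-memory segments and uses hypothetical names (`MergeSketch`, `kv`, `merge`, `reduceSums`), not Hadoop's own merger classes.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// K-way merge of sorted segments followed by per-key grouping; hypothetical names.
class MergeSketch {

    // Convenience factory for a (key, value) pair.
    static Map.Entry<String, Integer> kv(String key, int value) {
        return new SimpleEntry<>(key, value);
    }

    // Cursor over one sorted segment, exposing its current head record.
    static class Cursor {
        final Iterator<Map.Entry<String, Integer>> it;
        Map.Entry<String, Integer> head;

        Cursor(List<Map.Entry<String, Integer>> segment) {
            it = segment.iterator();
            advance();
        }

        void advance() {
            head = it.hasNext() ? it.next() : null;
        }
    }

    // Merge sorted segments into one list ordered by key.
    static List<Map.Entry<String, Integer>> merge(List<List<Map.Entry<String, Integer>>> segments) {
        PriorityQueue<Cursor> queue =
            new PriorityQueue<>((a, b) -> a.head.getKey().compareTo(b.head.getKey()));
        for (List<Map.Entry<String, Integer>> segment : segments) {
            Cursor c = new Cursor(segment);
            if (c.head != null) {
                queue.add(c);
            }
        }
        List<Map.Entry<String, Integer>> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor c = queue.poll();          // segment with the smallest head key
            merged.add(c.head);
            c.advance();
            if (c.head != null) {
                queue.add(c);                 // re-insert with its next record
            }
        }
        return merged;
    }

    // Group consecutive records with the same key and sum their values (one "reduce" per key).
    static List<Map.Entry<String, Integer>> reduceSums(List<Map.Entry<String, Integer>> merged) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        int i = 0;
        while (i < merged.size()) {
            String key = merged.get(i).getKey();
            int sum = 0;
            while (i < merged.size() && merged.get(i).getKey().equals(key)) {
                sum += merged.get(i).getValue();
                i++;
            }
            out.add(kv(key, sum));
        }
        return out;
    }
}
```

Grouping works with a single forward pass only because the merged stream is sorted, which is why the sort has to finish before the reducer can be fed.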