Introduction to MapReduce





MapReduce
•   Google
•   Google
    •
    •                                        Map    Reduce
• Google                                           map   reduce

•                    MapReduce
    •   Map
        –     [1,2,3,4] – (*2) → [2,4,6,8]
    •   Reduce
        –     [1,2,3,4] – (sum) → 10


–                              (Divide and Conquer)
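
The two list operations above can be sketched in plain Java, with no Hadoop involved. The class and method names (MapReduceOnList, doubleAll, sum) are made up for this illustration; the point is only that map transforms every element independently, while reduce folds the whole list into one value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MapReduceOnList {
    // Map: apply the same function (*2) to every element independently.
    static List<Integer> doubleAll(List<Integer> xs) {
        return xs.stream().map(x -> x * 2).collect(Collectors.toList());
    }

    // Reduce: fold all elements into a single value (sum).
    static int sum(List<Integer> xs) {
        return xs.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4);
        System.out.println(doubleAll(input)); // [2, 4, 6, 8]
        System.out.println(sum(input));       // 10
    }
}
```

Because each element is mapped on its own, the map step can be split across machines without coordination, which is what makes the divide-and-conquer approach workable.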




                                                                  Copyright 2009 - Trend Micro Inc.
MapReduce
•   MapReduce        Google
                                                       Map   Reduce
       MapReduce
•
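    – Map: processes the "input" key/value pairs and produces a set of
      intermediate key/value pairs
    – Reduce: merges all intermediate values that share the same intermediate key
      and produces the output key/value pairs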
•                MapReduce




•
    –
    –
    –


•

    http://www.dbms2.com/2008/08/26/known-applications-of-mapreduce/




MapReduce
•
    – map         (K1, V1) → list(K2, V2)
    – reduce       (K2, list(V2)) → list(K3, V3)

• grep
    – Map: (offset, line) → [(match, 1)]
    – Reduce: (match, [1, 1, ...]) → [(match, n)]
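
The grep signatures above can be tried out with a small single-process sketch. This is not Hadoop code: GrepSketch, its methods and the in-memory grouping step are hypothetical stand-ins for what the framework does between the map and reduce phases.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GrepSketch {
    // Map: (offset, line) -> [(match, 1)], one pair per occurrence of the pattern.
    // The offset is unused here; it is kept only to mirror the (K1, V1) signature.
    static List<Map.Entry<String, Integer>> map(long offset, String line, Pattern pattern) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        Matcher m = pattern.matcher(line);
        while (m.find()) {
            out.add(new AbstractMap.SimpleEntry<>(m.group(), 1));
        }
        return out;
    }

    // Reduce: (match, [1, 1, ...]) -> (match, n)
    static Map.Entry<String, Integer> reduce(String match, List<Integer> ones) {
        int n = 0;
        for (int one : ones) {
            n += one;
        }
        return new AbstractMap.SimpleEntry<>(match, n);
    }

    // Driver: run map over every line, group the pairs by key, then reduce each group.
    static Map<String, Integer> run(List<String> lines, String regex) {
        Pattern pattern = Pattern.compile(regex);
        Map<String, List<Integer>> grouped = new TreeMap<>();
        long offset = 0;
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(offset, line, pattern)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
            offset += line.length() + 1; // assumes one newline byte per line
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            Map.Entry<String, Integer> r = reduce(e.getKey(), e.getValue());
            result.put(r.getKey(), r.getValue());
        }
        return result;
    }
}
```

For example, run(Arrays.asList("open file", "open socket", "close file"), "open|close") groups two (open, 1) pairs and one (close, 1) pair, giving {close=1, open=2}.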

• MapReduce                :




‧   ➝
‧       ➝




Word Count
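
A minimal word count sketch in plain Java, assuming words are separated by whitespace. WordCountSketch and wordCount are illustrative names, and the TreeMap merge stands in for the shuffle and reduce phases that a MapReduce framework would distribute across machines.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountSketch {
    static Map<String, Integer> wordCount(List<String> lines) {
        // Map phase: emit (word, 1) for every word of every line.
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
                }
            }
        }
        // Shuffle and reduce phase: bring equal keys together and sum their values.
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> kv : pairs) {
            counts.merge(kv.getKey(), kv.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2, world=1}
        System.out.println(wordCount(Arrays.asList("hello world", "hello hadoop")));
    }
}
```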




MapReduce

•               (Distributed Grep)
    –                                (pattern)

•               (Distributed Sort)
    –

•        URL               (Count of URL Access Frequency)
    –     Web                  URL




MapReduce




Hadoop      MapReduce

• Apache Hadoop      Google   MapReduce
   –              MapReduce
   –   Java
   –   Hadoop             (HDFS)
• Yahoo!
• Google, Yahoo!, IBM, Amazon          Hadoop
•          (Trend Micro)    Hadoop MapReduce




Hadoop MapReduce

 •     Map/Reduce framework
         – JobTracker
         – TaskTracker
 •     JobTracker
         – Job
         –     Job            JobTracker                                Job.
 •     TaskTrackers
        •         Job




Hadoop MapReduce
  class MyJob {

      class Map {        // Map implementation
      }
      class Reduce {     // Reduce implementation
      }

      main() {
          // configure the job
          JobConf conf = new JobConf(MyJob.class);
          conf.setInputPath(…);
          conf.setOutputPath(…);
          conf.setMapperClass(Map.class);
          conf.setReducerClass(Reduce.class);
          // run the job
          JobClient.runJob(conf);
      }
  }
•
    –
    –
    –

    –
        HDFS                                 MapReduce
•
    –
    –
         •
         •


             , GUID,       ,         ,
         1, 123, 131231231, VSAPI, open file
         2, 456, 123123123, VSAPI, connect internet




Map
•        Mapper                map()
•   Map : (K1, V1) → list(K2, V2)


map( WritableComparable, Writable,
  OutputCollector, Reporter)


•         input                                map()
•   OutputCollector               collect() method

OutputCollector.collect( WritableComparable,Writable )




Map

import java.io.IOException;
import java.util.Calendar;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

class MapClass extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text hour = new Text();

    public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        String line = value.toString();
        String[] token = line.split(",");
        // In the sample log lines, field 1 is the GUID and field 2 the timestamp.
        String timestamp = token[2].trim();
        Calendar c = Calendar.getInstance();
        c.setTimeInMillis(Long.parseLong(timestamp));
        // HOUR_OF_DAY is 0-23; Calendar.HOUR would merge AM and PM hours.
        Integer h = c.get(Calendar.HOUR_OF_DAY);
        hour.set(h.toString());
        // Emit (hour, 1) for this log line.
        output.collect(hour, one);
    }
}
Reduce
•     Reducer                   reduce() method
• Reduce : (K2, list(V2)) → list(K3, V3)

     reduce (WritableComparable, Iterator,
              OutputCollector, Reporter)



• OutputCollector            collect() method

   OutputCollector.collect( WritableComparable,Writable )




Reduce

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

class ReduceClass extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

    IntWritable SumValue = new IntWritable();

    public void reduce(Text key, Iterator<IntWritable> values,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        // Add up the 1s emitted by the mapper for this key.
        int sum = 0;
        while (values.hasNext())
            sum += values.next().get();
        SumValue.set(sum);
        output.collect(key, SumValue);
    }
}
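
To check the map and reduce logic without a cluster, the same computation can be run in one process. The sketch below makes several assumptions that are not in the slides: the timestamp is taken to be the third comma-separated field (as in the sample log lines), it is read as milliseconds, and the hour is computed in UTC so that the result does not depend on the machine's time zone. HourHistogramSketch, hourOf and countByHour are hypothetical names.

```java
import java.util.Arrays;
import java.util.Calendar;
import java.util.List;
import java.util.Map;
import java.util.TimeZone;
import java.util.TreeMap;

public class HourHistogramSketch {
    // Map step: one log line -> the hour (0-23, UTC) of its timestamp field.
    static int hourOf(String line) {
        String[] token = line.split(",");
        long millis = Long.parseLong(token[2].trim()); // assumed: field 2 is the timestamp
        Calendar c = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        c.setTimeInMillis(millis);
        return c.get(Calendar.HOUR_OF_DAY);
    }

    // Reduce step: sum the 1s emitted for each hour.
    static Map<Integer, Integer> countByHour(List<String> lines) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            counts.merge(hourOf(line), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
                "1, 123, 131231231, VSAPI, open file",
                "2, 456, 123123123, VSAPI, connect internet");
        System.out.println(countByHour(log)); // {10=1, 12=1}
    }
}
```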



•   JobConf
    – Mapper, Reducer, InputFormat, OutputFormat, Combiner, Partitioner

    –
    –
    –
        • map    reduce
        •


•                         JobClient                             JobConf


	   JobClient.runJob(conf);
	   JobClient.submitJob(conf);
	   conf.setJobEndNotificationURI(URI);   // a JobConf setter, not a JobClient method



Main Function
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

class MyJob {
    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(MyJob.class);
        conf.setJobName("Calculate feedback log time distribution");
        // set path
        conf.setInputPath(new Path(args[0]));
        conf.setOutputPath(new Path(args[1]));
        // set map reduce
        conf.setOutputKeyClass(Text.class);            // output key: the hour
        conf.setOutputValueClass(IntWritable.class);   // output value: the count for that hour
        conf.setMapperClass(MapClass.class);
        conf.setCombinerClass(ReduceClass.class);
        conf.setReducerClass(ReduceClass.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        // run
        JobClient.runJob(conf);
    }
}
1.   Compile
     –   javac -classpath hadoop-*-core.jar -d MyJava MyJob.java
2.   Package
     –   jar -cvf MyJob.jar -C MyJava .
3.   Run
     –   bin/hadoop jar MyJob.jar MyJob input/ output/




• bin/hadoop jar MyJob.jar MyJob input/ output/




Web Console
http://172.16.203.132:50030/




Hadoop MapReduce
 • Mapper                  ?
         – Mapper       Input          Input      Hadoop
                        Mapper
         –      JobConf       setNumMapTasks(int)     Hadoop
           Mapper                   Hadoop


 • Reducer                 ?
         –         JobConf     JobConf.setNumReduceTasks(int)
                 Reducer
         –       Reducer                            Reducer
                       MapReduce Map Reduce
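
The reducer count passed to JobConf.setNumReduceTasks(int) is also the number of partitions that intermediate keys are split into. As a rough sketch, the method below uses the same arithmetic as Hadoop's default HashPartitioner, but applies it to String.hashCode() rather than to a Hadoop Text key, so the partition numbers it produces will not match a real job. PartitionSketch and partitionFor are illustrative names.

```java
public class PartitionSketch {
    // Clear the sign bit so the result is never negative, then take the
    // remainder to pick one of numReduceTasks partitions.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 4; // hypothetical reducer count
        for (int hour = 0; hour < 24; hour++) {
            String key = Integer.toString(hour);
            System.out.println("hour " + key + " -> reducer " + partitionFor(key, numReduceTasks));
        }
    }
}
```

Every occurrence of a key lands in the same partition, which is why a single reducer sees all values for that key.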




Non-Java Interface
• Hadoop Pipes
  –   MapReduce       C++ API
  – C++             java
• Hadoop Streaming
  –               MapReduce




• Google MapReduce
  – http://labs.google.com/papers/mapreduce.html
• Google                   MapReduce
  – http://code.google.com/edu/submissions/mapreduce/listing.html
• Google                   MapReduce
  – http://code.google.com/edu/submissions/mapreduce-minilecture/listing.html
• Hadoop
  – http://hadoop.apache.org/core/




•              Eclipse                 MapReduce                           (IBM
        )
    –          Eclipse          Hadoop
    –
            • http://code.google.com/edu/parallel/tools/hadoopvm/hadoop-eclipse-plugin.jar
• Hadoop                       (Google                      )
    –        VMware                                 Hadoop


               VMware                                     Google
    –
            • http://code.google.com/edu/parallel/tools/hadoopvm/index.html


