©2016 TechIPm, LLC All Rights Reserved http://www.techipm.com/
Understanding Big Data Platform from Patents
The Hadoop big data platform is based on the MapReduce framework. Google patent US7650331, titled “System
and method for efficient large-scale data processing,” described the MapReduce framework for the first time. Its
claim 1 explains the functions of the Map and Reduce processes as follows.
A system for large-scale processing of data, comprising:
a plurality of processes executing on a plurality of interconnected processors; (Distributed Processor (File) System)
the plurality of processes including a master process, for coordinating a data processing job for processing a set of
input data, and worker processes; (JobTracker)
the master process, in response to a request to perform the data processing job, assigning input data blocks of the
set of input data to respective ones of the worker processes; (<key, value> pairs)
each of a first plurality of the worker processes including an application-independent map module for retrieving a
respective input data block assigned to the worker process by the master process and applying an
application-specific map operation to the respective input data block to produce intermediate data values, wherein at least a
subset of the intermediate data values each comprises a key/value pair, and wherein at least two of the first
plurality of the worker processes operate simultaneously so as to perform the application-specific map operation in
parallel on distinct, respective input data blocks; (Map Step)
a partition operator for processing the produced intermediate data values to produce a plurality of intermediate data
sets, wherein each respective intermediate data set includes all key/value pairs for a distinct set of respective keys,
and wherein at least one of the respective intermediate data sets includes respective ones of the key/value pairs
produced by a plurality of the first plurality of the worker processes; and (Intermediate Step)
each of a second plurality of the worker processes including an application-independent reduce module for
retrieving data, the retrieved data comprising at least a subset of the key/value pairs from a respective intermediate
data set of the plurality of intermediate data sets and applying an application-specific reduce operation to the
retrieved data to produce final output data corresponding to the distinct set of respective keys in the respective
intermediate data set of the plurality of intermediate data sets, and wherein at least two of the second plurality of
the worker processes operate simultaneously so as to perform the application-specific reduce operation in parallel
on multiple respective subsets of the produced intermediate data values. (Reduce Step)
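The three annotated steps of the claim (Map, Intermediate, Reduce) can be illustrated with a small word-count job. The following is only a minimal single-machine sketch, not the patented system or Hadoop's API: the function names (map_words, partition, reduce_counts, run_job) are hypothetical, a thread pool stands in for the worker processes, and a CRC32 hash is an assumed way of assigning keys to intermediate data sets.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_words(block):
    """Application-specific map operation: emit a (word, 1) pair per word."""
    return [(word.lower(), 1) for word in block.split()]


def partition(intermediate, num_partitions):
    """Partition operator: every pair with the same key lands in the same set."""
    partitions = [defaultdict(list) for _ in range(num_partitions)]
    for key, value in intermediate:
        index = zlib.crc32(key.encode("utf-8")) % num_partitions
        partitions[index][key].append(value)
    return partitions


def reduce_counts(partition_data):
    """Application-specific reduce operation: sum the values of each key."""
    return {key: sum(values) for key, values in partition_data.items()}


def run_job(blocks, map_fn, reduce_fn, num_reducers=2):
    """Stand-in for the master process: assign blocks, partition, reduce."""
    with ThreadPoolExecutor() as pool:
        # Map step: each worker handles a distinct input data block.
        mapped = pool.map(map_fn, blocks)
        intermediate = [pair for pairs in mapped for pair in pairs]
        # Intermediate step: build one intermediate data set per reducer.
        partitions = partition(intermediate, num_reducers)
        # Reduce step: each worker reduces its own intermediate data set.
        result = {}
        for part in pool.map(reduce_fn, partitions):
            result.update(part)
    return result


blocks = ["big data platform", "data platform patents", "big data"]
counts = run_job(blocks, map_words, reduce_counts)
```

Because the partition step sends each key to exactly one intermediate data set, the reducers never need to coordinate with one another, which is what allows the reduce operations to run in parallel.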
MapR’s patent application US20110313973, titled “Map-Reduce Ready Distributed File System,” further
developed the MapReduce framework, including a shuffle function that uses the distributed file system (DFS), in its
claim 1 as follows.
A map-reduce compatible shuffle function, comprising:
a distributed file system; and
a map-reduce system, wherein each map function writes to the distributed file system and each reduce function
reads input from the distributed file system.
The shuffle step redistributes the produced intermediate data from the map step, such that all data belonging to one
key is located on the same worker node.
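A shuffle that passes through a shared file system, as the claim describes, might look like the sketch below. It is an illustration under stated assumptions, not MapR's implementation: a local temporary directory stands in for the distributed file system, and the file naming scheme and the helper names (partition_of, map_task, reduce_task, word_count) are hypothetical.

```python
import json
import tempfile
import zlib
from collections import defaultdict
from pathlib import Path


def partition_of(key, num_reducers):
    """Pick the reduce partition for a key (assumed CRC32 hash scheme)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def map_task(task_id, block, shared_dir, num_reducers):
    """Map function: write (word, 1) pairs to one file per reduce partition."""
    buckets = defaultdict(list)
    for word in block.split():
        buckets[partition_of(word, num_reducers)].append([word, 1])
    for part, pairs in buckets.items():
        path = shared_dir / f"map{task_id}-part{part}.json"
        path.write_text(json.dumps(pairs))


def reduce_task(part, shared_dir):
    """Reduce function: read every map output file for this partition."""
    totals = defaultdict(int)
    for path in sorted(shared_dir.glob(f"map*-part{part}.json")):
        for word, count in json.loads(path.read_text()):
            totals[word] += count
    return dict(totals)


def word_count(blocks, num_reducers=2):
    """Run all map tasks, then all reduce tasks, over a shared directory."""
    with tempfile.TemporaryDirectory() as tmp:
        shared_dir = Path(tmp)  # stand-in for the distributed file system
        for task_id, block in enumerate(blocks):
            map_task(task_id, block, shared_dir, num_reducers)
        return [reduce_task(part, shared_dir) for part in range(num_reducers)]


outputs = word_count(["big data platform", "data platform patents", "big data"])
```

Each entry of outputs holds the totals for one reduce partition, and no word appears in more than one entry, which is the property the shuffle step is meant to guarantee.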
Hadoop is a platform designed for large datasets (datasets measured in terabytes, petabytes, or more)
that leverages the MapReduce framework. However, many existing websites and applications are built on
systems that differ greatly from those that can take advantage of large quantities of data. To take advantage of the
Hadoop big data platform, these systems have to be re-engineered for it. Treasure Data’s
patent application US20130124483, titled “System and method for operating a big-data platform,” illustrates a
system for integrating existing websites and applications with the big data platform without the re-engineering
process.
The system for operating a big data platform includes a data analysis platform that receives discrete client data
(formatted as a plurality of key-value pairs in row format); a network-accessible distributed storage system (hosted
on a distributed cloud storage system such as Amazon's S3/EC2) that stores the client data in a real-time storage
system and merges the client data into a columnar-based distributed archive storage system (using MapReduce);
and a query interface that receives a data query request and selectively interfaces with the client data from the real-time
storage system and the archive storage system according to the query.
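The data flow in this description (row-format ingestion, merge into a columnar archive, queries over both stores) can be sketched in a few lines. This is a toy in-memory model under stated assumptions, not Treasure Data's system: the class BigDataPlatform and its methods are hypothetical, and the merge is a plain loop where the application describes a MapReduce job.

```python
class BigDataPlatform:
    """Toy model: a row-oriented real-time store plus a columnar archive."""

    def __init__(self):
        self.realtime_rows = []    # recent client data, one dict per row
        self.archive_columns = {}  # merged client data, one list per column
        self.archived_count = 0    # number of rows already in the archive

    def ingest(self, record):
        """Receive one discrete record of key-value pairs in row format."""
        self.realtime_rows.append(dict(record))

    def merge(self):
        """Move real-time rows into the columnar archive."""
        for row in self.realtime_rows:
            for key in row:
                if key not in self.archive_columns:
                    # Back-fill a newly seen column so all columns stay aligned.
                    self.archive_columns[key] = [None] * self.archived_count
            for key, column in self.archive_columns.items():
                column.append(row.get(key))
            self.archived_count += 1
        self.realtime_rows.clear()

    def query(self, column):
        """Return the values of one column from both storage systems."""
        archived = self.archive_columns.get(column, [])
        recent = [row.get(column) for row in self.realtime_rows]
        return [value for value in archived + recent if value is not None]


platform = BigDataPlatform()
platform.ingest({"user": "a", "clicks": 3})
platform.ingest({"user": "b", "clicks": 5})
platform.merge()
platform.ingest({"user": "c", "clicks": 2, "page": "/home"})
total_clicks = sum(platform.query("clicks"))
```

The query reads archived and not-yet-merged records together, so a client sees its newest data without waiting for the next merge, and the columnar layout lets a single-column query skip the other fields entirely.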
7
©2016 TechIPm,LLC All RightsReservedhttp://www.techipm.com/

More Related Content

What's hot

Data-Intensive Technologies for Cloud Computing
Data-Intensive Technologies for CloudComputingData-Intensive Technologies for CloudComputing
Data-Intensive Technologies for Cloud Computing
huda2018
 
A data aware caching 2415
A data aware caching 2415A data aware caching 2415
A data aware caching 2415
SANTOSH WAYAL
 
Google's Dremel
Google's DremelGoogle's Dremel
Google's Dremel
Maria Stylianou
 
Pig Experience
Pig ExperiencePig Experience
Map Reduce
Map ReduceMap Reduce
Map Reduce
Prashant Gupta
 
An introduction to Hadoop for large scale data analysis
An introduction to Hadoop for large scale data analysisAn introduction to Hadoop for large scale data analysis
An introduction to Hadoop for large scale data analysis
Abhijit Sharma
 
Analysing of big data using map reduce
Analysing of big data using map reduceAnalysing of big data using map reduce
Analysing of big data using map reduce
Paladion Networks
 
Big linked geospatial data tools in ExtremeEarth-phiweek19
Big linked geospatial data tools in ExtremeEarth-phiweek19Big linked geospatial data tools in ExtremeEarth-phiweek19
Big linked geospatial data tools in ExtremeEarth-phiweek19
ExtremeEarth
 
Hadoop
HadoopHadoop
Dremel: Interactive Analysis of Web-Scale Datasets
Dremel: Interactive Analysis of Web-Scale Datasets Dremel: Interactive Analysis of Web-Scale Datasets
Dremel: Interactive Analysis of Web-Scale Datasets
robertlz
 
Fundamental of Big Data with Hadoop and Hive
Fundamental of Big Data with Hadoop and HiveFundamental of Big Data with Hadoop and Hive
Fundamental of Big Data with Hadoop and Hive
Sharjeel Imtiaz
 
Dremel interactive analysis of web scale datasets
Dremel interactive analysis of web scale datasetsDremel interactive analysis of web scale datasets
Dremel interactive analysis of web scale datasets
Carl Lu
 
A sql implementation on the map reduce framework
A sql implementation on the map reduce frameworkA sql implementation on the map reduce framework
A sql implementation on the map reduce framework
eldariof
 
GEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsGEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC Programs
Tanu Malik
 
Python in an Evolving Enterprise System (PyData SV 2013)
Python in an Evolving Enterprise System (PyData SV 2013)Python in an Evolving Enterprise System (PyData SV 2013)
Python in an Evolving Enterprise System (PyData SV 2013)
PyData
 
Big data & Hadoop
Big data & HadoopBig data & Hadoop
Big data & Hadoop
Ahmed Gamil
 
Hadoop institutes-in-bangalore
Hadoop institutes-in-bangaloreHadoop institutes-in-bangalore
Hadoop institutes-in-bangalore
Kelly Technologies
 
Stacks
StacksStacks
Stacks
Acad
 
Tms training
Tms trainingTms training
Tms training
Chi Lee
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
Michel Bruley
 

What's hot (20)

Data-Intensive Technologies for Cloud Computing
Data-Intensive Technologies for CloudComputingData-Intensive Technologies for CloudComputing
Data-Intensive Technologies for Cloud Computing
 
A data aware caching 2415
A data aware caching 2415A data aware caching 2415
A data aware caching 2415
 
Google's Dremel
Google's DremelGoogle's Dremel
Google's Dremel
 
Pig Experience
Pig ExperiencePig Experience
Pig Experience
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
An introduction to Hadoop for large scale data analysis
An introduction to Hadoop for large scale data analysisAn introduction to Hadoop for large scale data analysis
An introduction to Hadoop for large scale data analysis
 
Analysing of big data using map reduce
Analysing of big data using map reduceAnalysing of big data using map reduce
Analysing of big data using map reduce
 
Big linked geospatial data tools in ExtremeEarth-phiweek19
Big linked geospatial data tools in ExtremeEarth-phiweek19Big linked geospatial data tools in ExtremeEarth-phiweek19
Big linked geospatial data tools in ExtremeEarth-phiweek19
 
Hadoop
HadoopHadoop
Hadoop
 
Dremel: Interactive Analysis of Web-Scale Datasets
Dremel: Interactive Analysis of Web-Scale Datasets Dremel: Interactive Analysis of Web-Scale Datasets
Dremel: Interactive Analysis of Web-Scale Datasets
 
Fundamental of Big Data with Hadoop and Hive
Fundamental of Big Data with Hadoop and HiveFundamental of Big Data with Hadoop and Hive
Fundamental of Big Data with Hadoop and Hive
 
Dremel interactive analysis of web scale datasets
Dremel interactive analysis of web scale datasetsDremel interactive analysis of web scale datasets
Dremel interactive analysis of web scale datasets
 
A sql implementation on the map reduce framework
A sql implementation on the map reduce frameworkA sql implementation on the map reduce framework
A sql implementation on the map reduce framework
 
GEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC ProgramsGEN: A Database Interface Generator for HPC Programs
GEN: A Database Interface Generator for HPC Programs
 
Python in an Evolving Enterprise System (PyData SV 2013)
Python in an Evolving Enterprise System (PyData SV 2013)Python in an Evolving Enterprise System (PyData SV 2013)
Python in an Evolving Enterprise System (PyData SV 2013)
 
Big data & Hadoop
Big data & HadoopBig data & Hadoop
Big data & Hadoop
 
Hadoop institutes-in-bangalore
Hadoop institutes-in-bangaloreHadoop institutes-in-bangalore
Hadoop institutes-in-bangalore
 
Stacks
StacksStacks
Stacks
 
Tms training
Tms trainingTms training
Tms training
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 

Viewers also liked

TESDA Financing Options Prof. B. Diokno
TESDA Financing Options Prof. B. DioknoTESDA Financing Options Prof. B. Diokno
TESDA Financing Options Prof. B. Diokno
Yolynne Medina
 
Honduras Powerpoint
Honduras PowerpointHonduras Powerpoint
Honduras Powerpoint
getitgotitgurr
 
Biweekly Financial Commentary 09 09 14
Biweekly Financial Commentary 09 09 14Biweekly Financial Commentary 09 09 14
Biweekly Financial Commentary 09 09 14Ant Wong
 
LTE Essential Patents Landscape 2009 Q2
LTE Essential Patents Landscape 2009 Q2LTE Essential Patents Landscape 2009 Q2
LTE Essential Patents Landscape 2009 Q2
Alex G. Lee, Ph.D. Esq. CLP
 
IoT (Internet of Things) IP (Patent)-intensive Venture Company Development St...
IoT (Internet of Things) IP (Patent)-intensive Venture Company Development St...IoT (Internet of Things) IP (Patent)-intensive Venture Company Development St...
IoT (Internet of Things) IP (Patent)-intensive Venture Company Development St...
Alex G. Lee, Ph.D. Esq. CLP
 
Prosvjed HUS-a u Zagrebu
Prosvjed HUS-a u ZagrebuProsvjed HUS-a u Zagrebu
Prosvjed HUS-a u Zagrebugrlica22
 
Internet of Things (IoT) Strategic Patent Development 1Q 2016
Internet of Things (IoT) Strategic Patent Development 1Q 2016Internet of Things (IoT) Strategic Patent Development 1Q 2016
Internet of Things (IoT) Strategic Patent Development 1Q 2016
Alex G. Lee, Ph.D. Esq. CLP
 
Contextual Shortcuts (CIKM 2007)
Contextual Shortcuts (CIKM 2007)Contextual Shortcuts (CIKM 2007)
Contextual Shortcuts (CIKM 2007)
Reiner Kraft
 
YESpay Corporate Presentation 2009
YESpay Corporate Presentation 2009YESpay Corporate Presentation 2009
YESpay Corporate Presentation 2009
guest3e40ef
 
Diana CV
Diana CVDiana CV
Diana CV
tamas diana
 
Internet of Things (IoT) Patent Prosecution OA Reject Grounds
Internet of Things (IoT) Patent Prosecution OA Reject Grounds Internet of Things (IoT) Patent Prosecution OA Reject Grounds
Internet of Things (IoT) Patent Prosecution OA Reject Grounds
Alex G. Lee, Ph.D. Esq. CLP
 
Biweekly Financial Commentary 09 09 14
Biweekly Financial Commentary 09 09 14Biweekly Financial Commentary 09 09 14
Biweekly Financial Commentary 09 09 14Ant Wong
 
бизнес возможность
бизнес возможностьбизнес возможность
бизнес возможностьguest4ab0dd
 
IoT Ambient Intelligence for Smart Living Insights from Patents
IoT Ambient Intelligence for Smart Living Insights from PatentsIoT Ambient Intelligence for Smart Living Insights from Patents
IoT Ambient Intelligence for Smart Living Insights from Patents
Alex G. Lee, Ph.D. Esq. CLP
 
LTE Patent for Standard 2010 1 Q
LTE Patent for Standard 2010 1 QLTE Patent for Standard 2010 1 Q
LTE Patent for Standard 2010 1 Q
Alex G. Lee, Ph.D. Esq. CLP
 
Biweekly Financial Commentary 08 03 24
Biweekly Financial Commentary 08 03 24Biweekly Financial Commentary 08 03 24
Biweekly Financial Commentary 08 03 24Ant Wong
 
2 Sindikalna Prava
2 Sindikalna Prava2 Sindikalna Prava
2 Sindikalna Pravagrlica22
 
LG’s LTE Standard Patents Acquisition by NPEs: Potential Litigation Risks?
LG’s LTE Standard Patents Acquisition by NPEs: Potential Litigation Risks?LG’s LTE Standard Patents Acquisition by NPEs: Potential Litigation Risks?
LG’s LTE Standard Patents Acquisition by NPEs: Potential Litigation Risks?
Alex G. Lee, Ph.D. Esq. CLP
 
Eu desisti mesmo!
Eu desisti mesmo!Eu desisti mesmo!
Eu desisti mesmo!
Luiz Carlos Dias
 

Viewers also liked (20)

TESDA Financing Options Prof. B. Diokno
TESDA Financing Options Prof. B. DioknoTESDA Financing Options Prof. B. Diokno
TESDA Financing Options Prof. B. Diokno
 
Honduras Powerpoint
Honduras PowerpointHonduras Powerpoint
Honduras Powerpoint
 
Biweekly Financial Commentary 09 09 14
Biweekly Financial Commentary 09 09 14Biweekly Financial Commentary 09 09 14
Biweekly Financial Commentary 09 09 14
 
LTE Essential Patents Landscape 2009 Q2
LTE Essential Patents Landscape 2009 Q2LTE Essential Patents Landscape 2009 Q2
LTE Essential Patents Landscape 2009 Q2
 
IoT (Internet of Things) IP (Patent)-intensive Venture Company Development St...
IoT (Internet of Things) IP (Patent)-intensive Venture Company Development St...IoT (Internet of Things) IP (Patent)-intensive Venture Company Development St...
IoT (Internet of Things) IP (Patent)-intensive Venture Company Development St...
 
Prosvjed HUS-a u Zagrebu
Prosvjed HUS-a u ZagrebuProsvjed HUS-a u Zagrebu
Prosvjed HUS-a u Zagrebu
 
Internet of Things (IoT) Strategic Patent Development 1Q 2016
Internet of Things (IoT) Strategic Patent Development 1Q 2016Internet of Things (IoT) Strategic Patent Development 1Q 2016
Internet of Things (IoT) Strategic Patent Development 1Q 2016
 
Contextual Shortcuts (CIKM 2007)
Contextual Shortcuts (CIKM 2007)Contextual Shortcuts (CIKM 2007)
Contextual Shortcuts (CIKM 2007)
 
YESpay Corporate Presentation 2009
YESpay Corporate Presentation 2009YESpay Corporate Presentation 2009
YESpay Corporate Presentation 2009
 
Diana CV
Diana CVDiana CV
Diana CV
 
Internet of Things (IoT) Patent Prosecution OA Reject Grounds
Internet of Things (IoT) Patent Prosecution OA Reject Grounds Internet of Things (IoT) Patent Prosecution OA Reject Grounds
Internet of Things (IoT) Patent Prosecution OA Reject Grounds
 
Partner Busines
Partner BusinesPartner Busines
Partner Busines
 
Biweekly Financial Commentary 09 09 14
Biweekly Financial Commentary 09 09 14Biweekly Financial Commentary 09 09 14
Biweekly Financial Commentary 09 09 14
 
бизнес возможность
бизнес возможностьбизнес возможность
бизнес возможность
 
IoT Ambient Intelligence for Smart Living Insights from Patents
IoT Ambient Intelligence for Smart Living Insights from PatentsIoT Ambient Intelligence for Smart Living Insights from Patents
IoT Ambient Intelligence for Smart Living Insights from Patents
 
LTE Patent for Standard 2010 1 Q
LTE Patent for Standard 2010 1 QLTE Patent for Standard 2010 1 Q
LTE Patent for Standard 2010 1 Q
 
Biweekly Financial Commentary 08 03 24
Biweekly Financial Commentary 08 03 24Biweekly Financial Commentary 08 03 24
Biweekly Financial Commentary 08 03 24
 
2 Sindikalna Prava
2 Sindikalna Prava2 Sindikalna Prava
2 Sindikalna Prava
 
LG’s LTE Standard Patents Acquisition by NPEs: Potential Litigation Risks?
LG’s LTE Standard Patents Acquisition by NPEs: Potential Litigation Risks?LG’s LTE Standard Patents Acquisition by NPEs: Potential Litigation Risks?
LG’s LTE Standard Patents Acquisition by NPEs: Potential Litigation Risks?
 
Eu desisti mesmo!
Eu desisti mesmo!Eu desisti mesmo!
Eu desisti mesmo!
 

Similar to Understanding Big Data Platform from Patents

Finding URL pattern with MapReduce and Apache Hadoop
Finding URL pattern with MapReduce and Apache HadoopFinding URL pattern with MapReduce and Apache Hadoop
Finding URL pattern with MapReduce and Apache Hadoop
Nushrat
 
60141457-Oracle-Golden-Gate-Presentation.ppt
60141457-Oracle-Golden-Gate-Presentation.ppt60141457-Oracle-Golden-Gate-Presentation.ppt
60141457-Oracle-Golden-Gate-Presentation.ppt
padalamail
 
IRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET - Survey Paper on Map Reduce Processing using HADOOPIRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET Journal
 
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTLARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
ijwscjournal
 
DataStage_Whitepaper
DataStage_WhitepaperDataStage_Whitepaper
DataStage_Whitepaper
Sourav Maity
 
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
Hw09   Hadoop Based Data Mining Platform For The Telecom IndustryHw09   Hadoop Based Data Mining Platform For The Telecom Industry
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
Cloudera, Inc.
 
Hadoop online-training
Hadoop online-trainingHadoop online-training
Hadoop online-training
Geohedrick
 
Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map Reduce
Urvashi Kataria
 
Map reduce advantages over parallel databases report
Map reduce advantages over parallel databases reportMap reduce advantages over parallel databases report
Map reduce advantages over parallel databases report
Ahmad El Tawil
 
Sawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsSawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data Clouds
Robert Grossman
 
Paper ijert
Paper ijertPaper ijert
Paper ijert
SANTOSH WAYAL
 
Ijircce publish this paper
Ijircce publish this paperIjircce publish this paper
Ijircce publish this paper
SANTOSH WAYAL
 
Big Data & Hadoop
Big Data & HadoopBig Data & Hadoop
Big Data & Hadoop
Krishna Sujeer
 
B017320612
B017320612B017320612
B017320612
IOSR Journals
 
Leveraging Map Reduce With Hadoop for Weather Data Analytics
Leveraging Map Reduce With Hadoop for Weather Data Analytics Leveraging Map Reduce With Hadoop for Weather Data Analytics
Leveraging Map Reduce With Hadoop for Weather Data Analytics
iosrjce
 
Big Data Testing Approach - Rohit Kharabe
Big Data Testing Approach - Rohit KharabeBig Data Testing Approach - Rohit Kharabe
Big Data Testing Approach - Rohit Kharabe
ROHIT KHARABE
 
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
IJET - International Journal of Engineering and Techniques
 
Mapreduce2008 cacm
Mapreduce2008 cacmMapreduce2008 cacm
Mapreduce2008 cacm
lmphuong06
 
Exploiting Multi Core Architectures for Process Speed Up
Exploiting Multi Core Architectures for Process Speed UpExploiting Multi Core Architectures for Process Speed Up
Exploiting Multi Core Architectures for Process Speed Up
IJERD Editor
 
A Survey on Data Mapping Strategy for data stored in the storage cloud 111
A Survey on Data Mapping Strategy for data stored in the storage cloud  111A Survey on Data Mapping Strategy for data stored in the storage cloud  111
A Survey on Data Mapping Strategy for data stored in the storage cloud 111
NavNeet KuMar
 

Similar to Understanding Big Data Platform from Patents (20)

Finding URL pattern with MapReduce and Apache Hadoop
Finding URL pattern with MapReduce and Apache HadoopFinding URL pattern with MapReduce and Apache Hadoop
Finding URL pattern with MapReduce and Apache Hadoop
 
60141457-Oracle-Golden-Gate-Presentation.ppt
60141457-Oracle-Golden-Gate-Presentation.ppt60141457-Oracle-Golden-Gate-Presentation.ppt
60141457-Oracle-Golden-Gate-Presentation.ppt
 
IRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET - Survey Paper on Map Reduce Processing using HADOOPIRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET - Survey Paper on Map Reduce Processing using HADOOP
 
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENTLARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
LARGE-SCALE DATA PROCESSING USING MAPREDUCE IN CLOUD COMPUTING ENVIRONMENT
 
DataStage_Whitepaper
DataStage_WhitepaperDataStage_Whitepaper
DataStage_Whitepaper
 
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
Hw09   Hadoop Based Data Mining Platform For The Telecom IndustryHw09   Hadoop Based Data Mining Platform For The Telecom Industry
Hw09 Hadoop Based Data Mining Platform For The Telecom Industry
 
Hadoop online-training
Hadoop online-trainingHadoop online-training
Hadoop online-training
 
Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map Reduce
 
Map reduce advantages over parallel databases report
Map reduce advantages over parallel databases reportMap reduce advantages over parallel databases report
Map reduce advantages over parallel databases report
 
Sawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data CloudsSawmill - Integrating R and Large Data Clouds
Sawmill - Integrating R and Large Data Clouds
 
Paper ijert
Paper ijertPaper ijert
Paper ijert
 
Ijircce publish this paper
Ijircce publish this paperIjircce publish this paper
Ijircce publish this paper
 
Big Data & Hadoop
Big Data & HadoopBig Data & Hadoop
Big Data & Hadoop
 
B017320612
B017320612B017320612
B017320612
 
Leveraging Map Reduce With Hadoop for Weather Data Analytics
Leveraging Map Reduce With Hadoop for Weather Data Analytics Leveraging Map Reduce With Hadoop for Weather Data Analytics
Leveraging Map Reduce With Hadoop for Weather Data Analytics
 
Big Data Testing Approach - Rohit Kharabe
Big Data Testing Approach - Rohit KharabeBig Data Testing Approach - Rohit Kharabe
Big Data Testing Approach - Rohit Kharabe
 
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
[IJET V2I2P18] Authors: Roopa G Yeklaspur, Dr.Yerriswamy.T
 
Mapreduce2008 cacm
Mapreduce2008 cacmMapreduce2008 cacm
Mapreduce2008 cacm
 
Exploiting Multi Core Architectures for Process Speed Up
Exploiting Multi Core Architectures for Process Speed UpExploiting Multi Core Architectures for Process Speed Up
Exploiting Multi Core Architectures for Process Speed Up
 
A Survey on Data Mapping Strategy for data stored in the storage cloud 111
A Survey on Data Mapping Strategy for data stored in the storage cloud  111A Survey on Data Mapping Strategy for data stored in the storage cloud  111
A Survey on Data Mapping Strategy for data stored in the storage cloud 111
 

More from Alex G. Lee, Ph.D. Esq. CLP

[Presentation] Webinar on Patent Management and Patent Asset STO in the ChatG...
[Presentation] Webinar on Patent Management and Patent Asset STO in the ChatG...[Presentation] Webinar on Patent Management and Patent Asset STO in the ChatG...
[Presentation] Webinar on Patent Management and Patent Asset STO in the ChatG...
Alex G. Lee, Ph.D. Esq. CLP
 
Metaverse x AI x Web3 x Sustainability Convergence
Metaverse x AI x  Web3 x Sustainability ConvergenceMetaverse x AI x  Web3 x Sustainability Convergence
Metaverse x AI x Web3 x Sustainability Convergence
Alex G. Lee, Ph.D. Esq. CLP
 
Tokenization, Securitization, Monetization of Real-World Assets
Tokenization, Securitization, Monetization of Real-World AssetsTokenization, Securitization, Monetization of Real-World Assets
Tokenization, Securitization, Monetization of Real-World Assets
Alex G. Lee, Ph.D. Esq. CLP
 
Maximizing Innovation through ChatGPT Powered Patent Analysis
Maximizing Innovation through ChatGPT Powered Patent AnalysisMaximizing Innovation through ChatGPT Powered Patent Analysis
Maximizing Innovation through ChatGPT Powered Patent Analysis
Alex G. Lee, Ph.D. Esq. CLP
 
Maximizing AI Business Value Creation Utilizing Patents
Maximizing AI Business Value Creation Utilizing PatentsMaximizing AI Business Value Creation Utilizing Patents
Maximizing AI Business Value Creation Utilizing Patents
Alex G. Lee, Ph.D. Esq. CLP
 
Real-World Assets STO + Institutional DeFi Integration
Real-World Assets STO + Institutional DeFi IntegrationReal-World Assets STO + Institutional DeFi Integration
Real-World Assets STO + Institutional DeFi Integration
Alex G. Lee, Ph.D. Esq. CLP
 
Metaverse x Web3 Interoperability Overview
Metaverse x Web3 Interoperability OverviewMetaverse x Web3 Interoperability Overview
Metaverse x Web3 Interoperability Overview
Alex G. Lee, Ph.D. Esq. CLP
 
AI for Metaverse x Web3 Overview
AI for Metaverse x Web3 OverviewAI for Metaverse x Web3 Overview
AI for Metaverse x Web3 Overview
Alex G. Lee, Ph.D. Esq. CLP
 
NFT Web3 Metaverse Global Leaders Roundtable
NFT Web3 Metaverse Global Leaders RoundtableNFT Web3 Metaverse Global Leaders Roundtable
NFT Web3 Metaverse Global Leaders Roundtable
Alex G. Lee, Ph.D. Esq. CLP
 
Fame Universe Introduction
Fame Universe IntroductionFame Universe Introduction
Fame Universe Introduction
Alex G. Lee, Ph.D. Esq. CLP
 
Metaverse Fashion Overview
Metaverse Fashion OverviewMetaverse Fashion Overview
Metaverse Fashion Overview
Alex G. Lee, Ph.D. Esq. CLP
 
Global Metaverse Fashion Innovators Roadshow
Global Metaverse Fashion Innovators RoadshowGlobal Metaverse Fashion Innovators Roadshow
Global Metaverse Fashion Innovators Roadshow
Alex G. Lee, Ph.D. Esq. CLP
 
NFT Financialization Overview
NFT Financialization OverviewNFT Financialization Overview
NFT Financialization Overview
Alex G. Lee, Ph.D. Esq. CLP
 
Metaverse & Web3 Technology Innovation & Business Development
Metaverse & Web3 Technology Innovation & Business DevelopmentMetaverse & Web3 Technology Innovation & Business Development
Metaverse & Web3 Technology Innovation & Business Development
Alex G. Lee, Ph.D. Esq. CLP
 
NFT Monetization Innovation Webinar
NFT Monetization Innovation WebinarNFT Monetization Innovation Webinar
NFT Monetization Innovation Webinar
Alex G. Lee, Ph.D. Esq. CLP
 
웹3.0기반 메타버스 응용을 위한 NFT 가치개발과 가치평가 특강
웹3.0기반 메타버스 응용을 위한 NFT 가치개발과 가치평가 특강웹3.0기반 메타버스 응용을 위한 NFT 가치개발과 가치평가 특강
웹3.0기반 메타버스 응용을 위한 NFT 가치개발과 가치평가 특강
Alex G. Lee, Ph.D. Esq. CLP
 
NFT for Web3 Based Metaverse Monetization Webinar.pdf
NFT for Web3 Based Metaverse Monetization Webinar.pdfNFT for Web3 Based Metaverse Monetization Webinar.pdf
NFT for Web3 Based Metaverse Monetization Webinar.pdf
Alex G. Lee, Ph.D. Esq. CLP
 
FAME UNIVERSE Fashion NFT Monetization Platform Introduction
FAME UNIVERSE Fashion NFT Monetization Platform IntroductionFAME UNIVERSE Fashion NFT Monetization Platform Introduction
FAME UNIVERSE Fashion NFT Monetization Platform Introduction
Alex G. Lee, Ph.D. Esq. CLP
 
NAVIGATING THE METAVERSE (Wiley) One Page Book Summary
NAVIGATING THE METAVERSE (Wiley)  One Page Book SummaryNAVIGATING THE METAVERSE (Wiley)  One Page Book Summary
NAVIGATING THE METAVERSE (Wiley) One Page Book Summary
Alex G. Lee, Ph.D. Esq. CLP
 
FAME Universe Introduction
FAME Universe IntroductionFAME Universe Introduction
FAME Universe Introduction
Alex G. Lee, Ph.D. Esq. CLP
 

More from Alex G. Lee, Ph.D. Esq. CLP (20)

[Presentation] Webinar on Patent Management and Patent Asset STO in the ChatG...
[Presentation] Webinar on Patent Management and Patent Asset STO in the ChatG...[Presentation] Webinar on Patent Management and Patent Asset STO in the ChatG...
[Presentation] Webinar on Patent Management and Patent Asset STO in the ChatG...
 
Metaverse x AI x Web3 x Sustainability Convergence
Metaverse x AI x  Web3 x Sustainability ConvergenceMetaverse x AI x  Web3 x Sustainability Convergence
Metaverse x AI x Web3 x Sustainability Convergence
 
Tokenization, Securitization, Monetization of Real-World Assets
Tokenization, Securitization, Monetization of Real-World AssetsTokenization, Securitization, Monetization of Real-World Assets
Tokenization, Securitization, Monetization of Real-World Assets
 
Maximizing Innovation through ChatGPT Powered Patent Analysis
Maximizing Innovation through ChatGPT Powered Patent AnalysisMaximizing Innovation through ChatGPT Powered Patent Analysis
Maximizing Innovation through ChatGPT Powered Patent Analysis
 



Understanding Big Data Platform from Patents

  • 1. ©2016 TechIPm, LLC. All Rights Reserved. http://www.techipm.com/
    Understanding Big Data Platform from Patents
    The Hadoop big data platform is based on the MapReduce framework. Google patent US7650331, titled “System and method for efficient large-scale data processing,” described the MapReduce framework for the first time. Its claim 1 clearly explains the function of the Map and Reduce process as follows.
    A system for large-scale processing of data, comprising:
    a plurality of processes executing on a plurality of interconnected processors; (Distributed Processor (File) System)
    the plurality of processes including a master process, for coordinating a data processing job for processing a set of input data, and worker processes; (JobTracker)
    the master process, in response to a request to perform the data processing job, assigning input data blocks of the set of input data to respective ones of the worker processes; (<key, value> pairs)
    each of a first plurality of the worker processes including an application-independent map module for retrieving a respective input data block assigned to the worker process by the master process and applying an application-specific map operation to the respective input data block to produce intermediate data values, wherein at least a subset of the intermediate data values each comprises a key/value pair, and wherein at least two of the first plurality of the worker processes operate simultaneously so as to perform the application-specific map operation in parallel on distinct, respective input data blocks; (Map Step)
  • 2. a partition operator for processing the produced intermediate data values to produce a plurality of intermediate data sets, wherein each respective intermediate data set includes all key/value pairs for a distinct set of respective keys, and wherein at least one of the respective intermediate data sets includes respective ones of the key/value pairs produced by a plurality of the first plurality of the worker processes; and (Intermediate Step)
    each of a second plurality of the worker processes including an application-independent reduce module for retrieving data, the retrieved data comprising at least a subset of the key/value pairs from a respective intermediate data set of the plurality of intermediate data sets and applying an application-specific reduce operation to the retrieved data to produce final output data corresponding to the distinct set of respective keys in the respective intermediate data set of the plurality of intermediate data sets, and wherein at least two of the second plurality of the worker processes operate simultaneously so as to perform the application-specific reduce operation in parallel on multiple respective subsets of the produced intermediate data values. (Reduce Step)
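The three annotated steps of claim 1 (Map Step, Intermediate Step, Reduce Step) can be illustrated with a small word count. This is only a minimal single-process sketch, not the patented system or Hadoop's implementation: the function names (`map_words`, `partition`, `reduce_counts`, `run_job`), the toy hash, and the sample input are all assumptions made for illustration, and each loop iteration merely stands in for a worker process that would run in parallel.

```python
from collections import defaultdict


def map_words(block):
    """Application-specific map operation (hypothetical): emit (word, 1) per word."""
    return [(word.lower(), 1) for word in block.split()]


def partition(pairs, num_partitions):
    """Partition operator: all pairs sharing a key land in the same intermediate data set."""
    datasets = [defaultdict(list) for _ in range(num_partitions)]
    for key, value in pairs:
        # Deterministic toy hash, chosen only so the sketch is reproducible.
        index = sum(key.encode("utf-8")) % num_partitions
        datasets[index][key].append(value)
    return datasets


def reduce_counts(key, values):
    """Application-specific reduce operation (hypothetical): sum the values of one key."""
    return key, sum(values)


def run_job(blocks, num_partitions=2):
    """Stand-in for the master process: hands blocks to map workers, then data sets to reduce workers."""
    intermediate = []
    for block in blocks:  # each iteration plays the role of one map worker
        intermediate.extend(map_words(block))

    output = {}
    for dataset in partition(intermediate, num_partitions):  # one reduce worker per data set
        for key, values in dataset.items():
            reduced_key, total = reduce_counts(key, values)
            output[reduced_key] = total
    return output


result = run_job(["big data platform", "big data patents", "data"])
print(result)
```

Because the partition step sends every occurrence of a key to one intermediate data set, each reduce worker can finish its keys without consulting any other worker, which is what allows the reduce operations to run in parallel.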
  • 4. MapR’s patent application US20110313973, titled “Map-Reduce Ready Distributed File System,” further developed the MapReduce framework, including a shuffle function that uses the distributed file system (DFS), in its claim 1 as follows.
    A map-reduce compatible shuffle function, comprising: a distributed file system; and a map-reduce system, wherein each map function writes to the distributed file system and each reduce function reads input from the distributed file system.
    The shuffle step redistributes the intermediate data produced by the map step so that all data belonging to one key is located on the same worker node.
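The shuffle described above, in which map functions write to a shared file system and reduce functions read their input from it, can be sketched as follows. This is a hedged illustration rather than MapR's design: a local temporary directory stands in for the distributed file system, and the file naming scheme, `partition_of`, `map_task`, `reduce_task`, and `run_shuffle` are hypothetical names introduced only for this example.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def partition_of(key, num_reducers):
    """Toy deterministic hash deciding which reducer owns a key (assumption)."""
    return sum(key.encode("utf-8")) % num_reducers


def map_task(task_id, block, dfs_root, num_reducers):
    """Map function: writes one file per reducer partition into the shared directory."""
    buckets = defaultdict(list)
    for word in block.split():
        buckets[partition_of(word, num_reducers)].append([word, 1])
    for part, pairs in buckets.items():
        path = Path(dfs_root) / f"map{task_id}-part{part}.json"
        path.write_text(json.dumps(pairs))


def reduce_task(part, dfs_root):
    """Reduce function: reads every map output file written for its partition."""
    totals = defaultdict(int)
    for path in sorted(Path(dfs_root).glob(f"map*-part{part}.json")):
        for key, value in json.loads(path.read_text()):
            totals[key] += value
    return dict(totals)


def run_shuffle(blocks, num_reducers=2):
    # The temporary directory is a local stand-in for the distributed file system.
    with tempfile.TemporaryDirectory() as dfs_root:
        for task_id, block in enumerate(blocks):
            map_task(task_id, block, dfs_root, num_reducers)
        return [reduce_task(part, dfs_root) for part in range(num_reducers)]


reducer_outputs = run_shuffle(["map reduce shuffle", "shuffle map map"])
print(reducer_outputs)
```

In this sketch the file system is the only channel between the two phases: a reducer never contacts a mapper directly, and every key ends up in exactly one reducer's output.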
  • 6. Hadoop is a platform designed for large datasets (datasets measured in terabytes, petabytes, or more) that leverages the MapReduce framework. However, many existing websites and applications are built on systems that differ greatly from those that can take advantage of large quantities of data. To take advantage of the Hadoop big data platform, such systems have to be re-engineered for it.
    Treasure Data patent application US20130124483, titled “System and method for operating a big-data platform,” illustrates a system for integrating existing websites and applications with the big-data platform without the re-engineering process.
    The system for operating a big data platform includes: a data analysis platform that receives discrete client data (formatted as a plurality of key-value pairs in row format); a network-accessible distributed storage system (hosted on a distributed cloud storage system such as Amazon's S3/EC2) that stores the client data in a real-time storage system and merges the client data into a columnar-based distributed archive storage system (using MapReduce); and a query interface that receives a data query request and selectively interfaces with the client data from the real-time storage system and the archive storage system according to the query.
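The data flow described above (row-format ingestion, a merge into columnar archive storage, and a query interface that reads from both) can be sketched in a few lines. This is an in-memory toy under stated assumptions, not the system in the patent application: the class `TinyBigDataStore` and its methods are hypothetical, the merge is a plain loop rather than a MapReduce job, and no cloud storage is involved.

```python
from collections import defaultdict


class TinyBigDataStore:
    """Hypothetical in-memory stand-in for the real-time and archive storage systems."""

    def __init__(self):
        self.realtime_rows = []                   # row format: one dict of key-value pairs per record
        self.archive_columns = defaultdict(list)  # columnar format: one list of values per key
        self.archive_size = 0                     # number of records already archived

    def ingest(self, record):
        """Receive discrete client data as key-value pairs and keep it in row format."""
        self.realtime_rows.append(dict(record))

    def merge(self):
        """Move real-time rows into the columnar archive (a plain loop here, not MapReduce)."""
        for row in self.realtime_rows:
            for key in set(self.archive_columns) | set(row):
                column = self.archive_columns[key]
                # Pad columns for keys first seen now so every column stays aligned by record.
                column.extend([None] * (self.archive_size - len(column)))
                column.append(row.get(key))
            self.archive_size += 1
        self.realtime_rows.clear()

    def query_sum(self, key):
        """Query interface: combine archived columns with rows not yet merged."""
        archived = sum(v for v in self.archive_columns.get(key, []) if v is not None)
        recent = sum(row[key] for row in self.realtime_rows if key in row)
        return archived + recent


store = TinyBigDataStore()
store.ingest({"user": "a", "clicks": 3})
store.ingest({"user": "b", "clicks": 5})
store.merge()                                     # first two records now live in columnar form
store.ingest({"user": "a", "clicks": 2, "country": "US"})
total_clicks = store.query_sum("clicks")          # spans archive and real-time storage
print(total_clicks)
```

The client only ever sends key-value records and issues queries; where a record currently lives is hidden behind the query interface, which is the property that lets existing applications use the platform without being re-engineered.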