Big Data Hadoop – Hands On Workshop
Data Processing Solutions – Comparison Guide
Big Data Workshop Series
Danairat T.
Figure: Core Hadoop vs. Traditional Data Warehouse. Core Hadoop (cloud): Data Inputs to Results in 2 steps. Traditional data warehouse: 6 numbered layers (Data Source, Data Staging, Data Warehouse, Data Mart, Cube, Analytic Results) between Data Inputs and Results.
Big Data Hadoop
Solution 1. Core Hadoop processing
No data staging transformation and no data movement required.
Figure: Excel Inputs to Analytic Results in 2 steps (cloud ready).
Top Benefits
1. Architecture roadmap ready for cloud and IoT
2. No data duplication, which reduces data storage cost
3. Fast data processing, with built-in fault tolerance for all processing
4. Aligns with a unified data architecture and data governance
5. Fewer data processing steps than a traditional DWH
Effort Investment:
1. Learn core Hadoop
Solution 2. Using BI Tools to analyze Hadoop data
A single transformation in Hadoop is required for BI tools, because BI tools have no built-in POI parser (MS Office connector) that works over the HDFS protocol.
Figure: Excel Inputs and Hadoop HDFS (CSV raw text), 3 steps (cloud ready).
Top Benefits
1. Lower cost, with a cloud- and IoT-ready architecture
2. Fast data processing, with built-in fault tolerance for all processing
3. Fewer data processing steps than a traditional DWH
Effort Investment:
1. Learn Hadoop
2. Transform the data to raw text for BI tools
Solution 3. Creating a data warehouse in Hadoop
A single transformation is required, plus a DWH set up on Hadoop for BI tools.
Figure: Excel Inputs, Hadoop HDFS and Hadoop DWH (Hive, or Cassandra, HBase), 4 steps (cloud ready).
Top Benefits
1. Lower cost, with a cloud- and IoT-ready architecture
2. Fast data processing, with built-in fault tolerance for all processing
3. Fewer data processing steps than a traditional DWH
Effort Investment:
1. Learn core Hadoop
2. Transform the data to raw text for BI tools
3. Set up a DWH on Hadoop (Hive, Cassandra, HBase)
Solution 4. Implementing a traditional data warehouse
Figure: traditional data warehouse with 6 numbered layers (Data Source, Data Staging, Data Warehouse, Data Mart, Cube, Analytic Results). Caption: the more the data grows, the slower the data processing.
Top Concerns with the Traditional Data Warehouse Architecture
1. Heavy data duplication drives up data storage cost
2. Very slow data processing, and a job must be restarted or rolled back if any step fails
3. Data security risk from keeping too many copies of the data in various formats
Benefits Comparison Summary
Benefit criteria:
A. Cloud-ready architecture
B. Built-in parallel processing
C. IoT architecture roadmap
D. Without DB cube investment
E. Without data mart investment
F. Without DWH investment
G. Without staging data (raw text)
H. Unstructured and raw source content processing

Solution | A | B | C | D | E | F | G | H
1. Core Hadoop | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
2. Hadoop and Pentaho/Power BI | Yes | Yes | Yes | Yes | Yes | Yes | No (requires CSV) | No (requires CSV)
3. Hadoop and Cognos, RapidMiner, BO, Tableau | Yes | Yes | Yes | Yes | Yes | No (requires Hive connector) | No (requires Hive connector) | No (requires Hive connector)
4. Traditional Data Warehouse | No | No | No | No | No | No | No | No
Appendix
Pentaho supports Big Data Inputs
Power BI supports Big Data Inputs
Tableau supports Big Data Inputs
RapidMiner supports Big Data Inputs
Hadoop Cluster Installation and Excel Parser Processing
Clone Hadoop master to slave1 and slave2
At master node: edit the hosts file
At master node: copy the key file to slave1 and slave2
$ scp /home/ubuntu/.ssh/id_dsa.pub ip-172-31-1-8:/home/ubuntu/.ssh/master.pub
$ scp /home/ubuntu/.ssh/id_dsa.pub 172.31.15.16:/home/ubuntu/.ssh/master.pub
After this slide, we will use 3 cascaded windows to represent the master node, slave1 node and slave2 node.
At slave1 and slave2: append the master key to the authorized keys
$ cat /home/ubuntu/.ssh/master.pub >> /home/ubuntu/.ssh/authorized_keys
At master: test ssh to slave1 and slave2
$ ssh ip-172-31-1-8
$ exit
$ ssh ip-172-31-15-16
$ exit
At master: add slave1 and slave2 to the Hadoop slaves file
At master: edit hdfs-site.xml for 2 replication servers
At all nodes: remove the namenode and datanode directories
At master: format the namenode
At master: execute start-dfs.sh
At slave1: check the jps result; you will see that DataNode has started
At slave2: check the jps result; you will see that DataNode has started
At master: execute start-yarn.sh
At slave1: check the jps result; you will see that NodeManager has started
At slave2: check the jps result; you will see that NodeManager has started
Importing data into HDFS Cluster
At master: import data into HDFS
At slave1: review the imported data in HDFS
At slave2: review the imported data in HDFS
Running MapReduce in Cluster Mode
At master: execute the YARN MapReduce program
At slave1, slave2: you will see the Application Master and the YARN child container
At master: review the output file in HDFS
At slave1, slave2: review the output file in HDFS using the command:
hdfs dfs -cat /outputs/wordcount_output_dir01/part-r-00000
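The output path above (wordcount_output_dir01) suggests that the job is a word count. As a rough illustration of what such a job computes, here is a minimal single-process sketch in plain Java. It deliberately does not use the Hadoop API, and the names WordCountSketch, map, reduce and run are hypothetical. The last method imitates the usual text output of one word and its count per line, separated by a tab.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a word-count job; plain Java, no Hadoop classes.
class WordCountSketch {

    // "Map" phase: split one input line into words, one entry per occurrence.
    static List<String> map(String line) {
        List<String> words = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    // "Reduce" phase: sum the occurrences of each word; TreeMap keeps keys sorted.
    static Map<String, Integer> reduce(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Run both phases over all lines and format the result as "word<TAB>count" lines.
    static String run(List<String> lines) {
        List<String> allWords = new ArrayList<>();
        for (String line : lines) {
            allWords.addAll(map(line));
        }
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Integer> e : reduce(allWords).entrySet()) {
            out.append(e.getKey()).append('\t').append(e.getValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(run(Arrays.asList("big data hadoop", "hadoop data", "hadoop")));
    }
}
```

On a real cluster the map and reduce phases run in separate containers on the slave nodes, and the framework groups the words by key between the two phases; the sketch only mirrors the arithmetic.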
At master: review the output result data from the web console
Process Excel Worksheet
1. Create Java Class using POI Libs
2. Traverse Data in Excel Spreadsheet
Workbook workbook = new XSSFWorkbook(inputStream);
Sheet firstSheet = workbook.getSheetAt(0);   // first worksheet in the workbook
Iterator<Row> iterator = firstSheet.iterator();
while (iterator.hasNext()) {
    Row nextRow = iterator.next();
    Iterator<Cell> cellIterator = nextRow.cellIterator();
    while (cellIterator.hasNext()) {
        Cell cell = cellIterator.next();
        // handle the cell here (step 3)
    }
}
3. Extract Data from Excel Spreadsheet
// Cell type constants as in POI 3.x; newer POI releases use the CellType enum instead.
switch (cell.getCellType()) {
    case Cell.CELL_TYPE_STRING:
        System.out.print(cell.getStringCellValue());
        break;
    case Cell.CELL_TYPE_BOOLEAN:
        System.out.print(cell.getBooleanCellValue());
        break;
    case Cell.CELL_TYPE_NUMERIC:
        System.out.print(cell.getNumericCellValue());
        break;
}
For further integration into HDFS, emit the data to the output collector.
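One way to read the note above: rather than printing each cell, join a row's values into one delimited text record and hand it to the job's output collector. The sketch below is plain Java under stated assumptions. RecordCollector is a hypothetical stand-in for a collector and is not the Hadoop interface; ExcelRowEmitter, toCsvRecord and emitRow are illustrative names; keying each record by its row number is also an assumption. Cell values are passed in as strings, which the POI calls from step 3 would supply.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a MapReduce output collector; NOT the Hadoop interface.
interface RecordCollector {
    void collect(String key, String value);
}

class ExcelRowEmitter {

    // Join one row's cell values into a CSV record, quoting values that need it.
    static String toCsvRecord(List<String> cellValues) {
        StringBuilder record = new StringBuilder();
        for (int i = 0; i < cellValues.size(); i++) {
            if (i > 0) {
                record.append(',');
            }
            String value = cellValues.get(i);
            if (value.contains(",") || value.contains("\"") || value.contains("\n")) {
                // Double any embedded quotes and wrap the value in quotes.
                value = "\"" + value.replace("\"", "\"\"") + "\"";
            }
            record.append(value);
        }
        return record.toString();
    }

    // Emit the row keyed by its row number (an assumed key choice).
    static void emitRow(int rowNumber, List<String> cellValues, RecordCollector out) {
        out.collect(String.valueOf(rowNumber), toCsvRecord(cellValues));
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        // Collector that only remembers what it was given, for demonstration.
        RecordCollector collector = (key, value) -> emitted.add(key + "\t" + value);
        emitRow(0, Arrays.asList("id", "name", "note"), collector);
        emitRow(1, Arrays.asList("1", "Hadoop", "fast, fault tolerant"), collector);
        emitted.forEach(System.out::println);
    }
}
```

Writing the records as CSV text matches the comparison above, where the BI tools in Solution 2 require CSV input.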
4. Close Excel Spreadsheet
workbook.close();
inputStream.close();
Excel Processing Results in Hadoop
Stopping Hadoop Cluster
At master: execute stop-yarn.sh
At slave1: use jps to confirm that NodeManager has stopped
At slave2: use jps to confirm that NodeManager has stopped
At master: execute stop-dfs.sh
At slave1: use jps to confirm that DataNode has stopped
At slave2: use jps to confirm that DataNode has stopped
Thank you very much

FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 

Big Data Hadoop analytics and data warehouse comparison guide

  • 1. Big Data Hadoop – Hands On Workshop: Data Processing Solutions – Comparison Guide. Big Data Workshop Series. Danairat T.
      [Diagram: Core Hadoop (2 steps: data inputs, results; cloud) vs. Traditional Data Warehouse (6 layers: data source, data staging, data warehouse, data mart, cube, analytic results)]
  • 2. Big Data Hadoop Solution 1. Core Hadoop processing: NO data staging transformation and NO data move required!
      Top Benefits: 1. Cloud- and IoT-ready architecture roadmap 2. No data duplication, which reduces data storage cost 3. Fast data processing, with built-in fault tolerance for all processing 4. Aligns with a unified data architecture and data governance 5. Fewer data processing steps than a traditional DWH
      The Effort Investment: 1. Learn core Hadoop
      [Diagram: Excel inputs to analytic results in 2 steps; labels: Cloud Ready, Examples]
  • 3. Big Data Hadoop Solution 2. Using BI tools to analyze Hadoop data: requires a single transformation in Hadoop, since BI tools have no built-in POI parser (MS Office connector) over the HDFS protocol.
      Top Benefits: 1. Lower cost with a cloud/IoT-ready architecture 2. Fast data processing, with built-in fault tolerance for all processing 3. Fewer data processing steps than a traditional DWH
      The Effort Investment: 1. Learn Hadoop 2. Requires transformation to raw text for BI tools
      [Diagram: 3 steps, including Excel inputs and Hadoop HDFS (CSV raw text); labels: Cloud Ready, Examples]
  • 4. Big Data Hadoop Solution 3. Creating a data warehouse in Hadoop: requires a single transformation, with a DWH set up on Hadoop for BI tools.
      Top Benefits: 1. Lower cost with a cloud/IoT-ready architecture 2. Fast data processing, with built-in fault tolerance for all processing 3. Fewer data processing steps than a traditional DWH
      The Effort Investment: 1. Learn core Hadoop 2. Requires transformation to raw text for BI tools 3. Requires a DWH set up on Hadoop (Hive, Cassandra, HBase)
      [Diagram: 4 steps, including Excel inputs, Hadoop HDFS, and Hadoop DWH (Hive, or Cassandra, HBase); labels: Cloud Ready, Examples]
  • 5. Big Data Hadoop Solution 4. Implementing a traditional data warehouse: the more the data grows, the slower the data processing.
      Top Concerns from Traditional Data Warehouse Architecture: 1. Extensive data duplication drives up data storage cost 2. Very slow data processing, and a job must be restarted or rolled back if any step fails 3. Data security issues from keeping too many copies of the data in various formats
      [Diagram: 6 layers: data source, data staging (several staging areas), data warehouse, data mart (several marts), cube (several cubes), analytic results]
  • 6. Big Data Hadoop Benefits Comparison Summary
      Benefits criteria: (A) Cloud-ready architecture; (B) Built-in parallel processing; (C) IoT architecture roadmap; (D) Without DB cube investment; (E) Without data mart investment; (F) Without DWH investment; (G) Without staging data (raw text); (H) Unstructured and raw source content processing
      1. Core Hadoop: Yes for A to H
      2. Hadoop and Pentaho/Power BI: Yes for A to F; No for G and H (require CSV)
      3. Hadoop and Cognos, RapidMiner, BO, Tableau: Yes for A to E; No for F, G and H (require Hive connector)
      4. Traditional Data Warehouse: No for A to H
  • 8. Big Data Hadoop Pentaho supports Big Data Inputs
  • 9. Big Data Hadoop Power BI supports Big Data Inputs
  • 10. Big Data Hadoop Tableau supports Big Data Inputs
  • 11. Big Data Hadoop RapidMiner supports Big Data Inputs
  • 12. Big Data Hadoop Hadoop Cluster Installation and Excel Parser Processing
  • 13. Big Data Hadoop Clone the Hadoop master to slave1 and slave2 [Diagram: master, slave1, slave2]
  • 14. Big Data Hadoop At master node: Edit host file
  • 15. Big Data Hadoop At master node: Copy key file to slave1 and slave2
      scp /home/ubuntu/.ssh/id_dsa.pub ip-172-31-1-8:/home/ubuntu/.ssh/master.pub
      scp /home/ubuntu/.ssh/id_dsa.pub 172.31.15.16:/home/ubuntu/.ssh/master.pub
  • 16. Big Data Hadoop After this slide, we will use 3 cascaded windows to represent the master node, slave1 node and slave2 node
  • 17. Big Data Hadoop At slave1 and slave2: cat /home/ubuntu/.ssh/master.pub >> /home/ubuntu/.ssh/authorized_keys
  • 18. Big Data Hadoop At master: Test ssh to slave1 and slave2
      $ ssh ip-172-31-1-8
      $ exit
      $ ssh ip-172-31-15-16
      $ exit
  • 19. Big Data Hadoop At master: add slave1 and slave2 to Hadoop slave file
  • 20. Big Data Hadoop At master: add slave1 and slave2 to Hadoop slave file
  • 21. Big Data Hadoop At master: edit hdfs-site.xml
  • 22. Big Data Hadoop At master: edit hdfs-site.xml for 2 replication servers
  • 23. Big Data Hadoop At all nodes: remove directories of namenode and datanode
  • 24. Big Data Hadoop At master: format namenode
  • 25. Big Data Hadoop At master: format namenode
  • 26. Big Data Hadoop At master: Execute start-dfs.sh
  • 27. Big Data Hadoop At slave1: Check the jps result; DataNode should be listed as started
  • 28. Big Data Hadoop At slave2: Check the jps result; DataNode should be listed as started
  • 29. Big Data Hadoop At master: Execute start-yarn.sh
  • 30. Big Data Hadoop At slave1: Check the jps result; NodeManager should be listed as started
  • 31. Big Data Hadoop At slave2: Check the jps result; NodeManager should be listed as started
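The installation steps above edit hdfs-site.xml "for 2 replication servers", but the file contents appear only in the slide screenshots. The fragment below is a minimal sketch of what such an edit might look like, assuming the standard HDFS property dfs.replication is the one being set; any other properties in the workshop's actual file are not shown here.

```xml
<configuration>
  <!-- Assumption: keep two copies of each HDFS block, one per DataNode (slave1 and slave2). -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
```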
  • 32. Big Data Hadoop Importing data into HDFS Cluster
  • 33. Big Data Hadoop At master: import data to hdfs
  • 34. Big Data Hadoop At slave1: review imported result data from hdfs
  • 35. Big Data Hadoop At slave2: review imported result data from hdfs
  • 36. Big Data Hadoop Running MapReduce in Cluster Mode
  • 37. Big Data Hadoop At master: execute YARN mapreduce program
  • 38. Big Data Hadoop At slave1, slave2: you will see Application Master and Yarn Child Container
  • 39. Big Data Hadoop At master: review output file from hdfs
  • 40. Big Data Hadoop At master: review output file from hdfs
  • 41. Big Data Hadoop At slave1, slave2: review output file from hdfs by using command:- hdfs dfs -cat /outputs/wordcount_output_dir01/part-r-00000
  • 42. Big Data Hadoop At master: review output result data from web console
  • 43. Big Data Hadoop At master: review output result data from web console
  • 44. Big Data Hadoop At master: review output result data from web console
  • 45. Big Data Hadoop At master: review output result data from web console
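The MapReduce run above writes a word-count result to /outputs/wordcount_output_dir01/part-r-00000. As a rough illustration of what such a job computes, here is a plain-Java sketch that does not use the Hadoop API at all. It assumes whitespace tokenization and case-sensitive words; the class and method names (WordCountSketch, mapLine, countWords) are hypothetical and are not taken from the workshop's program.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration of word counting; this is not Hadoop code. */
final class WordCountSketch {

    /** "Map" step: split one input line into words on whitespace. */
    static List<String> mapLine(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return List.of();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    /** "Reduce" step: sum the occurrences of each word, sorted by word. */
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : mapLine(line)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("big data hadoop", "hadoop data");
        // Print one "word<TAB>count" pair per line.
        countWords(lines).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

On a real cluster the map and reduce steps run in separate containers on the slave nodes, and the framework groups the words between the two steps; the sketch collapses all of that into a single in-memory map.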
  • 46. Big Data Hadoop Process Excel Worksheet
  • 47. Big Data Hadoop 1. Create Java Class using POI Libs
  • 48. Big Data Hadoop 2. Traverse Data in Excel Spreadsheet
      Workbook workbook = new XSSFWorkbook(inputStream);
      Sheet firstSheet = workbook.getSheetAt(0);
      Iterator<Row> iterator = firstSheet.iterator();
      while (iterator.hasNext()) {
          Row nextRow = iterator.next();
          Iterator<Cell> cellIterator = nextRow.cellIterator();
          while (cellIterator.hasNext()) {
              Cell cell = cellIterator.next();
  • 49. Big Data Hadoop 3. Extract Data from Excel Spreadsheet
      // The Cell.CELL_TYPE_* constants come from older Apache POI releases; newer releases use the CellType enum.
      switch (cell.getCellType()) {
          case Cell.CELL_TYPE_STRING:
              System.out.print(cell.getStringCellValue());
              break;
          case Cell.CELL_TYPE_BOOLEAN:
              System.out.print(cell.getBooleanCellValue());
              break;
          case Cell.CELL_TYPE_NUMERIC:
              System.out.print(cell.getNumericCellValue());
              break;
      }
      For further integration into HDFS, emit the data to the output collector.
  • 50. Big Data Hadoop 4. Close Excel Spreadsheet
      workbook.close();
      inputStream.close();
  • 51. Big Data Hadoop Excel Processing Results in Hadoop
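The Excel steps above print each cell value and suggest emitting the data to an output collector, while Solution 2 calls for CSV raw text that BI tools can read. The sketch below shows one way the extracted values of a row might be turned into a CSV line before being emitted. It uses only the Java standard library and assumes the cell values have already been read into plain Java objects (String, Double, Boolean); CsvRowSketch, escape, and toCsvLine are hypothetical names, not part of Apache POI or Hadoop.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Turns the values of one spreadsheet row into a single CSV line. */
final class CsvRowSketch {

    /** Quotes a field when it contains a comma, a double quote, or a line break. */
    static String escape(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (needsQuotes) {
            // Double any embedded quotes, then wrap the whole field in quotes.
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    /** Joins the row's values with commas; a null (empty cell) becomes an empty field. */
    static String toCsvLine(List<Object> cells) {
        return cells.stream()
                .map(cell -> cell == null ? "" : escape(String.valueOf(cell)))
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Hypothetical row: one string, one numeric, and one boolean cell.
        System.out.println(toCsvLine(List.of("Bangkok, TH", 42.5, true)));
    }
}
```

In a MapReduce job, the string returned by toCsvLine would be the value written to the output collector, so each spreadsheet row ends up as one text line in HDFS.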
  • 52. Big Data Hadoop Stopping Hadoop Cluster
  • 53. Big Data Hadoop At master: execute stop-yarn.sh
  • 54. Big Data Hadoop At slave1: use jps to confirm that NodeManager has stopped
  • 55. Big Data Hadoop At slave2: use jps to confirm that NodeManager has stopped
  • 56. Big Data Hadoop At master: execute stop-dfs.sh
  • 57. Big Data Hadoop At slave1: use jps to confirm that DataNode has stopped
  • 58. Big Data Hadoop At slave2: use jps to confirm that DataNode has stopped
  • 59. Big Data Hadoop Thank you very much