Enterprise data science learning solution 
A practical approach to big data learning 
CloneSkills, Inc. 
(916)-296-0228 
Learn to lead big data - Enterprise data science 
a practical approach 
CloneSkills, Inc. 
http://www.CloneSkills.com 
Architect : Karthik Rajamanickam
Objective
• Introduce the key components that are typically used to deliver enterprise data science
• Demonstrate the steps to move data between Oracle 12C and HADOOP using Sqoop
• Review data flow between SAP HANA and HADOOP using Smart Data Access
What’s involved in building enterprise data science? 
Our Enterprise Data Science Platform 
Platform: HADOOP Distribution | SAP HANA | Oracle 12C
Data sources: Social | Forum | Blog | Web | File | Text
Analytics
Our enterprise data science platform components - Our lab (CSLAB)
Enterprise Components
• SAP HANA
• SAP BOBJ
• Oracle 12C
• Oracle ODI
HADOOP Components
• HDFS
• HBase
• Hive
• Impala
• Pig
• Search
• Shell
• MapReduce
• Sqoop
• Oozie
• ZooKeeper
• Hue
• Dashboard
• Editor
Our enterprise data science platform technical components 
Our on-demand lab infrastructure (CSLAB)
__________________________________
• SAP HANA
• SAP BOBJ
• Oracle 12C
• Oracle ODI
• HADOOP
Our three-node HADOOP cluster
Our enterprise data science platform - HADOOP infrastructure 
Our HADOOP core components
________________
• Hive
• Impala
• Pig
• Search
• HBase
• Shell
• MapReduce
• Sqoop
• Hue
• HDFS
• Oozie
• ZooKeeper
Our enterprise data science platform - HADOOP components 
Our HADOOP core components
________________
• Hive
• Impala
• Pig
• Search
• HBase
• Shell
Our enterprise data science platform - Hue components 
Our Oracle 12C infrastructure
Our enterprise data science platform - Oracle 
Our Oracle ODI (Oracle Data Integrator) infrastructure
Our enterprise data science platform - Oracle Data Integrator (ODI)
Our enterprise data science platform – SAP HANA 
SAP HANA
_______________
Smart Data Access connects SAP HANA and HADOOP
Our enterprise data science platform - SAP HANA and HADOOP integration 
SAP HANA
_______________
Smart Data Access connects SAP HANA and HADOOP
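As a rough sketch of what Smart Data Access involves on the SAP HANA side: a remote source is registered for the HADOOP (Hive) system, and a virtual table is then created over a Hive table. The helper names, source name, DSN, credentials, and schema/table names below are placeholders that do not come from the lab, and the adapter name and path layout are assumptions that depend on the HANA revision and ODBC driver in use.

```python
def remote_source_sql(source: str, dsn: str, user: str, password: str) -> str:
    """Build a CREATE REMOTE SOURCE statement for a Hive ODBC data source.

    Assumes the "hiveodbc" adapter and an ODBC DSN already configured on the
    HANA host; both are assumptions, not details taken from the slides.
    """
    return (
        f'CREATE REMOTE SOURCE "{source}" ADAPTER "hiveodbc" '
        f"CONFIGURATION 'DSN={dsn}' "
        f"WITH CREDENTIAL TYPE 'PASSWORD' "
        f"USING 'user={user};password={password}'"
    )


def virtual_table_sql(schema: str, vtable: str, source: str,
                      hive_db: str, hive_table: str) -> str:
    """Build a CREATE VIRTUAL TABLE statement pointing at a Hive table.

    The "HIVE" path segment is an assumed remote database name.
    """
    return (
        f'CREATE VIRTUAL TABLE "{schema}"."{vtable}" '
        f'AT "{source}"."HIVE"."{hive_db}"."{hive_table}"'
    )


if __name__ == "__main__":
    # All values here are hypothetical placeholders.
    print(remote_source_sql("HADOOP_SRC", "hive_dsn", "hive_user", "secret"))
    print(virtual_table_sql("LAB", "V_EMPLOYEE", "HADOOP_SRC",
                            "default", "employee"))
```

Once a virtual table exists, it can be queried from SAP HANA like a local table while the data stays in HADOOP.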
Steps to move data between Oracle and HADOOP using Sqoop 
Import: Oracle 12C -> Sqoop -> HADOOP Distribution
Export: HADOOP Distribution -> Sqoop -> Oracle 12C
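The two directions can be sketched with the Sqoop command-line client, assuming it is available in the lab (the slides may well use a graphical job editor instead). The helper name, JDBC URL, and HDFS directory are hypothetical; only the table name EMPLOYEE_JP comes from the slides.

```python
# Each direction maps to a Sqoop tool and the flag naming its HDFS directory.
DIRECTIONS = {
    "import": ("import", "--target-dir"),   # Oracle -> HDFS
    "export": ("export", "--export-dir"),   # HDFS -> Oracle
}


def sqoop_command(direction: str, jdbc_url: str, table: str, hdfs_dir: str) -> list:
    """Return the argument list for a basic Sqoop import or export."""
    tool, dir_flag = DIRECTIONS[direction]
    return ["sqoop", tool, "--connect", jdbc_url,
            "--table", table, dir_flag, hdfs_dir]


if __name__ == "__main__":
    url = "jdbc:oracle:thin:@//dbhost:1521/orcl"   # hypothetical host/service
    for direction in ("import", "export"):
        print(" ".join(sqoop_command(direction, url, "EMPLOYEE_JP",
                                     "/user/lab/employee_jp")))
```

Both directions share the same connection and table arguments; only the tool name and the directory flag change.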
Oracle table and its data
Review Oracle table – EMPLOYEE_JP 
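One way to review the source table without leaving the HADOOP side is Sqoop's eval tool, which runs a SQL statement against the database and prints the result. This is only a sketch: the helper names, connection URL, user name, and password-file path are hypothetical, and the slides may instead review the table in an Oracle client.

```python
def preview_query(table: str, limit: int = 5) -> str:
    """Oracle query returning at most `limit` rows from `table`."""
    return f"SELECT * FROM {table} WHERE ROWNUM <= {limit}"


def sqoop_eval_args(jdbc_url: str, user: str, password_file: str,
                    table: str) -> list:
    """Argument list for `sqoop eval` that previews the given table."""
    return ["sqoop", "eval",
            "--connect", jdbc_url,
            "--username", user,
            "--password-file", password_file,
            "--query", preview_query(table)]


if __name__ == "__main__":
    # Hypothetical connection details; EMPLOYEE_JP is the table from the slide.
    print(sqoop_eval_args("jdbc:oracle:thin:@//dbhost:1521/orcl",
                          "lab_user", "/user/lab/.oracle_pw", "EMPLOYEE_JP"))
```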
Sqoop Job 
Sqoop job creation
Sqoop Job 
____________ 
Create connection to Oracle
Sqoop job creation - Create connection to Oracle 
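The connection step comes down to a JDBC URL plus credentials. A minimal sketch, assuming the Oracle thin driver with a service name; the helper names, host, port, service name, and user below are placeholders, not the lab's actual settings. Listing tables is a quick way to confirm that the connection works.

```python
def oracle_jdbc_url(host: str, port: int, service_name: str) -> str:
    """Oracle thin-driver JDBC URL using the service-name form."""
    return f"jdbc:oracle:thin:@//{host}:{port}/{service_name}"


def list_tables_args(jdbc_url: str, user: str) -> list:
    """Argument list for `sqoop list-tables`; -P prompts for the password."""
    return ["sqoop", "list-tables", "--connect", jdbc_url,
            "--username", user, "-P"]


if __name__ == "__main__":
    url = oracle_jdbc_url("dbhost", 1521, "orcl")   # hypothetical values
    print(url)
    print(" ".join(list_tables_args(url, "lab_user")))
```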
Sqoop Job 
____________ 
Oracle source table details
Sqoop job creation - Configure source table 
Sqoop job creation - Configure source table and the primary key of the table 
Sqoop Job 
____________ 
Oracle source table and column details
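On the command line, the source table, its columns, and the key used to partition the extract correspond to the `--table`, `--columns`, and `--split-by` options. The sketch below is illustrative only: the helper name is made up, the column names are invented because the slides do not list the columns of EMPLOYEE_JP, and using the primary key as the split column is an assumption.

```python
def source_args(table: str, columns: list, split_by: str) -> list:
    """Sqoop import options describing the source table.

    `split_by` is the column Sqoop uses to divide rows among mappers,
    usually the primary key; it must be one of the selected columns.
    """
    if split_by not in columns:
        raise ValueError(f"{split_by!r} is not among the selected columns")
    return ["--table", table,
            "--columns", ",".join(columns),
            "--split-by", split_by]


if __name__ == "__main__":
    # Column names are hypothetical.
    print(source_args("EMPLOYEE_JP", ["EMP_ID", "EMP_NAME", "DEPT_ID"], "EMP_ID"))
```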
Sqoop job creation - Configure data target, HDFS files (output files)
Sqoop Job
____________
Destination in HADOOP (HDFS output files)
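For the target side, a command-line import would name the HDFS output directory and, optionally, the number of parallel mappers and the field delimiter of the text files. This is a sketch with a hypothetical helper name and hypothetical directory; the defaults shown are choices made for illustration, not the lab's settings.

```python
def target_args(target_dir: str, num_mappers: int = 1,
                delimiter: str = ",") -> list:
    """Sqoop import options describing the HDFS destination."""
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    return ["--target-dir", target_dir,
            "--num-mappers", str(num_mappers),
            "--fields-terminated-by", delimiter]


if __name__ == "__main__":
    # The directory is a placeholder; one output file is written per mapper.
    print(target_args("/user/lab/employee_jp", num_mappers=2))
```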
Sqoop Job 
____________ 
Job extraction log 
Run Sqoop job - review job log 
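When reviewing the job log, the line of most interest is usually the record count reported at the end of an import. A small sketch that pulls that number out of captured log text; the helper name is made up, and the sample line is illustrative only and is not copied from the lab's log.

```python
import re
from typing import Optional

# Sqoop imports typically end with a line such as "Retrieved N records."
_RETRIEVED = re.compile(r"Retrieved (\d+) records")


def records_retrieved(log_text: str) -> Optional[int]:
    """Return the record count found in the log, or None if absent."""
    match = _RETRIEVED.search(log_text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    sample = "INFO mapreduce.ImportJobBase: Retrieved 3 records."  # illustrative
    print(records_retrieved(sample))
```

Comparing this count with a `SELECT COUNT(*)` on the Oracle table is a simple completeness check.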
Sqoop job output - HDFS output file, destination files 
Sqoop Job 
____________ 
HDFS destination files
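A table import normally runs as a map-only job, so the destination directory typically holds one `part-m-NNNNN` file per mapper plus a `_SUCCESS` marker. The sketch below builds the listing command and picks the data files out of a list of names; the helper names, directory, and file list are hypothetical.

```python
import re

_PART_FILE = re.compile(r"^part-m-\d{5}$")


def hdfs_ls_args(target_dir: str) -> list:
    """Argument list for listing the import's output directory."""
    return ["hdfs", "dfs", "-ls", target_dir]


def data_files(names: list) -> list:
    """Keep only mapper output files, sorted by name."""
    return sorted(n for n in names if _PART_FILE.match(n))


def has_success_marker(names: list) -> bool:
    """True when the job wrote its _SUCCESS marker file."""
    return "_SUCCESS" in names


if __name__ == "__main__":
    listing = ["_SUCCESS", "part-m-00001", "part-m-00000"]  # hypothetical
    print(hdfs_ls_args("/user/lab/employee_jp"))
    print(data_files(listing), has_success_marker(listing))
```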
Sqoop job output - Oracle data in HADOOP HDFS files 
Sqoop Job 
____________ 
Oracle data in HADOOP - preview
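The imported files are plain delimited text, one line per Oracle row, so a preview amounts to reading the first lines and splitting on the delimiter. A minimal sketch, assuming comma-delimited output; the helper name, column names, and sample rows are invented for illustration and are not the contents of EMPLOYEE_JP.

```python
import csv
import io


def parse_rows(text: str, columns: list, delimiter: str = ",") -> list:
    """Turn delimited text lines into one dict per row, keyed by column name."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [dict(zip(columns, row)) for row in reader if row]


if __name__ == "__main__":
    # Hypothetical content of a part-m-00000 file.
    sample = "1,Alice,10\n2,Bob,20\n"
    for row in parse_rows(sample, ["EMP_ID", "EMP_NAME", "DEPT_ID"]):
        print(row)
```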
Sqoop job output - Data has been moved from Oracle to HADOOP 
Sqoop Job 
____________ 
Data has been imported from Oracle to HADOOP
Sqoop Job
____________
We can also export data from HADOOP and then load it into Oracle
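The reverse direction can be sketched with Sqoop's export tool, which reads delimited files from an HDFS directory and inserts the rows into an existing Oracle table. The helper name, connection details, and directory below are hypothetical, and the target table is assumed to exist already with matching columns.

```python
def sqoop_export_args(jdbc_url: str, user: str, password_file: str,
                      table: str, export_dir: str,
                      delimiter: str = ",") -> list:
    """Argument list for exporting HDFS text files into an Oracle table."""
    return ["sqoop", "export",
            "--connect", jdbc_url,
            "--username", user,
            "--password-file", password_file,
            "--table", table,                       # must already exist
            "--export-dir", export_dir,             # HDFS directory to read
            "--input-fields-terminated-by", delimiter]


if __name__ == "__main__":
    print(" ".join(sqoop_export_args(
        "jdbc:oracle:thin:@//dbhost:1521/orcl",     # hypothetical
        "lab_user", "/user/lab/.oracle_pw",
        "EMPLOYEE_JP", "/user/lab/employee_jp")))
```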
Our enterprise data science use case
Stay tuned, more to come 
Thank you!

Enterprise data science - What it takes to build?
