Vision
To become a leading consulting and training provider in the fields of Data Analytics, Machine Learning, and Big Data in India and overseas.
Mission
To create value for our customers by providing consulting services, and to impart high-quality training and skill enhancement programs for employability.
About Us
Emerging India is promoted by professionals from IITs and IIMs, MBAs, and experts from the education and IT industries. We are one of India's fastest-growing analytics/IT consulting and training companies. We offer services in both the consulting and training domains, including NASSCOM-certified professional programs (designed to bridge the gap between academia and industry) and Data Analytics/Cyber Security/IoT/Robotics/AI/Blockchain consulting solutions. We are also a proud NASSCOM member and a NASSCOM SSC Licensed Training Partner for the northern region of India. As a NASSCOM licensed training partner, Emerging India is taking NASSCOM SSC initiatives to the next level in the field of Data Analytics to enhance the technical skills of students and working professionals.
Speakers:
Mrs. Rakhi Singh, Delivery Head (NASSCOM Certified Trainer)
Mr. Mayank Jain, Big Data Developer and Analyst
Mr. Kapil Sharma, Center Head cum Trainer (certified by North-western University)
What is HDFS
• HDFS stands for Hadoop Distributed File System
• Built on top of the local file system of each node (Ext3/Ext4)
• Designed to store large amounts of data reliably and efficiently
• Aims for high data availability through block replication (High Availability cluster)
• Does not permit in-place updates (files are written once and read many times)
• Built for OLAP, not for OLTP
Architecture of HDFS
[Figure: HDFS sits on top of the Ext3/Ext4 file system. Master: Name Node and Secondary Name Node. Slave: Data Nodes (twelve shown).]
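The master/slave split in this architecture can be illustrated with a small sketch. It is not Hadoop code: the 64 MB block size, the replication factor of 3, the round-robin placement, and the names `split_into_blocks` and `place_blocks` are all assumptions chosen for illustration, and real HDFS placement is configurable and rack-aware.

```python
BLOCK_SIZE_MB = 64   # assumed block size, for illustration only
REPLICATION = 3      # assumed replication factor


def split_into_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the sizes (in MB) of the blocks a file would be split into."""
    full, rest = divmod(file_size_mb, block_size_mb)
    return [block_size_mb] * full + ([rest] if rest else [])


def place_blocks(num_blocks, data_nodes, replication=REPLICATION):
    """Toy Name Node metadata: block index -> Data Nodes holding a replica.

    Uses simple round-robin placement (hypothetical, not HDFS's real policy).
    """
    placement = {}
    for b in range(num_blocks):
        placement[b] = [data_nodes[(b + r) % len(data_nodes)]
                        for r in range(replication)]
    return placement


data_nodes = [f"datanode{i}" for i in range(1, 13)]  # twelve Data Nodes
blocks = split_into_blocks(100)                       # a 100 MB file
block_map = place_blocks(len(blocks), data_nodes)
print(blocks)
print(block_map)
```

In this sketch the Name Node role is reduced to the `block_map` dictionary (metadata only), while the Data Nodes are just names that would hold the actual block contents.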
MR Operation
[Figure: Client1 holds a.txt (100 MB) and code1; the label a.txt appears twice more in the diagram.]
MR Framework
Mapper <Key_In, Val_In, Key_Out, Val_Out>    Reducer <Key_In, Val_In, Key_Out, Val_Out>
Stages: Input, Partitioner, Output
Input (a.txt): Hi Hello Hi Hadoop World Hive Hadoop Hive Hi Hello World
block1 (Mapper1 input): 1,Hi  2,Hello  3,Hi  4,Hadoop  5,World
block2 (Mapper2 input): 1,Hive  2,Hadoop  3,Hive  4,Hi  5,Hello  6,World
Mapper output sent to Reducer1: Hi,1  Hello,1  Hi,1  Hadoop,1  World,1  Hive,1  Hadoop,1  Hive,1  Hi,1  Hello,1  World,1
MR Framework (continued)
Reducer1 input: Hi,1  Hello,1  Hi,1  Hadoop,1  World,1  Hive,1  Hadoop,1  Hive,1  Hi,1  Hello,1  World,1
Grouped by key: Hi,<1,1,1>  Hello,<1,1>  Hadoop,<1,1>  World,<1,1>  Hive,<1,1>
Reducer1 output: Hi,3  Hello,2  Hadoop,2  World,2  Hive,2
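The word-count flow on these two slides can be sketched in plain Python. This is only a single-process illustration of the map, group-by-key, and reduce steps, not the Hadoop MapReduce API; the function names `mapper`, `shuffle`, and `reducer` are hypothetical, and the two blocks are taken from the slide's example.

```python
from collections import defaultdict

# The two input blocks of a.txt, one word per line, as on the slide.
block1 = ["Hi", "Hello", "Hi", "Hadoop", "World"]
block2 = ["Hive", "Hadoop", "Hive", "Hi", "Hello", "World"]


def mapper(block):
    """Map phase: read (line_number, word) records and emit (word, 1) pairs."""
    for _line_no, word in enumerate(block, start=1):
        yield word, 1


def shuffle(pairs):
    """Group values by key, e.g. "Hi" -> [1, 1, 1]."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reducer(key, values):
    """Reduce phase: sum the counts collected for one key."""
    return key, sum(values)


mapped = list(mapper(block1)) + list(mapper(block2))   # Mapper1 + Mapper2
grouped = shuffle(mapped)
counts = dict(reducer(k, v) for k, v in grouped.items())
print(counts)
# {'Hi': 3, 'Hello': 2, 'Hadoop': 2, 'World': 2, 'Hive': 2}
```

Each mapper sees only its own block, which is why the line numbers restart at 1 for block2; the grouping step is what brings all values for a key together before the reducer runs.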
HIVE
[Figure: The client submits SQL to the Driver. The Compiler/Translator converts the SQL to MapReduce, and the Execution Engine (MapReduce) runs it. The Metastore is backed by Derby.]
Hive
• Database
• Table types: Internal (managed table) and External
• Internal (default): a drop operation deletes both the data and the metadata
• External: a drop operation deletes only the metadata
• Optimization techniques: partitioning and bucketing
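To make the two optimization techniques concrete, here is a rough Python sketch of how rows might be laid out. It assumes a hypothetical table partitioned by `country` and bucketed by an integer `id` into 4 buckets, and it uses a plain modulo as the bucketing function; Hive's actual hashing, directory names, and file names are not reproduced here.

```python
from collections import defaultdict

NUM_BUCKETS = 4  # assumed bucket count for the hypothetical table

# Hypothetical rows: (id, country)
rows = [(1, "IN"), (2, "US"), (5, "IN"), (6, "IN"), (9, "US")]


def bucket_for(key, num_buckets=NUM_BUCKETS):
    """Simplified bucketing: integer key modulo the bucket count."""
    return key % num_buckets


def layout(rows):
    """Map each row id to a (partition directory, bucket file) location."""
    files = defaultdict(list)
    for row_id, country in rows:
        partition = f"country={country}"            # one directory per value
        bucket = f"bucket_{bucket_for(row_id)}"     # fixed number of files
        files[(partition, bucket)].append(row_id)
    return dict(files)


files = layout(rows)
for location, ids in sorted(files.items()):
    print(location, ids)
```

A query that filters on `country` only needs to read one partition directory, and a lookup or join on `id` only needs the matching bucket, which is the intuition behind both techniques.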
Introduction to HBase
HBase is a NoSQL, non-relational, distributed, column-oriented database built on top of Hadoop.
NoSQL: NoSQL databases are databases that do not use a SQL engine as their query engine.
HBase Daemons
Daemons are services that run on individual machines and communicate with each other.
HMaster: master server of HBase; holds all metadata.
HRegionServer: slave server of HBase; holds the actual data.
HQuorumPeer: ZooKeeper daemon for the coordination service.
Advantages of using HBase
Provides a highly scalable database with native Hadoop integration.
Nodes can be added on the fly.
HBase (LSM Tree)
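The slide itself carries only a diagram, so the following is a minimal sketch of the general log-structured merge (LSM) idea rather than HBase's implementation. Writes go to an in-memory table, which is flushed to sorted, immutable segments (loosely analogous to HBase's MemStore and HFiles), and compaction merges the segments. The class `TinyLSM` and its `memtable_limit` parameter are hypothetical.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: an in-memory memtable plus sorted, immutable segments."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.segments = []                 # oldest first, newest last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as a sorted, immutable segment."""
        if self.memtable:
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        """Check the memtable first, then segments from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            keys = [k for k, _ in segment]
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return segment[i][1]
        return None

    def compact(self):
        """Merge all segments into one, keeping the newest value per key."""
        merged = {}
        for segment in self.segments:      # later segments overwrite earlier
            merged.update(segment)
        self.segments = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
db.put("row1", "a")
db.put("row2", "b")      # memtable is full, so it is flushed to a segment
db.put("row1", "c")      # newer value for row1 stays in the memtable
db.put("row3", "d")      # second flush
db.compact()
print(db.get("row1"), db.get("row2"), db.get("row3"))
```

Because segments are never modified in place, an update is just a newer entry that shadows the older one until compaction discards the stale version.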
Normalization vs Denormalization
HBase Data Model
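As the slide shows only a diagram, here is a hedged sketch of the commonly described HBase data model as a nested map: row key, then column family, then column qualifier, then timestamped versions of a value. The helper names `new_table`, `put`, and `get`, and the sample `users` data are hypothetical and are not the HBase client API.

```python
from collections import defaultdict


def new_table():
    """row key -> column family -> qualifier -> {timestamp: value}"""
    return defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))


def put(table, row, family, qualifier, value, ts):
    """Store one versioned cell."""
    table[row][family][qualifier][ts] = value


def get(table, row, family, qualifier):
    """Return the value with the highest timestamp, or None if absent."""
    versions = table.get(row, {}).get(family, {}).get(qualifier, {})
    if not versions:
        return None
    return versions[max(versions)]


users = new_table()
put(users, "user1", "info", "city", "Noida", ts=1)
put(users, "user1", "info", "city", "Delhi", ts=2)   # newer version
put(users, "user1", "contact", "email", "u1@example.com", ts=1)

print(get(users, "user1", "info", "city"))
print(get(users, "user2", "info", "city"))
```

Rows do not need to share the same qualifiers, which is why the model is often described as a sparse, multi-dimensional sorted map.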
Introduction to Spark
Key Features
• RDD (Resilient Distributed Dataset)
• DAG (Directed Acyclic Graph)
• DataFrame
• Lazy evaluation
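Lazy evaluation is the least self-explanatory item in this list, so here is a small pure-Python sketch of the idea. It is not the Spark API: `LazyDataset` is a hypothetical stand-in for an RDD that merely records transformations (`map`, `filter`) and only executes them when an action (`collect`) is called.

```python
class LazyDataset:
    """Toy stand-in for an RDD: records transformations, runs them on an action."""

    def __init__(self, source, ops=()):
        self._source = source
        self._ops = tuple(ops)       # recorded chain of transformations

    def map(self, fn):
        """Transformation: returns a new dataset, does no work yet."""
        return LazyDataset(self._source, self._ops + (("map", fn),))

    def filter(self, pred):
        """Transformation: returns a new dataset, does no work yet."""
        return LazyDataset(self._source, self._ops + (("filter", pred),))

    def collect(self):
        """Action: only now is the recorded chain executed."""
        data = iter(self._source)
        for kind, fn in self._ops:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return list(data)


calls = []


def double(x):
    calls.append(x)                  # lets us observe when work happens
    return x * 2


ds = LazyDataset(range(5)).map(double).filter(lambda x: x > 4)
calls_before_action = list(calls)    # still empty: nothing has run yet
result = ds.collect()
print(calls_before_action, result)
```

Deferring execution until an action is what lets an engine see the whole chain of transformations first and plan it as a graph before doing any work.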
Questions & Feedback
Our Location
H-196, 304, IInd Floor, Sector 63, Noida – 201301
Our Phone
+91 120-4169097
+91 8860599698
Email / Website
info@emergingindiagroup.com
https://www.emergingindiagroup.com
Get in Touch with Us
We would be glad to hear from you!

Big data technologies by Emerging India Analytics
