SlideShare a Scribd company logo
FURTHERMORE:
Big Data Hadoop Certification Training
BIG DATA HADOOP
CHEAT SHEETBig Data
Comprises of large datasets that cannot be processed using traditional computing
techniques, which includes huge volumes, high velocity and extensible variety of data.
Hadoop
An Apache open source framework written in JAVA which allows distributed processing
of large datasets across clusters of computers using simple programming models.
Hadoop Common
These are the JAVA libraries and utilities required by other Hadoop modules which
contains the necessary scripts and files required to start Hadoop
Hadoop YARN
A framework used for job scheduling and managing the cluster resources
Hadoop Distributed File System
A Java based file system that provides scalable and reliable data storage and it provides
high throughput access to the application data
Hadoop MapReduce
A software framework, which is used for writing the applications easily which process
big amount of data in parallel on large clusters
Apache hive
An infrastructure for data warehousing for Hadoop
Apache oozie
An application in Java responsible for scheduling Hadoop jobs
Apache Pig
A data flow platform that is responsible for the execution of the MapReduce jobs
Apache Spark
An open source framework used for cluster computing
Flume
An open source aggregation service responsible for collection and transport of data
from source to destination
Hbase
A column-oriented database of Hadoop that stores big data in a scalable way
Sqoop
An interface application that is used to transfer data between Hadoop and relational
database through commands
Hadoop Distributed File System
Map-Reduce
HBASE
Database with Real-
Time Access
PIG Hive Sqoop
ETL BI Reporting RDBMSTop Level
Interfaces
Top Level
Abstractions
Distributed
Data
Processing
At the base is a Self-
healing clustered
storage system
Different components of Hadoop Architecture involves:
Hadoop File Automation Commands
Commands Task Syntax
cat
Used to copy the source path to the
destination or the standard output
hdfsdfs –cat URI [URI- – -]
chgrp Used to change the group of the files hdfsdfs –chgrp [-R] GROUP URI [URI—]
chmod Used to change the permissions of the file
hdfsdfs –chmod [-R] <MODE[,MODE]- – -:
OCTALMODE> URI [URI – – -]
chown Used to change the owner of the file
hdfsdfs –chown [-
R][OWNER][:{GROUP]]URI[URI]
count Used to count the number of directories hdfs dfs –count [-q] <paths>
cp
Used to copy one or more than one files from
the source to destination path
hdfsdfs –cp URI[URI – – -]<dest>
Du Used to display the size of directories or files hdfsdfs –du [-s][-h]URI [URI – – -]
get Used to copy files to the local file system hdfs dfs –get[-ignorecrc][-crc]<src><localdst>
ls
Used to display the statistics of any file or
directory
hdfsdfs –ls <args>
mkdir Used to create one or more directories hdfsdfs –mkdir<path>
mv
Used to move one or more files from one
location to other
hdfs dfs –mv URI[URI – – -]<dest>
put Used to read from one file system to other hdfsdfs –put<localsrc>- – -<dest>
rm Used to delete one or more than one files hdfsdfs –rmr[-skipTrash]URI[URI- – – ]
stat
Used to display the information of any specific
path
hdfsdfs –stat URI[URI – – -]
help
Used to display the usage information of the
command
help<cmd-name>
Commands Task
Balancer To run cluster balancing utility
daemonlog To get or set the log level of each daemon
dfsadmin To run many HDFS administrative operations
Datanode To run HDFS datanode service
mradmin
To run a number of mapReduce administrative
operations
Jobtracker To run mapReduce job tracker
Namenode To run name node
Tasktracker To run mapReduce task tracker node
Secondary namenode To run secondary namenode
Hadoop Administration Commands
• Learn from industry experts and be sought-after by the industry!
• Learn any technology, show exemplary skills and have an
unmatched career!
• The most trending technology courses to help you fast-track your
career!
• Logical modules for both beginners and mid-level learners
• All recorded sessions available in LMS for lifetime
• 24*7 Support for Lifetime
• Learn Anytime, Anywhere

More Related Content

What's hot

Cloudera Hadoop Distribution
Cloudera Hadoop DistributionCloudera Hadoop Distribution
Cloudera Hadoop Distribution
Thisara Pramuditha
 
InfluxDb
InfluxDbInfluxDb
InfluxDb
Guamaral Vasil
 
Building Data Quality Audit Framework using Delta Lake at Cerner
Building Data Quality Audit Framework using Delta Lake at CernerBuilding Data Quality Audit Framework using Delta Lake at Cerner
Building Data Quality Audit Framework using Delta Lake at Cerner
Databricks
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
James Serra
 
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Kai Wähner
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
Databricks
 
ELT vs. ETL - How they’re different and why it matters
ELT vs. ETL - How they’re different and why it mattersELT vs. ETL - How they’re different and why it matters
ELT vs. ETL - How they’re different and why it matters
Matillion
 
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and PrometheusRobust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Manasi Vartak
 
Apache Ranger
Apache RangerApache Ranger
Apache Ranger
Rommel Garcia
 
[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf
[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf
[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf
Chris Hoyean Song
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
Databricks
 
I Love APIs 2015 : Zero to Thousands TPS Private Cloud Operations Workshop
I Love APIs 2015 : Zero to Thousands TPS Private Cloud Operations WorkshopI Love APIs 2015 : Zero to Thousands TPS Private Cloud Operations Workshop
I Love APIs 2015 : Zero to Thousands TPS Private Cloud Operations Workshop
Apigee | Google Cloud
 
Schemaless Databases
Schemaless DatabasesSchemaless Databases
Schemaless Databases
Dan Gunter
 
MLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLMLOps - The Assembly Line of ML
MLOps - The Assembly Line of ML
Jordan Birdsell
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
Ghulam Imaduddin
 
Big data ppt
Big data pptBig data ppt
Big data ppt
Deepika ParthaSarathy
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
ScyllaDB
 
Data ops in practice
Data ops in practiceData ops in practice
Data ops in practice
Lars Albertsson
 
IoT:what about data storage?
IoT:what about data storage?IoT:what about data storage?
IoT:what about data storage?
DataWorks Summit/Hadoop Summit
 
BIG DATA-Seminar Report
BIG DATA-Seminar ReportBIG DATA-Seminar Report
BIG DATA-Seminar Report
josnapv
 

What's hot (20)

Cloudera Hadoop Distribution
Cloudera Hadoop DistributionCloudera Hadoop Distribution
Cloudera Hadoop Distribution
 
InfluxDb
InfluxDbInfluxDb
InfluxDb
 
Building Data Quality Audit Framework using Delta Lake at Cerner
Building Data Quality Audit Framework using Delta Lake at CernerBuilding Data Quality Audit Framework using Delta Lake at Cerner
Building Data Quality Audit Framework using Delta Lake at Cerner
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
ELT vs. ETL - How they’re different and why it matters
ELT vs. ETL - How they’re different and why it mattersELT vs. ETL - How they’re different and why it matters
ELT vs. ETL - How they’re different and why it matters
 
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and PrometheusRobust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
Robust MLOps with Open-Source: ModelDB, Docker, Jenkins, and Prometheus
 
Apache Ranger
Apache RangerApache Ranger
Apache Ranger
 
[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf
[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf
[EN] Building modern data pipeline with Snowflake + DBT + Airflow.pdf
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
I Love APIs 2015 : Zero to Thousands TPS Private Cloud Operations Workshop
I Love APIs 2015 : Zero to Thousands TPS Private Cloud Operations WorkshopI Love APIs 2015 : Zero to Thousands TPS Private Cloud Operations Workshop
I Love APIs 2015 : Zero to Thousands TPS Private Cloud Operations Workshop
 
Schemaless Databases
Schemaless DatabasesSchemaless Databases
Schemaless Databases
 
MLOps - The Assembly Line of ML
MLOps - The Assembly Line of MLMLOps - The Assembly Line of ML
MLOps - The Assembly Line of ML
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
 
Data ops in practice
Data ops in practiceData ops in practice
Data ops in practice
 
IoT:what about data storage?
IoT:what about data storage?IoT:what about data storage?
IoT:what about data storage?
 
BIG DATA-Seminar Report
BIG DATA-Seminar ReportBIG DATA-Seminar Report
BIG DATA-Seminar Report
 

Similar to Big data-cheat-sheet

MapReduce1.pptx
MapReduce1.pptxMapReduce1.pptx
MapReduce1.pptx
ashimashahi1
 
BIG DATA: Apache Hadoop
BIG DATA: Apache HadoopBIG DATA: Apache Hadoop
BIG DATA: Apache Hadoop
Oleksiy Krotov
 
Topic 9a-Hadoop Storage- HDFS.pptx
Topic 9a-Hadoop Storage- HDFS.pptxTopic 9a-Hadoop Storage- HDFS.pptx
Topic 9a-Hadoop Storage- HDFS.pptx
DanishMahmood23
 
Data analysis on hadoop
Data analysis on hadoopData analysis on hadoop
Data analysis on hadoop
Frank Y
 
Hadoop vs Apache Spark
Hadoop vs Apache SparkHadoop vs Apache Spark
Hadoop vs Apache Spark
ALTEN Calsoft Labs
 
Managing Big data with Hadoop
Managing Big data with HadoopManaging Big data with Hadoop
Managing Big data with Hadoop
Nalini Mehta
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
Chirag Ahuja
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
rebeccatho
 
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.pptHADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
ManiMaran230751
 
Hadoop, Map Reduce and Apache Pig tutorial
Hadoop, Map Reduce and Apache Pig tutorialHadoop, Map Reduce and Apache Pig tutorial
Hadoop, Map Reduce and Apache Pig tutorial
Pranamesh Chakraborty
 
Bd class 2 complete
Bd class 2 completeBd class 2 complete
Bd class 2 complete
JigsawAcademy2014
 
Design and Research of Hadoop Distributed Cluster Based on Raspberry
Design and Research of Hadoop Distributed Cluster Based on RaspberryDesign and Research of Hadoop Distributed Cluster Based on Raspberry
Design and Research of Hadoop Distributed Cluster Based on Raspberry
IJRESJOURNAL
 
Big Data and Hadoop Guide
Big Data and Hadoop GuideBig Data and Hadoop Guide
Big Data and Hadoop Guide
Simplilearn
 
Introduction to HDFS and MapReduce
Introduction to HDFS and MapReduceIntroduction to HDFS and MapReduce
Introduction to HDFS and MapReduce
Uday Vakalapudi
 
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science Bon Secours...
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science  Bon Secours...Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science  Bon Secours...
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science Bon Secours...
AyeeshaParveen
 
Hadoop
HadoopHadoop
Hadoop
avnishagr
 
Introduction to HDFS and MapReduce
Introduction to HDFS and MapReduceIntroduction to HDFS and MapReduce
Introduction to HDFS and MapReduce
Derek Chen
 
Cppt Hadoop
Cppt HadoopCppt Hadoop
Cppt Hadoop
chunkypandey12
 

Similar to Big data-cheat-sheet (20)

MapReduce1.pptx
MapReduce1.pptxMapReduce1.pptx
MapReduce1.pptx
 
BIG DATA: Apache Hadoop
BIG DATA: Apache HadoopBIG DATA: Apache Hadoop
BIG DATA: Apache Hadoop
 
Hadoop_arunam_ppt
Hadoop_arunam_pptHadoop_arunam_ppt
Hadoop_arunam_ppt
 
Topic 9a-Hadoop Storage- HDFS.pptx
Topic 9a-Hadoop Storage- HDFS.pptxTopic 9a-Hadoop Storage- HDFS.pptx
Topic 9a-Hadoop Storage- HDFS.pptx
 
Data analysis on hadoop
Data analysis on hadoopData analysis on hadoop
Data analysis on hadoop
 
Hadoop vs Apache Spark
Hadoop vs Apache SparkHadoop vs Apache Spark
Hadoop vs Apache Spark
 
Managing Big data with Hadoop
Managing Big data with HadoopManaging Big data with Hadoop
Managing Big data with Hadoop
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
Unit 1
Unit 1Unit 1
Unit 1
 
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.pptHADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
HADOOP AND MAPREDUCE ARCHITECTURE-Unit-5.ppt
 
Hadoop, Map Reduce and Apache Pig tutorial
Hadoop, Map Reduce and Apache Pig tutorialHadoop, Map Reduce and Apache Pig tutorial
Hadoop, Map Reduce and Apache Pig tutorial
 
Bd class 2 complete
Bd class 2 completeBd class 2 complete
Bd class 2 complete
 
Design and Research of Hadoop Distributed Cluster Based on Raspberry
Design and Research of Hadoop Distributed Cluster Based on RaspberryDesign and Research of Hadoop Distributed Cluster Based on Raspberry
Design and Research of Hadoop Distributed Cluster Based on Raspberry
 
Big Data and Hadoop Guide
Big Data and Hadoop GuideBig Data and Hadoop Guide
Big Data and Hadoop Guide
 
Introduction to HDFS and MapReduce
Introduction to HDFS and MapReduceIntroduction to HDFS and MapReduce
Introduction to HDFS and MapReduce
 
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science Bon Secours...
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science  Bon Secours...Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science  Bon Secours...
Hadoop ecosystem; J.Ayeesha parveen 2 nd M.sc., computer science Bon Secours...
 
Hadoop
HadoopHadoop
Hadoop
 
Introduction to HDFS and MapReduce
Introduction to HDFS and MapReduceIntroduction to HDFS and MapReduce
Introduction to HDFS and MapReduce
 
Cppt Hadoop
Cppt HadoopCppt Hadoop
Cppt Hadoop
 

Recently uploaded

guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
Rogerio Filho
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
3ipehhoa
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
Gal Baras
 
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptxLiving-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
TristanJasperRamos
 
ER(Entity Relationship) Diagram for online shopping - TAE
ER(Entity Relationship) Diagram for online shopping - TAEER(Entity Relationship) Diagram for online shopping - TAE
ER(Entity Relationship) Diagram for online shopping - TAE
Himani415946
 
The+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptxThe+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptx
laozhuseo02
 
Output determination SAP S4 HANA SAP SD CC
Output determination SAP S4 HANA SAP SD CCOutput determination SAP S4 HANA SAP SD CC
Output determination SAP S4 HANA SAP SD CC
ShahulHameed54211
 
BASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptxBASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptx
natyesu
 
This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!
nirahealhty
 
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shopHistory+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
laozhuseo02
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
3ipehhoa
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
Arif0071
 
Latest trends in computer networking.pptx
Latest trends in computer networking.pptxLatest trends in computer networking.pptx
Latest trends in computer networking.pptx
JungkooksNonexistent
 
1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
JeyaPerumal1
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
3ipehhoa
 
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and GuidelinesMulti-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Sanjeev Rampal
 

Recently uploaded (16)

guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...guildmasters guide to ravnica Dungeons & Dragons 5...
guildmasters guide to ravnica Dungeons & Dragons 5...
 
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
1比1复刻(bath毕业证书)英国巴斯大学毕业证学位证原版一模一样
 
How to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptxHow to Use Contact Form 7 Like a Pro.pptx
How to Use Contact Form 7 Like a Pro.pptx
 
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptxLiving-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
Living-in-IT-era-Module-7-Imaging-and-Design-for-Social-Impact.pptx
 
ER(Entity Relationship) Diagram for online shopping - TAE
ER(Entity Relationship) Diagram for online shopping - TAEER(Entity Relationship) Diagram for online shopping - TAE
ER(Entity Relationship) Diagram for online shopping - TAE
 
The+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptxThe+Prospects+of+E-Commerce+in+China.pptx
The+Prospects+of+E-Commerce+in+China.pptx
 
Output determination SAP S4 HANA SAP SD CC
Output determination SAP S4 HANA SAP SD CCOutput determination SAP S4 HANA SAP SD CC
Output determination SAP S4 HANA SAP SD CC
 
BASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptxBASIC C++ lecture NOTE C++ lecture 3.pptx
BASIC C++ lecture NOTE C++ lecture 3.pptx
 
This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!This 7-second Brain Wave Ritual Attracts Money To You.!
This 7-second Brain Wave Ritual Attracts Money To You.!
 
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shopHistory+of+E-commerce+Development+in+China-www.cfye-commerce.shop
History+of+E-commerce+Development+in+China-www.cfye-commerce.shop
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
 
Latest trends in computer networking.pptx
Latest trends in computer networking.pptxLatest trends in computer networking.pptx
Latest trends in computer networking.pptx
 
1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...1.Wireless Communication System_Wireless communication is a broad term that i...
1.Wireless Communication System_Wireless communication is a broad term that i...
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
 
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and GuidelinesMulti-cluster Kubernetes Networking- Patterns, Projects and Guidelines
Multi-cluster Kubernetes Networking- Patterns, Projects and Guidelines
 

Big data-cheat-sheet

  • 1. FURTHERMORE: Big Data Hadoop Certification Training BIG DATA HADOOP CHEAT SHEETBig Data Comprises of large datasets that cannot be processed using traditional computing techniques, which includes huge volumes, high velocity and extensible variety of data. Hadoop An Apache open source framework written in JAVA which allows distributed processing of large datasets across clusters of computers using simple programming models. Hadoop Common These are the JAVA libraries and utilities required by other Hadoop modules which contains the necessary scripts and files required to start Hadoop Hadoop YARN A framework used for job scheduling and managing the cluster resources Hadoop Distributed File System A Java based file system that provides scalable and reliable data storage and it provides high throughput access to the application data Hadoop MapReduce A software framework, which is used for writing the applications easily which process big amount of data in parallel on large clusters Apache hive An infrastructure for data warehousing for Hadoop Apache oozie An application in Java responsible for scheduling Hadoop jobs Apache Pig A data flow platform that is responsible for the execution of the MapReduce jobs Apache Spark An open source framework used for cluster computing Flume An open source aggregation service responsible for collection and transport of data from source to destination Hbase A column-oriented database of Hadoop that stores big data in a scalable way Sqoop An interface application that is used to transfer data between Hadoop and relational database through commands Hadoop Distributed File System Map-Reduce HBASE Database with Real- Time Access PIG Hive Sqoop ETL BI Reporting RDBMSTop Level Interfaces Top Level Abstractions Distributed Data Processing At the base is a Self- healing clustered storage system Different components of Hadoop Architecture involves: Hadoop File Automation Commands Commands Task Syntax cat Used to copy the source path to the destination or the standard output hdfsdfs –cat URI [URI- – -] chgrp Used to change the group of the files hdfsdfs –chgrp [-R] GROUP URI [URI—] chmod Used to change the permissions of the file hdfsdfs –chmod [-R] <MODE[,MODE]- – -: OCTALMODE> URI [URI – – -] chown Used to change the owner of the file hdfsdfs –chown [- R][OWNER][:{GROUP]]URI[URI] count Used to count the number of directories hdfs dfs –count [-q] <paths> cp Used to copy one or more than one files from the source to destination path hdfsdfs –cp URI[URI – – -]<dest> Du Used to display the size of directories or files hdfsdfs –du [-s][-h]URI [URI – – -] get Used to copy files to the local file system hdfs dfs –get[-ignorecrc][-crc]<src><localdst> ls Used to display the statistics of any file or directory hdfsdfs –ls <args> mkdir Used to create one or more directories hdfsdfs –mkdir<path> mv Used to move one or more files from one location to other hdfs dfs –mv URI[URI – – -]<dest> put Used to read from one file system to other hdfsdfs –put<localsrc>- – -<dest> rm Used to delete one or more than one files hdfsdfs –rmr[-skipTrash]URI[URI- – – ] stat Used to display the information of any specific path hdfsdfs –stat URI[URI – – -] help Used to display the usage information of the command help<cmd-name> Commands Task Balancer To run cluster balancing utility daemonlog To get or set the log level of each daemon dfsadmin To run many HDFS administrative operations Datanode To run HDFS datanode service mradmin To run a number of mapReduce administrative operations Jobtracker To run mapReduce job tracker Namenode To run name node Tasktracker To run mapReduce task tracker node Secondary namenode To run secondary namenode Hadoop Administration Commands • Learn from industry experts and be sought-after by the industry! • Learn any technology, show exemplary skills and have an unmatched career! • The most trending technology courses to help you fast-track your career! • Logical modules for both beginners and mid-level learners • All recorded sessions available in LMS for lifetime • 24*7 Support for Lifetime • Learn Anytime, Anywhere