SlideShare a Scribd company logo
www.edureka.in/hadoop
www.edureka.in/hadoop
How It Works…
 LIVE classes
 Class recordings
 Module wise Quizzes, Coding Assignments
 24x7 on-demand technical support
 Project work on large Datasets
 Online certification exam
 Lifetime access to the Learning Management System
Complimentary Java Classes
www.edureka.in/hadoop
Course Topics
 Week 1
– Understanding Big Data
– Introduction to HDFS
 Week 2
– Playing around with Cluster
– Data loading Techniques
 Week 3
– Map-Reduce Basics, types and formats
– Use-cases for Map-Reduce
 Week 4
– Analytics using Pig
– Understanding Pig Latin
 Week 5
– Analytics using Hive
– Understanding HIVE QL
 Week 6
– NoSQL Databases
– Understanding HBASE
 Week 7
– Data loading Techniques in Hbase
– Zookeeper
 Week 8
– Real world Datasets and Analysis
– Hadoop Project Environment
www.edureka.in/hadoop
What Is Big Data?
 Lots of Data(Terabytes or Petabytes)
 Systems / Enterprises generate huge amount of data from Terabytes to and even Petabytes of
information.
A airline jet collects 10 terabytes of sensor data
for every 30 minutes of flying time.
NYSE generates about one terabyte of new trade data
per day to Perform stock trading analytics to determine
trends for optimal trades.
www.edureka.in/hadoop
Facebook Example
 Facebook users spend 10.5 billion minutes
(almost 20,000 years) online on the social network.
 Facebook has an average of 3.2 billion likes and
comments are posted every day.
www.edureka.in/hadoop
Twitter Example
 Twitter has over 500 million registered users.
 The USA, whose 141.8 million accounts represents 27.4
percent of all Twitter users, good enough to finish well ahead
of Brazil, Japan, the UK and Indonesia.
 79% of US Twitter users are more like to recommend brands
they follow .
 67% of US Twitter users are more likely to buy from brands
they follow .
 57% of all companies that use social media for business use
Twitter.
www.edureka.in/hadoop
IBM’s Definition
 IBM’s definition – Big Data Characteristics
http://www-01.ibm.com/software/data/bigdata/
www.edureka.in/hadoop
 Estimated Global Data Volume:
 2011: 1.8 ZB
 2015: 7.9 ZB
 The world's information doubles every two years
 Over the next 10 years:
 The number of servers worldwide will grow by 10x
 Amount of information managed by enterprise data
centers will grow by 50x
 Number of “files” enterprise data center handle will
grow by 75x
Source: http://www.emc.com/leadership/programs/digital-universe.htm,
which was based on the 2011 IDC Digital Universe Study
Data Volume Is Growing Exponentially
www.edureka.in/hadoop
Un-Structured Data is Exploding
www.edureka.in/hadoop
Common Big Data Customer Scenarios
Industry/Vertical Scenarios
Financial Services  Modeling True Risk
 Threat Analysis
 Fraud Detection
 Trade Surveillance
 Credit Scoring And Analysis
Web & E-Tailing  Recommendation Engines
 Ad Targeting
 Search Quality
 Abuse and Click Fraud Detection
Retail  Point of sales Transaction Analysis
 Customer Churn Analysis
 Sentiment Analysis
www.edureka.in/hadoop
Industry/Vertical Scenarios
Telecommunications  Customer Churn Prevention
 Network Performance Optimization
 Call Detail Record (CDR) Analysis
 Analyzing Network to Predict Failure
Government  Fraud Detection And Cyber Security
General
(Cross Vertical)
 ETL & Processing Engine
Common Big Data Customer Scenarios (Contd.)
www.edureka.in/hadoop
Hidden Treasure
 Insight into data can provide Business Advantage.
 Some key early indicators can mean Fortunes to Business.
 More Precise Analysis with more data.
www.edureka.in/hadoop
What Big Companies Have To Say…
“Analyzing Big Data sets will become a key basis for competition.”
“Leaders in every sector will have to grapple the implications of Big Data.”
McKinsey
Gartner
Forrester
Research
“Big Data analytics are rapidly emerging as the preferred solution to business and
technology trends that are disrupting.”
“Enterprises should not delay implementation of Big Data Analytics.”
“Use Hadoop to gain a competitive advantage over more risk-averse enterprises.”
“Prioritize Big Data projects that might benefit from Hadoop.”
www.edureka.in/hadoop
Limitations of Existing Data Analytics Architecture
www.edureka.in/hadoop
Solution: A Combined Storage Computer Layer
www.edureka.in/hadoop
Differentiating Factors
www.edureka.in/hadoop
Some Of the Hadoop Users
www.edureka.in/hadoop
Hadoop Users – In Detail
http://wiki.apache.org/hadoop/PoweredBy
www.edureka.in/hadoop
Hadoop Users – In Detail
http://wiki.apache.org/hadoop/PoweredBy
www.edureka.in/hadoop
Read 1 TB Data
10 Machines
 4 I/O Channels
 Each Channel – 100 MB/s
 4 I/O Channels
 Each Channel – 100 MB/s
1 Machine
Why DFS?
www.edureka.in/hadoop
10 Machines
 4 I/O Channels
 Each Channel – 100 MB/s
 4 I/O Channels
 Each Channel – 100 MB/s
1 Machine
Read 1 TB Data
45 Minutes
Why DFS?
www.edureka.in/hadoop
4.5 Minutes45 Minutes
10 Machines
 4 I/O Channels
 Each Channel – 100 MB/s
 4 I/O Channels
 Each Channel – 100 MB/s
1 Machine
Read 1 TB Data
Why DFS?
www.edureka.in/hadoop
 Apache Hadoop is a framework that allows for the distributed processing of large data sets
across clusters of commodity computers using a simple programming model.
 It is an Open-source Data Management with scale-out storage & distributed processing.
What Is Hadoop?
www.edureka.in/hadoop
Hadoop Key Characteristics
www.edureka.in/hadoop
Hadoop History
Doug Cutting & Mike Cafarella
started working on Nutch
NY Times converts 4TB of
Image archives over 100 EC2s
Fastest sort of a TB,
62secs over 1,460 nodes
Sorted a PB in 16.25hours
Over 3.658 nodes
Fastest sort of a TB, 3.5mins
over 910 nodes
Doug Cutting adds DFS &
MapReduce support to Nutch
Google publishes GFS &
MapReduce papers Yahoo! hires Cutting,
Hadoop spins out of Nutch
Facebook launches Hive:
SQL Support for Hadoop
Doug Cutting
Joins Cloudera
Hadoop Summit 2009,
750 attendees
Founded
www.edureka.in/hadoop
Hadoop Eco-System
www.edureka.in/hadoop
 HDFS – Hadoop Distributed File System (Storage)
 Distributed across “nodes”
 Natively redundant
 NameNode tracks locations.
 MapReduce (Processing)
 Splits a task across processors
 “near” the data & assembles results
 Self-Healing, High Bandwidth
 Clustered storage
Hadoop Core Components
www.edureka.in/hadoop
Hadoop Core Components (Contd.)
www.edureka.in/hadoop
HDFS Architecture
www.edureka.in/hadoop
 NameNode:
 master of the system
 maintains and manages the blocks which are present on the
DataNodes
 DataNodes:
 slaves which are deployed on each machine and provide the
actual storage
 responsible for serving read and write requests for the clients
Main Components Of HDFS
www.edureka.in/hadoop
 Secondary NameNode:
 Not a hot standby for the NameNode
 Connects to NameNode every hour*
 Housekeeping, backup of NemeNode metadata
 Saved metadata can build a failed NameNode
You give me
metadata every
hour, I will make it
secure
Single Point
Failure
Secondary
NameNode
NameNode
metadata
metadata
Secondary NameNode
www.edureka.in/hadoop
NameNode Metadata
 Meta-data in Memory
 The entire metadata is in main memory
 No demand paging of FS meta-data
 Types of Metadata
 List of files
 List of Blocks for each file
 List of DataNode for each block
 File attributes, e.g. access time, replication factor
 A Transaction Log
 Records file creations, file deletions. etc
www.edureka.in/hadoop
JobTracker
www.edureka.in/hadoop
JobTracker (Contd.)
www.edureka.in/hadoop
JobTracker (Contd.)
www.edureka.in/hadoop
Anatomy of A File Write
www.edureka.in/hadoop
Anatomy of A File Read
www.edureka.in/hadoop
Replication and Rack Awareness
www.edureka.in/hadoop
Big Data – It’s about Scale And Structure
Structured Data Types Multi and Unstructured
Limited, No Data Processing Processing Processing coupled with Data
Standards & Structured Governance Loosely Structured
Required On write Schema Required On Read
Reads are Fast Speed Writes are Fast
Software License Cost Support Only
Known Entity Resources Growing, Complexities, Wide
Interactive OLAP Analytics
Complex ACID Transactions
Operational Data Store
Best Fit Use Data Discovery
Processing Unstructured Data
Massive Storage/Processing
RDBMSRDBMS EDW MPP NoSQL HADOOP
www.edureka.in/hadoop
 Attempt the following Assignments using the documents present in the LMS:
 Hadoop Installation - Cloudera CDH3
 Execute Linux Basic Commands
 Execute HDFS Hands On
Assignments
Thank You
See You in Class Next Week

More Related Content

What's hot

HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
Simplilearn
 
Introduction to HiveQL
Introduction to HiveQLIntroduction to HiveQL
Introduction to HiveQL
kristinferrier
 
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Mahantesh Angadi
 
Introduction to Amazon Redshift
Introduction to Amazon RedshiftIntroduction to Amazon Redshift
Introduction to Amazon Redshift
Amazon Web Services
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
Arnab Mitra
 
Managing your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariManaging your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache Ambari
DataWorks Summit
 
Hadoop introduction , Why and What is Hadoop ?
Hadoop introduction , Why and What is  Hadoop ?Hadoop introduction , Why and What is  Hadoop ?
Hadoop introduction , Why and What is Hadoop ?
sudhakara st
 
What is Hadoop | Introduction to Hadoop | Hadoop Tutorial | Hadoop Training |...
What is Hadoop | Introduction to Hadoop | Hadoop Tutorial | Hadoop Training |...What is Hadoop | Introduction to Hadoop | Hadoop Tutorial | Hadoop Training |...
What is Hadoop | Introduction to Hadoop | Hadoop Tutorial | Hadoop Training |...
Edureka!
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
Philippe Julio
 
Hadoop Infrastructure @Uber Past, Present and Future
Hadoop Infrastructure @Uber Past, Present and FutureHadoop Infrastructure @Uber Past, Present and Future
Hadoop Infrastructure @Uber Past, Present and Future
DataWorks Summit
 
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
Simplilearn
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
EMC
 
Big data ppt
Big data pptBig data ppt
Big data ppt
Thirunavukkarasu Ps
 
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Simplilearn
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data
DataWorks Summit
 
Chicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An IntroductionChicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An Introduction
Cloudera, Inc.
 
Introduction to Hadoop Technology
Introduction to Hadoop TechnologyIntroduction to Hadoop Technology
Introduction to Hadoop Technology
Manish Borkar
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
Stanley Wang
 
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a ServiceZeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
Databricks
 
Apache HBase™
Apache HBase™Apache HBase™
Apache HBase™
Prashant Gupta
 

What's hot (20)

HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
HBase Tutorial For Beginners | HBase Architecture | HBase Tutorial | Hadoop T...
 
Introduction to HiveQL
Introduction to HiveQLIntroduction to HiveQL
Introduction to HiveQL
 
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
Introduction and Overview of BigData, Hadoop, Distributed Computing - BigData...
 
Introduction to Amazon Redshift
Introduction to Amazon RedshiftIntroduction to Amazon Redshift
Introduction to Amazon Redshift
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Managing your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariManaging your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache Ambari
 
Hadoop introduction , Why and What is Hadoop ?
Hadoop introduction , Why and What is  Hadoop ?Hadoop introduction , Why and What is  Hadoop ?
Hadoop introduction , Why and What is Hadoop ?
 
What is Hadoop | Introduction to Hadoop | Hadoop Tutorial | Hadoop Training |...
What is Hadoop | Introduction to Hadoop | Hadoop Tutorial | Hadoop Training |...What is Hadoop | Introduction to Hadoop | Hadoop Tutorial | Hadoop Training |...
What is Hadoop | Introduction to Hadoop | Hadoop Tutorial | Hadoop Training |...
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 
Hadoop Infrastructure @Uber Past, Present and Future
Hadoop Infrastructure @Uber Past, Present and FutureHadoop Infrastructure @Uber Past, Present and Future
Hadoop Infrastructure @Uber Past, Present and Future
 
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
Hadoop Tutorial For Beginners | Apache Hadoop Tutorial For Beginners | Hadoop...
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 
Big data ppt
Big data pptBig data ppt
Big data ppt
 
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
Hadoop Training | Hadoop Training For Beginners | Hadoop Architecture | Hadoo...
 
ORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big DataORC File - Optimizing Your Big Data
ORC File - Optimizing Your Big Data
 
Chicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An IntroductionChicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An Introduction
 
Introduction to Hadoop Technology
Introduction to Hadoop TechnologyIntroduction to Hadoop Technology
Introduction to Hadoop Technology
 
Hadoop ecosystem
Hadoop ecosystemHadoop ecosystem
Hadoop ecosystem
 
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a ServiceZeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
 
Apache HBase™
Apache HBase™Apache HBase™
Apache HBase™
 

Viewers also liked

Realtime analytics + hadoop 2.0
Realtime analytics + hadoop 2.0Realtime analytics + hadoop 2.0
Realtime analytics + hadoop 2.0
Rommel Garcia
 
Webinar: Big Data & Hadoop - When not to use Hadoop
Webinar: Big Data & Hadoop - When not to use HadoopWebinar: Big Data & Hadoop - When not to use Hadoop
Webinar: Big Data & Hadoop - When not to use Hadoop
Edureka!
 
Webinar: Ways to Succeed with Hadoop in 2015
Webinar: Ways to Succeed with Hadoop in 2015Webinar: Ways to Succeed with Hadoop in 2015
Webinar: Ways to Succeed with Hadoop in 2015
Edureka!
 
Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...
Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...
Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...
Cloudera, Inc.
 
Introduction to Big Data and Hadoop
Introduction to Big Data and HadoopIntroduction to Big Data and Hadoop
Introduction to Big Data and Hadoop
Edureka!
 
Hadoop Tutorial | What is Hadoop | Hadoop Project on Reddit | Edureka
Hadoop Tutorial | What is Hadoop | Hadoop Project on Reddit | EdurekaHadoop Tutorial | What is Hadoop | Hadoop Project on Reddit | Edureka
Hadoop Tutorial | What is Hadoop | Hadoop Project on Reddit | Edureka
Edureka!
 
Hadoop Ecosystem | Big Data Analytics Tools | Hadoop Tutorial | Edureka
Hadoop Ecosystem | Big Data Analytics Tools | Hadoop Tutorial | Edureka Hadoop Ecosystem | Big Data Analytics Tools | Hadoop Tutorial | Edureka
Hadoop Ecosystem | Big Data Analytics Tools | Hadoop Tutorial | Edureka
Edureka!
 
Big Data & Hadoop Tutorial
Big Data & Hadoop TutorialBig Data & Hadoop Tutorial
Big Data & Hadoop Tutorial
Edureka!
 
Introduction To Map Reduce
Introduction To Map ReduceIntroduction To Map Reduce
Introduction To Map Reduce
rantav
 

Viewers also liked (9)

Realtime analytics + hadoop 2.0
Realtime analytics + hadoop 2.0Realtime analytics + hadoop 2.0
Realtime analytics + hadoop 2.0
 
Webinar: Big Data & Hadoop - When not to use Hadoop
Webinar: Big Data & Hadoop - When not to use HadoopWebinar: Big Data & Hadoop - When not to use Hadoop
Webinar: Big Data & Hadoop - When not to use Hadoop
 
Webinar: Ways to Succeed with Hadoop in 2015
Webinar: Ways to Succeed with Hadoop in 2015Webinar: Ways to Succeed with Hadoop in 2015
Webinar: Ways to Succeed with Hadoop in 2015
 
Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...
Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...
Hadoop World 2011: Hadoop and RDBMS with Sqoop and Other Tools - Guy Harrison...
 
Introduction to Big Data and Hadoop
Introduction to Big Data and HadoopIntroduction to Big Data and Hadoop
Introduction to Big Data and Hadoop
 
Hadoop Tutorial | What is Hadoop | Hadoop Project on Reddit | Edureka
Hadoop Tutorial | What is Hadoop | Hadoop Project on Reddit | EdurekaHadoop Tutorial | What is Hadoop | Hadoop Project on Reddit | Edureka
Hadoop Tutorial | What is Hadoop | Hadoop Project on Reddit | Edureka
 
Hadoop Ecosystem | Big Data Analytics Tools | Hadoop Tutorial | Edureka
Hadoop Ecosystem | Big Data Analytics Tools | Hadoop Tutorial | Edureka Hadoop Ecosystem | Big Data Analytics Tools | Hadoop Tutorial | Edureka
Hadoop Ecosystem | Big Data Analytics Tools | Hadoop Tutorial | Edureka
 
Big Data & Hadoop Tutorial
Big Data & Hadoop TutorialBig Data & Hadoop Tutorial
Big Data & Hadoop Tutorial
 
Introduction To Map Reduce
Introduction To Map ReduceIntroduction To Map Reduce
Introduction To Map Reduce
 

Similar to Learn Big Data & Hadoop

Hadoop Developer
Hadoop DeveloperHadoop Developer
Hadoop Developer
Edureka!
 
Hadoop Administration pdf
Hadoop Administration pdfHadoop Administration pdf
Hadoop Administration pdf
Edureka!
 
Introduction to Hadoop Administration
Introduction to Hadoop AdministrationIntroduction to Hadoop Administration
Introduction to Hadoop Administration
Edureka!
 
Hadoop Adminstration with Latest Release (2.0)
Hadoop Adminstration with Latest Release (2.0)Hadoop Adminstration with Latest Release (2.0)
Hadoop Adminstration with Latest Release (2.0)
Edureka!
 
Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010
nzhang
 
Big Data and Hadoop
Big Data and HadoopBig Data and Hadoop
Big Data and Hadoop
Flavio Vit
 
Hadoop and BigData - July 2016
Hadoop and BigData - July 2016Hadoop and BigData - July 2016
Hadoop and BigData - July 2016
Ranjith Sekar
 
Eric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers ConferenceEric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers Conference
Hortonworks
 
BIG DATA
BIG DATABIG DATA
BIG DATA
Shashank Shetty
 
Hadoop project design and a usecase
Hadoop project design and  a usecaseHadoop project design and  a usecase
Hadoop project design and a usecase
sudhakara st
 
Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)
Prashant Gupta
 
Big data or big deal
Big data or big dealBig data or big deal
Big data or big deal
eduarderwee
 
Bigdata and Hadoop Bootcamp
Bigdata and Hadoop BootcampBigdata and Hadoop Bootcamp
Bigdata and Hadoop Bootcamp
Spotle.ai
 
Introduction to Big Data and Hadoop
Introduction to Big Data and HadoopIntroduction to Big Data and Hadoop
Introduction to Big Data and Hadoop
Edureka!
 
No sql databases
No sql databasesNo sql databases
Big data technologies with Case Study Finance and Healthcare
Big data technologies with Case Study Finance and HealthcareBig data technologies with Case Study Finance and Healthcare
Big data technologies with Case Study Finance and Healthcare
Sandeep Sharma IIMK Smart City,IoT,Bigdata,Cloud,BI,DW
 
Big data and hadoop product page
Big data and hadoop product pageBig data and hadoop product page
Big data and hadoop product page
Janu Jahnavi
 
Architecting the Future of Big Data and Search
Architecting the Future of Big Data and SearchArchitecting the Future of Big Data and Search
Architecting the Future of Big Data and Search
Hortonworks
 
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Cloudera, Inc.
 
Exploring the Wider World of Big Data
Exploring the Wider World of Big DataExploring the Wider World of Big Data
Exploring the Wider World of Big Data
NetApp
 

Similar to Learn Big Data & Hadoop (20)

Hadoop Developer
Hadoop DeveloperHadoop Developer
Hadoop Developer
 
Hadoop Administration pdf
Hadoop Administration pdfHadoop Administration pdf
Hadoop Administration pdf
 
Introduction to Hadoop Administration
Introduction to Hadoop AdministrationIntroduction to Hadoop Administration
Introduction to Hadoop Administration
 
Hadoop Adminstration with Latest Release (2.0)
Hadoop Adminstration with Latest Release (2.0)Hadoop Adminstration with Latest Release (2.0)
Hadoop Adminstration with Latest Release (2.0)
 
Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010
 
Big Data and Hadoop
Big Data and HadoopBig Data and Hadoop
Big Data and Hadoop
 
Hadoop and BigData - July 2016
Hadoop and BigData - July 2016Hadoop and BigData - July 2016
Hadoop and BigData - July 2016
 
Eric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers ConferenceEric Baldeschwieler Keynote from Storage Developers Conference
Eric Baldeschwieler Keynote from Storage Developers Conference
 
BIG DATA
BIG DATABIG DATA
BIG DATA
 
Hadoop project design and a usecase
Hadoop project design and  a usecaseHadoop project design and  a usecase
Hadoop project design and a usecase
 
Hadoop File system (HDFS)
Hadoop File system (HDFS)Hadoop File system (HDFS)
Hadoop File system (HDFS)
 
Big data or big deal
Big data or big dealBig data or big deal
Big data or big deal
 
Bigdata and Hadoop Bootcamp
Bigdata and Hadoop BootcampBigdata and Hadoop Bootcamp
Bigdata and Hadoop Bootcamp
 
Introduction to Big Data and Hadoop
Introduction to Big Data and HadoopIntroduction to Big Data and Hadoop
Introduction to Big Data and Hadoop
 
No sql databases
No sql databasesNo sql databases
No sql databases
 
Big data technologies with Case Study Finance and Healthcare
Big data technologies with Case Study Finance and HealthcareBig data technologies with Case Study Finance and Healthcare
Big data technologies with Case Study Finance and Healthcare
 
Big data and hadoop product page
Big data and hadoop product pageBig data and hadoop product page
Big data and hadoop product page
 
Architecting the Future of Big Data and Search
Architecting the Future of Big Data and SearchArchitecting the Future of Big Data and Search
Architecting the Future of Big Data and Search
 
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
 
Exploring the Wider World of Big Data
Exploring the Wider World of Big DataExploring the Wider World of Big Data
Exploring the Wider World of Big Data
 

More from Edureka!

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | Edureka
Edureka!
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | Edureka
Edureka!
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | Edureka
Edureka!
 
Tableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaTableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | Edureka
Edureka!
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | Edureka
Edureka!
 
Top 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaTop 5 PMP Certifications | Edureka
Top 5 PMP Certifications | Edureka
Edureka!
 
Top Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaTop Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | Edureka
Edureka!
 
Linux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaLinux Mint Tutorial | Edureka
Linux Mint Tutorial | Edureka
Edureka!
 
How to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaHow to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| Edureka
Edureka!
 
Importance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaImportance of Digital Marketing | Edureka
Importance of Digital Marketing | Edureka
Edureka!
 
RPA in 2020 | Edureka
RPA in 2020 | EdurekaRPA in 2020 | Edureka
RPA in 2020 | Edureka
Edureka!
 
Email Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEmail Notifications in Jenkins | Edureka
Email Notifications in Jenkins | Edureka
Edureka!
 
EA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | Edureka
Edureka!
 
Cognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaCognitive AI Tutorial | Edureka
Cognitive AI Tutorial | Edureka
Edureka!
 
AWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaAWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | Edureka
Edureka!
 
Blue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaBlue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | Edureka
Edureka!
 
Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka
Edureka!
 
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaA star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
Edureka!
 
Kubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaKubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | Edureka
Edureka!
 
Introduction to DevOps | Edureka
Introduction to DevOps | EdurekaIntroduction to DevOps | Edureka
Introduction to DevOps | Edureka
Edureka!
 

More from Edureka! (20)

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | Edureka
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | Edureka
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | Edureka
 
Tableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaTableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | Edureka
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | Edureka
 
Top 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaTop 5 PMP Certifications | Edureka
Top 5 PMP Certifications | Edureka
 
Top Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaTop Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | Edureka
 
Linux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaLinux Mint Tutorial | Edureka
Linux Mint Tutorial | Edureka
 
How to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaHow to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| Edureka
 
Importance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaImportance of Digital Marketing | Edureka
Importance of Digital Marketing | Edureka
 
RPA in 2020 | Edureka
RPA in 2020 | EdurekaRPA in 2020 | Edureka
RPA in 2020 | Edureka
 
Email Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEmail Notifications in Jenkins | Edureka
Email Notifications in Jenkins | Edureka
 
EA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | Edureka
 
Cognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaCognitive AI Tutorial | Edureka
Cognitive AI Tutorial | Edureka
 
AWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaAWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | Edureka
 
Blue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaBlue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | Edureka
 
Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka
 
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaA star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
 
Kubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaKubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | Edureka
 
Introduction to DevOps | Edureka
Introduction to DevOps | EdurekaIntroduction to DevOps | Edureka
Introduction to DevOps | Edureka
 

Recently uploaded

Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxBeyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
EduSkills OECD
 
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
Himanshu Rai
 
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
Nguyen Thanh Tu Collection
 
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Leena Ghag-Sakpal
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
Jean Carlos Nunes Paixão
 
Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47
MysoreMuleSoftMeetup
 
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptxPrésentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
siemaillard
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
Colégio Santa Teresinha
 
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching AptitudeUGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
S. Raj Kumar
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
RAHUL
 
Wound healing PPT
Wound healing PPTWound healing PPT
Wound healing PPT
Jyoti Chand
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
heathfieldcps1
 
Constructing Your Course Container for Effective Communication
Constructing Your Course Container for Effective CommunicationConstructing Your Course Container for Effective Communication
Constructing Your Course Container for Effective Communication
Chevonnese Chevers Whyte, MBA, B.Sc.
 
Solutons Maths Escape Room Spatial .pptx
Solutons Maths Escape Room Spatial .pptxSolutons Maths Escape Room Spatial .pptx
Solutons Maths Escape Room Spatial .pptx
spdendr
 
Temple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation resultsTemple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation results
Krassimira Luka
 
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
Nguyen Thanh Tu Collection
 
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
สมใจ จันสุกสี
 
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
imrankhan141184
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
Nicholas Montgomery
 
Gender and Mental Health - Counselling and Family Therapy Applications and In...
Gender and Mental Health - Counselling and Family Therapy Applications and In...Gender and Mental Health - Counselling and Family Therapy Applications and In...
Gender and Mental Health - Counselling and Family Therapy Applications and In...
PsychoTech Services
 

Recently uploaded (20)

Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxBeyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptx
 
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
 
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
BÀI TẬP BỔ TRỢ TIẾNG ANH LỚP 9 CẢ NĂM - GLOBAL SUCCESS - NĂM HỌC 2024-2025 - ...
 
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
Bed Making ( Introduction, Purpose, Types, Articles, Scientific principles, N...
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
 
Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47Mule event processing models | MuleSoft Mysore Meetup #47
Mule event processing models | MuleSoft Mysore Meetup #47
 
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptxPrésentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
 
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching AptitudeUGC NET Exam Paper 1- Unit 1:Teaching Aptitude
UGC NET Exam Paper 1- Unit 1:Teaching Aptitude
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
 
Wound healing PPT
Wound healing PPTWound healing PPT
Wound healing PPT
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
 
Constructing Your Course Container for Effective Communication
Constructing Your Course Container for Effective CommunicationConstructing Your Course Container for Effective Communication
Constructing Your Course Container for Effective Communication
 
Solutons Maths Escape Room Spatial .pptx
Solutons Maths Escape Room Spatial .pptxSolutons Maths Escape Room Spatial .pptx
Solutons Maths Escape Room Spatial .pptx
 
Temple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation resultsTemple of Asclepius in Thrace. Excavation results
Temple of Asclepius in Thrace. Excavation results
 
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
BÀI TẬP DẠY THÊM TIẾNG ANH LỚP 7 CẢ NĂM FRIENDS PLUS SÁCH CHÂN TRỜI SÁNG TẠO ...
 
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
 
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
Traditional Musical Instruments of Arunachal Pradesh and Uttar Pradesh - RAYH...
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
 
Gender and Mental Health - Counselling and Family Therapy Applications and In...
Gender and Mental Health - Counselling and Family Therapy Applications and In...Gender and Mental Health - Counselling and Family Therapy Applications and In...
Gender and Mental Health - Counselling and Family Therapy Applications and In...
 

Learn Big Data & Hadoop

  • 2. www.edureka.in/hadoop How It Works…  LIVE classes  Class recordings  Module wise Quizzes, Coding Assignments  24x7 on-demand technical support  Project work on large Datasets  Online certification exam  Lifetime access to the Learning Management System Complimentary Java Classes
  • 3. www.edureka.in/hadoop Course Topics  Week 1 – Understanding Big Data – Introduction to HDFS  Week 2 – Playing around with Cluster – Data loading Techniques  Week 3 – Map-Reduce Basics, types and formats – Use-cases for Map-Reduce  Week 4 – Analytics using Pig – Understanding Pig Latin  Week 5 – Analytics using Hive – Understanding HIVE QL  Week 6 – NoSQL Databases – Understanding HBASE  Week 7 – Data loading Techniques in Hbase – Zookeeper  Week 8 – Real world Datasets and Analysis – Hadoop Project Environment
  • 4. www.edureka.in/hadoop What Is Big Data?  Lots of Data(Terabytes or Petabytes)  Systems / Enterprises generate huge amount of data from Terabytes to and even Petabytes of information. A airline jet collects 10 terabytes of sensor data for every 30 minutes of flying time. NYSE generates about one terabyte of new trade data per day to Perform stock trading analytics to determine trends for optimal trades.
  • 5. www.edureka.in/hadoop Facebook Example  Facebook users spend 10.5 billion minutes (almost 20,000 years) online on the social network.  Facebook has an average of 3.2 billion likes and comments are posted every day.
  • 6. www.edureka.in/hadoop Twitter Example  Twitter has over 500 million registered users.  The USA, whose 141.8 million accounts represents 27.4 percent of all Twitter users, good enough to finish well ahead of Brazil, Japan, the UK and Indonesia.  79% of US Twitter users are more like to recommend brands they follow .  67% of US Twitter users are more likely to buy from brands they follow .  57% of all companies that use social media for business use Twitter.
  • 7. www.edureka.in/hadoop IBM’s Definition  IBM’s definition – Big Data Characteristics http://www-01.ibm.com/software/data/bigdata/
  • 8. www.edureka.in/hadoop  Estimated Global Data Volume:  2011: 1.8 ZB  2015: 7.9 ZB  The world's information doubles every two years  Over the next 10 years:  The number of servers worldwide will grow by 10x  Amount of information managed by enterprise data centers will grow by 50x  Number of “files” enterprise data center handle will grow by 75x Source: http://www.emc.com/leadership/programs/digital-universe.htm, which was based on the 2011 IDC Digital Universe Study Data Volume Is Growing Exponentially
  • 10. www.edureka.in/hadoop Common Big Data Customer Scenarios Industry/Vertical Scenarios Financial Services  Modeling True Risk  Threat Analysis  Fraud Detection  Trade Surveillance  Credit Scoring And Analysis Web & E-Tailing  Recommendation Engines  Ad Targeting  Search Quality  Abuse and Click Fraud Detection Retail  Point of sales Transaction Analysis  Customer Churn Analysis  Sentiment Analysis
  • 11. www.edureka.in/hadoop Industry/Vertical Scenarios Telecommunications  Customer Churn Prevention  Network Performance Optimization  Call Detail Record (CDR) Analysis  Analyzing Network to Predict Failure Government  Fraud Detection And Cyber Security General (Cross Vertical)  ETL & Processing Engine Common Big Data Customer Scenarios (Contd.)
  • 12. www.edureka.in/hadoop Hidden Treasure  Insight into data can provide Business Advantage.  Some key early indicators can mean Fortunes to Business.  More Precise Analysis with more data.
  • 13. www.edureka.in/hadoop What Big Companies Have To Say… “Analyzing Big Data sets will become a key basis for competition.” “Leaders in every sector will have to grapple the implications of Big Data.” McKinsey Gartner Forrester Research “Big Data analytics are rapidly emerging as the preferred solution to business and technology trends that are disrupting.” “Enterprises should not delay implementation of Big Data Analytics.” “Use Hadoop to gain a competitive advantage over more risk-averse enterprises.” “Prioritize Big Data projects that might benefit from Hadoop.”
  • 14. www.edureka.in/hadoop Limitations of Existing Data Analytics Architecture
  • 18. www.edureka.in/hadoop Hadoop Users – In Detail http://wiki.apache.org/hadoop/PoweredBy
  • 19. www.edureka.in/hadoop Hadoop Users – In Detail http://wiki.apache.org/hadoop/PoweredBy
  • 20. www.edureka.in/hadoop Read 1 TB Data 10 Machines  4 I/O Channels  Each Channel – 100 MB/s  4 I/O Channels  Each Channel – 100 MB/s 1 Machine Why DFS?
  • 21. www.edureka.in/hadoop 10 Machines  4 I/O Channels  Each Channel – 100 MB/s  4 I/O Channels  Each Channel – 100 MB/s 1 Machine Read 1 TB Data 45 Minutes Why DFS?
  • 22. www.edureka.in/hadoop 4.5 Minutes45 Minutes 10 Machines  4 I/O Channels  Each Channel – 100 MB/s  4 I/O Channels  Each Channel – 100 MB/s 1 Machine Read 1 TB Data Why DFS?
  • 23. www.edureka.in/hadoop  Apache Hadoop is a framework that allows for the distributed processing of large data sets across clusters of commodity computers using a simple programming model.  It is an Open-source Data Management with scale-out storage & distributed processing. What Is Hadoop?
  • 25. www.edureka.in/hadoop Hadoop History Doug Cutting & Mike Cafarella started working on Nutch NY Times converts 4TB of Image archives over 100 EC2s Fastest sort of a TB, 62secs over 1,460 nodes Sorted a PB in 16.25hours Over 3.658 nodes Fastest sort of a TB, 3.5mins over 910 nodes Doug Cutting adds DFS & MapReduce support to Nutch Google publishes GFS & MapReduce papers Yahoo! hires Cutting, Hadoop spins out of Nutch Facebook launches Hive: SQL Support for Hadoop Doug Cutting Joins Cloudera Hadoop Summit 2009, 750 attendees Founded
  • 27. www.edureka.in/hadoop  HDFS – Hadoop Distributed File System (Storage)  Distributed across “nodes”  Natively redundant  NameNode tracks locations.  MapReduce (Processing)  Splits a task across processors  “near” the data & assembles results  Self-Healing, High Bandwidth  Clustered storage Hadoop Core Components
  • 30. www.edureka.in/hadoop  NameNode:  master of the system  maintains and manages the blocks which are present on the DataNodes  DataNodes:  slaves which are deployed on each machine and provide the actual storage  responsible for serving read and write requests for the clients Main Components Of HDFS
  • 31. www.edureka.in/hadoop  Secondary NameNode:  Not a hot standby for the NameNode  Connects to NameNode every hour*  Housekeeping, backup of NemeNode metadata  Saved metadata can build a failed NameNode You give me metadata every hour, I will make it secure Single Point Failure Secondary NameNode NameNode metadata metadata Secondary NameNode
  • 32. www.edureka.in/hadoop NameNode Metadata  Meta-data in Memory  The entire metadata is in main memory  No demand paging of FS meta-data  Types of Metadata  List of files  List of Blocks for each file  List of DataNode for each block  File attributes, e.g. access time, replication factor  A Transaction Log  Records file creations, file deletions. etc
  • 39. www.edureka.in/hadoop Big Data – It’s about Scale And Structure Structured Data Types Multi and Unstructured Limited, No Data Processing Processing Processing coupled with Data Standards & Structured Governance Loosely Structured Required On write Schema Required On Read Reads are Fast Speed Writes are Fast Software License Cost Support Only Known Entity Resources Growing, Complexities, Wide Interactive OLAP Analytics Complex ACID Transactions Operational Data Store Best Fit Use Data Discovery Processing Unstructured Data Massive Storage/Processing RDBMSRDBMS EDW MPP NoSQL HADOOP
  • 40. www.edureka.in/hadoop  Attempt the following Assignments using the documents present in the LMS:  Hadoop Installation - Cloudera CDH3  Execute Linux Basic Commands  Execute HDFS Hands On Assignments
  • 41. Thank You See You in Class Next Week

Editor's Notes

  1. Accessible: Hadoop runs on large clusters of commodity machines or cloud computing services such as Amazon EC2Robust: Since Hadoop can run on commodity cluster, its designed with the assumption of frequent hardware failure, it can gracefully handle such failure and computation don’t stop because of few failed devices / systemsScalable:Hadoop scales linearly to handle large data by adding more slave nodes to the clusterSimple : Its easy to write efficient parallel programming with Hadoop
  2. Data transferred from DataNode to MapTask process. DBlk is the file data block; CBlk is the file checksum block. File data are transferred to the client through Java niotransferTo (aka UNIX sendfilesyscall). Checksum data are first fetched to DataNode JVM buffer, and then pushed to the client (details are not shown). Both file data and checksum data are bundled in an HDFS packet (typically 64KB) in the format of: {packet header | checksum bytes | data bytes}.2. Data received from the socket are buffered in a BufferedInputStream, presumably for the purpose of reducing the number of syscalls to the kernel. This actually involves two buffer-copies: first, data are copied from kernel buffers into a temporary direct buffer in JDK code; second, data are copied from the temporary direct buffer to the byte[] buffer owned by the BufferedInputStream. The size of the byte[] in BufferedInputStream is controlled by configuration property "io.file.buffer.size", and is default to 4K. In our production environment, this parameter is customized to 128K.3. Through the BufferedInputStream, the checksum bytes are saved into an internal ByteBuffer (whose size is roughly (PacketSize / 512 * 4) or 512B), and file bytes (compressed data) are deposited into the byte[] buffer supplied by the decompression layer. Since the checksum calculation requires a full 512 byte chunk while a user's request may not be aligned with a chunk boundary, a 512B byte[] buffer is used to align the input before copying partial chunks into user-supplied byte[] buffer. Also note that data are copied to the buffer in 512-byte pieces (as required by FSInputChecker API). Finally, all checksum bytes are copied to a 4-byte array for FSInputChecker to perform checksum verification. Overall, this step involves an extra buffer-copy.4. The decompression layer uses a byte[] buffer to receive data from the DFSClient layer. The DecompressorStream copies the data from the byte[] buffer to a 64K direct buffer, calls the native library code to decompress the data and stores the uncompressed bytes in another 64K direct buffer. This step involves two buffer-copies.5.LineReader maintains an internal buffer to absorb data from the downstream. From the buffer, line separators are discovered and line bytes are copied to form Text objects. This step requires two buffer-copies.The client creates the file by calling create() on Distributed FileSystem (step 1). Distributed FileSystem makes an RPC call to the namenode to create a new file in the filesystem’s namespace, with no blocks associated with it (step 2). The namenode performs various checks to make sure the file doesn’t already exist, and that the client has the right permissions to create the file.
  3. The client opens the file it wishes to read by calling open() on the FileSystemobject,which for HDFS is an instance of DFS(step 1).DistributedFileSystem calls the namenode, using RPC, to determine the locations of the blocks for the first few blocks in the File (step 2). For each block, the namenode returns the addresses of the datanodes that have a copy of that block
  4. PIG is procedural and SQL is declarative.While fields within a SQL record must be atomic (contain one single value), fields within a PIG tuple can be multi-valued, e.g. a collection of another PIG tuples, or a map with key be an atomic data and value be anything.Unlike SQL query where the input data need to be physically loaded into the DB tables, PIG extract the data from its original data sources directly during execution.PIG is lazily executed. It use a backtracking mechanism from its "store" statement to determine which statement needs to be executed.