Bon Secours College for Women
Accredited with A++ Grade by NAAC in Cycle-II
Recognized by 2(f) and 12(B) Institution, Vilar, Bypass,
Thanjavur.
Dr.M.FLORENCE DAYANA
Assistant Professor
Department of Computer Applications
Year : 2023 – 2024 Class : II-MSc. CS
Semester : III
Course : Big Data Analytics (PP22CSCC31)
Unit : IV
Hadoop Foundation for Analytics
• History of Hadoop
• Features
• Key Advantages of Hadoop
• Why Hadoop
• Versions of Hadoop
• Essentials of the Hadoop Ecosystem
• RDBMS versus Hadoop
• Key Aspects of Hadoop
• Components of Hadoop
History of Hadoop
• Hadoop was created by Doug Cutting and Mike Cafarella in 2005.
• It was originally developed to support distribution for the Nutch search engine project.
• In 2006, Hadoop was split out of Nutch into its own project, with Yahoo backing its development. Today it is maintained and distributed by the Apache Software Foundation (ASF).
Features
• Handles massive quantities of structured, semi-structured, and unstructured data using commodity hardware
• Has a shared-nothing architecture
• Replicates data across multiple computers, so every piece of data has several replicas
• Designed for high throughput rather than low latency
• Performs batch processing, so response time is not immediate
• Complements OLTP and OLAP
• Not a replacement for an RDBMS
• Not suited to work that cannot be parallelized
• Not suited to processing small files
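The replication feature can be illustrated with a small sketch. This is not HDFS code: the round-robin placement, the node names, and the block names are all assumptions made for illustration (real HDFS placement is rack-aware). The replication factor of 3 matches the usual HDFS default. The sketch shows why a block stays readable as long as at least one node holding a replica is alive.

```python
def place_replicas(blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin (toy policy)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, block in enumerate(blocks):
        # Consecutive node indices modulo len(nodes) are distinct
        # because replication <= len(nodes).
        placement[block] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one replica on a live node."""
    return [
        block
        for block, holders in placement.items()
        if any(node not in failed_nodes for node in holders)
    ]


# Hypothetical cluster: four nodes, five blocks, three replicas per block.
nodes = ["node1", "node2", "node3", "node4"]
blocks = ["blk_0", "blk_1", "blk_2", "blk_3", "blk_4"]
placement = place_replicas(blocks, nodes, replication=3)

# With three replicas per block, losing any two nodes loses no data.
survivors = readable_blocks(placement, failed_nodes={"node1", "node2"})
print(survivors == blocks)
```

With three replicas, a block is lost only when all three of its holders fail at once, which is why replication on cheap hardware can still give good data protection.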
Key Advantages of Hadoop
1. Stores data in its native form (HDFS)
• No structure is imposed while keying or storing data
• Schema-less
• Structure is imposed on the raw data only when the data needs to be processed
2. Scalable
• Can store and distribute very large data sets across hundreds of inexpensive servers that operate in parallel
3. Cost Effective
• Has a much lower cost per terabyte of storage and processing
4. Resilient to Failure
• Fault tolerant: it practices replication of data, so when data is sent to one node it is also replicated to other nodes in the cluster.
5. Flexibility
• Works with all types of data structures. Helps derive meaningful information from email, social media, and clickstream data.
• Can be put to several purposes, such as log analysis, data mining, recommendation systems, and market campaign analysis.
6. Fast
• Processing is fast because Hadoop moves the code to the data rather than moving the data to the code.
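The first advantage, storing data in its native form and imposing structure only at processing time, is often called schema-on-read. A minimal sketch follows, assuming a made-up log format and a made-up two-field schema (`user`, `ms`); it uses only the Python standard library and does not touch HDFS.

```python
import json

# Raw records are stored exactly as they arrived; nothing validates them on write.
raw_lines = [
    '{"user": "a", "action": "click", "ms": 120}',
    '{"user": "b", "action": "view"}',
    "not json at all",
]


def read_with_schema(lines):
    """Impose a schema (user: str, ms: int) only when the data is read."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # the line does not fit the schema, so skip it at read time
        if not isinstance(record, dict) or "user" not in record:
            continue
        # Missing "ms" values default to 0 (an arbitrary choice for this sketch).
        yield {"user": str(record["user"]), "ms": int(record.get("ms", 0))}


rows = list(read_with_schema(raw_lines))
print(rows)
```

A different job could read the same raw lines with a different schema, which is the flexibility that a schema-less store allows.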
Why Hadoop
• Inherent data protection
• Low cost
• Computing power
• Scalability
• Storage flexibility
Versions of Hadoop
• Hadoop 1.0
  • Data storage framework
  • Data processing framework
• Hadoop 2.0
Hadoop 1.0
Data storage framework:
• HDFS is schema-less and stores data files in any format.
• Files are stored close to their original form.
Data processing framework:
• Uses two functions, MAP and REDUCE, to process data.
• “Mappers” take in a set of key-value pairs and generate intermediate data.
• “Reducers” act on this intermediate data to produce the output data.
• The two functions work in isolation from each other, which lets processing be distributed in a highly parallel, fault-tolerant, and scalable way.
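The MAP and REDUCE functions described for Hadoop 1.0 can be sketched with the classic word-count example. This is an in-memory illustration in plain Python, not the Hadoop Java API: the function names `mapper`, `shuffle`, `reducer`, and `map_reduce` are chosen here for illustration, and the grouping step that a cluster performs between the two phases is simulated with a dictionary.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit an intermediate (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group intermediate values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce phase: combine all values for one key into a single output pair."""
    return key, sum(values)


def map_reduce(lines):
    # Each mapper call depends only on its own line and each reducer call only
    # on its own key, which is what allows the work to be spread across nodes.
    intermediate = [pair for line in lines for pair in mapper(line)]
    return dict(reducer(key, values) for key, values in shuffle(intermediate).items())


counts = map_reduce(["big data big cluster", "data node"])
print(counts)
```

Because no mapper or reducer shares state with another, a failed task can simply be rerun on a different node without affecting the rest of the job.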
Hadoop 1.0
Limitations
• Requires MapReduce programming expertise, along with proficiency in a programming language such as Java.
• Supports only batch processing, which suits tasks such as log analysis and large-scale data mining projects.
• Tightly coupled computationally with MapReduce. Other data processing tools either had to have their functionality rewritten in MapReduce so that it could be executed in Hadoop, or had to extract the data from HDFS and process it outside Hadoop. Neither option was viable, because both led to process inefficiencies caused by data being moved into and out of the Hadoop cluster.
Hadoop 2.0
• HDFS continues to be the data storage framework.
• Yet Another Resource Negotiator (YARN) has been added.
• Any application capable of dividing itself into parallel tasks is supported by YARN.
• YARN coordinates the allocation of the subtasks of the submitted applications, thereby enhancing the flexibility, scalability, and efficiency of the applications.
• YARN replaces the JobTracker with a global ResourceManager and a per-application ApplicationMaster, and runs applications on resources governed by a new NodeManager (which takes the place of the TaskTracker).
• MapReduce programming expertise is no longer required.
• It supports batch processing and also real-time processing.
• Data processing functions such as data standardisation and master data management can now be performed in HDFS.
Essentials of the Hadoop Ecosystem
Ecosystem projects enhance the functionality of the Hadoop core components:
• HIVE
• PIG
• SQOOP
• HBASE
• FLUME
• OOZIE
• MAHOUT
What each ecosystem project does:
• HIVE: Enables analysis of large data sets using a language similar to standard ANSI SQL (HiveQL), giving SQL-style access to data stored on a Hadoop cluster.
• PIG: An easy-to-understand data flow language that helps with the analysis of large data sets. Even without proficiency in MapReduce, data in the Hadoop cluster can be analysed, because PIG scripts are automatically converted into MapReduce jobs by the PIG interpreter.
• SQOOP: Used to transfer bulk data between Hadoop and structured data stores such as an RDBMS.
• HBASE: Hadoop's database. It is a column-oriented NoSQL store that runs on top of HDFS rather than a relational database, and it supports structured data storage for large tables.
• FLUME: A distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture.
• OOZIE: A workflow scheduler system for managing Apache Hadoop jobs.
• MAHOUT: A scalable machine learning and data mining library.
RDBMS versus HADOOP
PARAMETERS | RDBMS | HADOOP
System | Relational Database Management System | Node-based flat structure
Data | Suitable for structured data | Suitable for structured and unstructured data; supports a variety of data formats in real time, such as XML, JSON, and text-based flat file formats
Processing | OLTP | Analytical, big data processing
Choice | When the data needs consistent relationships | Big data processing, which does not require any consistent relationships between data
Processor | Needs expensive hardware or high-end processors to store huge volumes of data | In a Hadoop cluster, a node requires only a processor, a network card, and a few hard drives
Cost | Around $10,000 to $14,000 per terabyte of storage | Around $4,000 per terabyte of storage
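The cost row can be turned into a quick worked comparison. The per-terabyte figures below are the ones quoted in the table; the 100 TB data volume is a hypothetical value chosen only to make the arithmetic concrete.

```python
def storage_cost(terabytes, cost_per_tb):
    """Total storage cost in dollars for a given volume and per-terabyte price."""
    return terabytes * cost_per_tb


data_tb = 100  # hypothetical data volume, not taken from the slides

rdbms_low = storage_cost(data_tb, 10_000)   # lower RDBMS figure from the table
rdbms_high = storage_cost(data_tb, 14_000)  # upper RDBMS figure from the table
hadoop = storage_cost(data_tb, 4_000)       # Hadoop figure from the table

print(f"RDBMS:  ${rdbms_low:,} to ${rdbms_high:,}")
print(f"Hadoop: ${hadoop:,}")
print(f"Saving: ${rdbms_low - hadoop:,} to ${rdbms_high - hadoop:,}")
```

At these rates the Hadoop figure is 40% of the lower RDBMS figure, so the gap grows linearly with the amount of data stored.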
Key Aspects of Hadoop
1. Open Source Software
• It is free to download, use, and contribute to.
2. Framework
• Everything required to develop and execute an application is provided: programs, tools, etc.
3. Distributed
• Divides and stores data across multiple computers.
• Computation/processing is done in parallel across multiple connected nodes.
4. Massive Storage
• Stores colossal amounts of data across nodes of low-cost commodity hardware.
5. Faster Processing
• Large amounts of data are processed in parallel, yielding a quick response.
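The "Distributed" and "Faster Processing" aspects both rest on one pattern: divide the data, process the pieces independently, then combine the partial results. The sketch below shows that pattern on a single machine with Python's standard `concurrent.futures` module. Worker threads stand in for cluster nodes, which is an assumption made for illustration; it demonstrates the structure of the computation rather than a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide data into at most `parts` contiguous chunks of near-equal size."""
    size = max(1, -(-len(data) // parts))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def process(chunk):
    """Work done independently on one chunk (here, a simple sum)."""
    return sum(chunk)


data = list(range(1, 101))
chunks = split(data, parts=4)

# Each chunk is handed to its own worker; the workers share no state.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(process, chunks))

total = sum(partials)  # combine the partial results
print(partials, total)
```

The combined total equals the sum computed over the whole list in one pass, which is the property that makes splitting the work safe.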
Components of Hadoop
Core Components
• HDFS: storage component; distributes data across several nodes; natively redundant
• MapReduce: computational framework; splits a task across several nodes; processes data in parallel
Hadoop Ecosystem
• HIVE
• PIG
• SQOOP
• HBASE
• FLUME
• OOZIE
• MAHOUT
