Presented By
Sarita Bagul
TE Computer
Seat No. T120414208
Under the guidance of
Asst. Prof. B. A. Khivsara
A Seminar On
Introduction
Literature Survey
Working of Hadoop in Big Data Analytics
Advantages and Disadvantages of Hadoop
Applications of Big Data Analytics Using Hadoop
Conclusion
References
Outline
BIG DATA
What is Big Data?
“A massive volume of both structured and
unstructured data that is so large that it's difficult to
process with traditional database and software
techniques”.
5 Vs of Big Data
• Big data analytics is the process of collecting,
organizing, and analyzing large sets of data (called big
data) to discover patterns and other useful information.
Big Data Analytics
• In the past, data volumes were small and could be handled
easily with RDBMS tools. Today the volume of data is too large
for these tools to handle, and such data is referred to as
“big data”.
Relational database management system
Relational Databases Are Not Designed To Handle Change
Cost
No support for complex objects such as documents, video, and images.
Relational databases have limits on field lengths.
No support for unstructured data.
• 2006: Yahoo! created Hadoop, based on Google's GFS and MapReduce papers (with Doug Cutting
and team)
• 2007: Yahoo! started using Hadoop on a 1,000-node cluster
• Jan 2008: Hadoop became a top-level Apache project
• Jul 2008: Hadoop was tested successfully on a 4,000-node cluster
• 2009: Hadoop sorted a petabyte of data in less than 17 hours, to
handle billions of searches and index millions of web pages
• Dec 2011: Hadoop 1.0 released
• Aug 2013: Version 2.0.6 available
• Nov 2014: Release 2.6.0 available
• Dec 2015: Release 2.6.3 available
• Oct 2016: Release 2.6.5 available
Old Version Of Hadoop
• Limited scalability
• Availability issues
• Poor resource utilization
• Limited ability to run non-MapReduce applications
Disadvantages of Old Versions of Hadoop
• 25 January 2017: Release 3.0.0-alpha2 available
• This is the second alpha in a series of planned
alphas and betas leading up to a 3.0.0 GA
release. The intention is to "release early,
release often" to quickly iterate on feedback
collected from downstream users.
Latest Version Of Hadoop
• Hadoop was introduced to overcome the limitations of
RDBMS.
• Hadoop is an open-source, Java-based programming
framework that supports the processing and storage of
extremely large data sets in a distributed computing
environment.
HADOOP
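As a rough illustration of what "storage in a distributed computing environment" means, the sketch below splits a file into fixed-size blocks and assigns each block to several nodes. The 128 MB block size and replication factor of 3 match the HDFS defaults in Hadoop 2.x and later. The function name `place_blocks`, the node names, and the round-robin placement are assumptions made for this example; real HDFS uses a rack-aware placement policy.

```python
import math

BLOCK_SIZE_MB = 128  # HDFS default block size in Hadoop 2.x and later
REPLICATION = 3      # HDFS default replication factor


def place_blocks(file_size_mb, nodes,
                 block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Split a file into blocks and assign each block to `replication` nodes.

    Placement is plain round-robin (an assumption for illustration only,
    not the rack-aware policy that HDFS actually applies).
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    n_blocks = math.ceil(file_size_mb / block_size_mb)
    placement = {}
    for block in range(n_blocks):
        # Each replica of a block goes to a different node.
        placement[block] = [nodes[(block + r) % len(nodes)]
                            for r in range(replication)]
    return placement


if __name__ == "__main__":
    # Hypothetical 300 MB file stored on a four-node cluster.
    for block, replicas in place_blocks(300, ["n1", "n2", "n3", "n4"]).items():
        print(f"block {block}: {replicas}")
```

Because every block lives on three different nodes in this sketch, losing any single node still leaves two copies of each block, which is the idea behind the "resilient to failure" property.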
• Many technologies already exist for handling big data, and
each has its own advantages and disadvantages. A few of them
are listed below:
• Column-oriented databases
• NoSQL databases
• MapReduce
• Hive
• Pig
• WibiData
• PLATFORA
• Apache Zeppelin
• Hadoop
Working Of Hadoop In Big Data Analytics
Architecture Of Hadoop
There are two main components of Hadoop:
• MapReduce
• HDFS (Hadoop Distributed File System)
Components Of Hadoop
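The MapReduce component can be illustrated with the classic word-count example. The sketch below runs in a single Python process and only mimics the three stages (map, shuffle, reduce); it does not use the Hadoop API, and the function names and sample input are assumptions chosen for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values)
                for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data needs hadoop", "hadoop stores big data"]
    print(word_count(sample))
```

On a real cluster, the map and reduce functions would run in parallel on the nodes that hold the input blocks, and the framework would perform the shuffle across the network.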
• A NoSQL (originally referring to "non-SQL" or "non-relational")
database provides a mechanism for storage and
retrieval of data that is modeled in means other than the
tabular relations used in relational databases (RDBMS).
• NoSQL databases are often used as the backend database with Hadoop.
NoSQL
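To make "modeled in means other than tabular relations" concrete, here is a minimal sketch of a document-style store in which each record is a free-form dictionary rather than a row with fixed columns. The class `DocumentStore`, its methods, and the sample records are hypothetical and written only for this example; they do not correspond to any particular NoSQL product.

```python
import copy


class DocumentStore:
    """Toy in-memory document store: records need not share a schema."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, doc):
        """Store a copy of the document under the given id."""
        self._docs[doc_id] = copy.deepcopy(doc)

    def get(self, doc_id):
        """Return the document stored under doc_id, or None if absent."""
        return self._docs.get(doc_id)

    def find(self, field, value):
        """Return ids of documents whose `field` equals `value`."""
        return [doc_id for doc_id, doc in self._docs.items()
                if doc.get(field) == value]


if __name__ == "__main__":
    store = DocumentStore()
    # The two documents carry different fields, which a fixed table
    # schema would not allow without schema changes or NULL columns.
    store.put("p1", {"name": "A", "city": "Pune", "tags": ["cardiology"]})
    store.put("p2", {"name": "B", "scan": {"type": "MRI", "size_mb": 42}})
    print(store.find("city", "Pune"))
```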
Health Care Applications
Internet of Things (IoT)
Social Media
Applications of Hadoop
Scalable
Cost effective
Flexible
Fast
Resilient to failure
Advantages of Hadoop
Security Concerns
Not Fit for Small Data
Vulnerable By Nature
Disadvantages of Hadoop
• Hadoop, an open-source software framework, is a popular
tool for handling big data and is widely used for big
data analytics.
Conclusion
big-data-analytics-using-hadoop.pptx for project
