BIG DATA
Presented By,
R.S.M.N.PRASAD.
(pvpsit)
OUTLOOK
• Introduction
• Hadoop
• MapReduce
• Hypertable
• Advantages
BIG DATA
• Data comes from everywhere: sensors used to
gather climate information, posts to social media sites,
digital pictures and videos, purchase transaction records,
and cell phone GPS signals, to name a few. This data
is called Big Data.
• Every day, we create 2.5 quintillion bytes of data (one
quintillion bytes = one billion gigabytes). As much as
90% of the data in the world today has been created in
the last two years alone.
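The unit conversion quoted above can be checked with a few lines of arithmetic. This is only a sketch: it assumes decimal units (1 gigabyte = 10^9 bytes), and the variable names are illustrative.

```python
# Assumption: decimal (SI) units, so 1 GB = 10**9 bytes.
QUINTILLION = 10**18
GIGABYTE = 10**9

# One quintillion bytes expressed in gigabytes: one billion.
gigabytes_per_quintillion = QUINTILLION // GIGABYTE

# 2.5 quintillion bytes per day, kept as an integer to avoid float rounding.
daily_bytes = 25 * 10**17
daily_gigabytes = daily_bytes // GIGABYTE

print(gigabytes_per_quintillion)
print(daily_gigabytes)
```

Under these assumptions the daily figure works out to 2.5 billion gigabytes.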
IN FACT, IN A MINUTE…
• Email users send more than 204 million messages;
• Mobile Web receives 217 new users;
• Google receives over 2 million search queries;
• YouTube users upload 48 hours of new video;
• Facebook users share 684,000 pieces of content;
• Twitter users send more than 100,000 tweets;
• Consumers spend $272,000 on Web shopping;
• Apple receives around 47,000 application downloads;
• Brands receive more than 34,000 Facebook 'likes';
• Tumblr blog owners publish 27,000 new posts;
• Instagram users share 3,600 new photos;
• Flickr users, on the other hand, add 3,125 new photos;
• Foursquare users perform 2,000 check-ins;
• WordPress users publish close to 350 new blog posts.
Big Data Vectors
• High-volume:
Amount of data
• High-velocity:
Speed at which data is collected, acquired, generated, or
processed
• High-variety:
Different data types, such as audio, video, and image data
Big Data = Transactions + Interactions + Observations
What is Hadoop?
• HADOOP
Hadoop (sometimes expanded as “High-Availability Distributed
Object-Oriented Platform”, although the name is not officially an
acronym) is a software framework that analyzes structured and
unstructured data and distributes applications across different
servers.
• Basic Application of Hadoop
Hadoop is used for maintaining, scaling, error handling,
self-healing, and securing large-scale data. This data can
be structured or unstructured. When data grows this large,
traditional systems are unable to handle it.
HADOOP
DIFFERENT COMPONENTS ARE:
Data Access Components: PIG & HIVE
Data Storage Components: HBASE
Data Integration Components: APACHE FLUME, SQOOP, CHUKWA
Data Management Components: AMBARI, ZOOKEEPER
Data Serialization Components: THRIFT & AVRO
Data Intelligence Components: APACHE MAHOUT, DRILL
What does it do?
• Hadoop implements Google’s MapReduce programming model,
using HDFS (the Hadoop Distributed File System) for storage.
• MapReduce divides applications into many small
blocks of work.
• HDFS creates multiple replicas of data blocks for
reliability, placing them on compute nodes
around the cluster.
• MapReduce can then process the data where it
is located.
• Hadoop’s target is to run on clusters on the order
of 10,000 nodes.
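The replication and data-locality points above can be illustrated with a small sketch. It is not HDFS code: the round-robin placement, the function names, and the node names are assumptions made for illustration, and real HDFS uses a rack-aware placement policy. The replication factor of 3 mirrors the usual HDFS default.

```python
def place_replicas(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes (simple round-robin)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + i) % len(nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }


def schedule_local(placement, failed=frozenset()):
    """For each block, pick a live node that already stores a replica."""
    schedule = {}
    for block, holders in placement.items():
        live = [node for node in holders if node not in failed]
        schedule[block] = live[0] if live else None
    return schedule


nodes = ["n1", "n2", "n3", "n4"]          # hypothetical node names
placement = place_replicas(4, nodes)
# Even with n1 down, every block still has a replica on a live node,
# so its task can run where the data already is.
schedule = schedule_local(placement, failed={"n1"})
print(schedule)
```

Because each block lives on several nodes, losing one node leaves both the data and a place to process it available.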
How does MapReduce work?
• The runtime partitions the input and provides it
to different Map instances;
• Map (key, value) → (key’, value’)
• The runtime collects the (key’, value’) pairs and
distributes them to several Reduce functions so
that each Reduce function gets the pairs with the
same key’.
• Each Reduce produces a single output file (or
none).
• Map and Reduce are user-written functions.
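The flow described above can be sketched as a single-process word count. This is a toy model, not Hadoop's API: `run_mapreduce`, `map_words`, and `reduce_counts` are hypothetical names, and a real runtime would spread the map and reduce calls across many machines.

```python
from collections import defaultdict


def map_words(key, value):
    """Map: (line number, line text) -> list of (word, 1) pairs."""
    return [(word.lower(), 1) for word in value.split()]


def reduce_counts(key, values):
    """Reduce: (word, [1, 1, ...]) -> (word, total count)."""
    return key, sum(values)


def run_mapreduce(records, mapper, reducer):
    # Map phase: apply the user-written mapper to every input record.
    intermediate = []
    for key, value in records:
        intermediate.extend(mapper(key, value))

    # Shuffle phase: group values so each reducer call sees one key'.
    groups = defaultdict(list)
    for k, v in intermediate:
        groups[k].append(v)

    # Reduce phase: apply the user-written reducer to each group.
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))


lines = [(0, "big data is big"), (1, "data is everywhere")]
counts = run_mapreduce(lines, map_words, reduce_counts)
print(counts)
```

Only the mapper and reducer are problem-specific; partitioning, grouping by key, and collecting output are the runtime's job.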
HYPERTABLE
What is it?
• Open-source Bigtable clone
• Manages massive sparse tables with timestamped cell
versions
• Single primary key index
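A sparse table with timestamped cell versions can be pictured as a nested map keyed by row. The sketch below is an illustration of that data model only; the `SparseTable` class and its methods are hypothetical and are not Hypertable's client API.

```python
from collections import defaultdict


class SparseTable:
    """Toy model: row key -> column -> list of (timestamp, value), newest first."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, value, timestamp):
        versions = self._rows[row][column]
        versions.append((timestamp, value))
        # Keep the newest version first; only cells that are written exist.
        versions.sort(key=lambda tv: tv[0], reverse=True)

    def get(self, row, column, max_versions=1):
        # Lookups go through the row key, the single primary index.
        columns = self._rows.get(row, {})
        return list(columns.get(column, []))[:max_versions]


table = SparseTable()
table.put("com.example", "title", "Old title", timestamp=1)
table.put("com.example", "title", "New title", timestamp=2)
print(table.get("com.example", "title"))
print(table.get("com.example", "title", max_versions=2))
```

Columns that were never written take no space, which is what makes the table sparse.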
What is it not?
• No joins
• No secondary indexes (not yet)
• No transactions (not yet)
SCALING
TABLE: VISUAL REPRESENTATION
TABLE: ACTUAL REPRESENTATION
SYSTEM OVERVIEW
RANGE SERVER
• Manages ranges of table data
• Caches updates in memory (Cell Cache)
• Periodically spills (compacts) cached updates to disk (CellStore)
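The cache-then-spill behaviour listed above can be sketched roughly as follows. This is an assumption-laden toy: the class name, the size-based spill trigger, and the in-memory lists standing in for on-disk CellStores are all illustrative, not Hypertable internals.

```python
class RangeServerSketch:
    """Toy range server: buffer updates in memory, spill them as sorted runs."""

    def __init__(self, cache_limit=3):
        self.cache_limit = cache_limit  # assumption: spill when the cache is full
        self.cell_cache = {}            # in-memory updates (Cell Cache)
        self.cell_stores = []           # sorted runs standing in for CellStores

    def put(self, key, value):
        self.cell_cache[key] = value
        if len(self.cell_cache) >= self.cache_limit:
            self.spill()

    def spill(self):
        # Write the cached updates out as one immutable sorted run.
        if self.cell_cache:
            self.cell_stores.append(sorted(self.cell_cache.items()))
            self.cell_cache = {}

    def get(self, key):
        # The newest data wins: check memory first, then the newest run.
        if key in self.cell_cache:
            return self.cell_cache[key]
        for store in reversed(self.cell_stores):
            for stored_key, value in store:
                if stored_key == key:
                    return value
        return None


server = RangeServerSketch(cache_limit=2)
server.put("a", 1)
server.put("b", 2)   # cache is full, so both cells are spilled
server.put("a", 3)   # newer value for "a" stays in memory
print(server.get("a"), server.get("b"))
```

Writes stay cheap because they only touch memory; reads may have to consult several spilled runs, which is what the optimizations on the next slide address.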
PERFORMANCE OPTIMIZATIONS
Block Cache
• Caches CellStore blocks
• Blocks are cached uncompressed
Bloom Filter
• Avoids unnecessary disk access
• Filter by rows or rows + columns
• Configurable false positive rate
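A minimal Bloom filter shows how a store can skip disk reads for keys it certainly does not hold. The class below is a generic sketch, not Hypertable's implementation; the SHA-256-based hashing and the default sizes are assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Answers "definitely absent" or "possibly present" for a key."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive several bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means the key was never added; True may be a false positive.
        return all(self.bits[pos] for pos in self._positions(key))


row_filter = BloomFilter()
row_filter.add("row1")
print(row_filter.might_contain("row1"))
```

The false positive rate is tuned through `num_bits` and `num_hashes`: more bits per stored key lowers the chance of an unnecessary disk read at the cost of memory.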
Access Groups
• Physically store co-accessed columns together
• Improves performance by minimizing I/O
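The idea behind access groups can be sketched by splitting each row's columns into separately stored groups. The group names, column names, and helper functions here are hypothetical and serve only to show why a query touching few columns reads less data.

```python
# Hypothetical schema: small metadata columns apart from the large body column.
ACCESS_GROUPS = {"meta": ["title", "lang"], "content": ["body"]}


def store_by_group(rows, access_groups):
    """Split rows so that each access group is stored together."""
    column_to_group = {
        column: group
        for group, columns in access_groups.items()
        for column in columns
    }
    files = {group: {} for group in access_groups}
    for row_key, columns in rows.items():
        for column, value in columns.items():
            group = column_to_group[column]
            files[group].setdefault(row_key, {})[column] = value
    return files


def groups_needed(columns, access_groups):
    """Return the access groups a query on `columns` has to read."""
    wanted = set(columns)
    return {g for g, cols in access_groups.items() if wanted & set(cols)}


rows = {"r1": {"title": "T", "lang": "en", "body": "long text"}}
files = store_by_group(rows, ACCESS_GROUPS)
print(groups_needed(["title", "lang"], ACCESS_GROUPS))
```

A query for `title` and `lang` only reads the `meta` group, so the large `body` values never leave the disk.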
ADVANTAGES
• Flexible: Easy access to structured and unstructured
data.
• Scalable: It can store and distribute very large data sets
across hundreds of inexpensive servers that operate in parallel.
• Efficient: By distributing the data, it can process it in
parallel on the nodes where the data is located.
• Resistant to Failure: It automatically maintains
multiple copies of data and automatically redeploys
computing tasks when failures occur.
QUERIES?