www.iactglobal.in
What is Module 1 about?
After completing this Module, you should be able to:
Understand what Big Data is and its characteristics
Understand in detail the need for a Big Data solution
Understand where Big Data is appropriate
List the IBM products that make up IBM’s Big Data strategy
Describe the type of data appropriate for:
- InfoSphere BigInsights
- InfoSphere Streams
List the open source programs that are a part of InfoSphere BigInsights

System of Units / Binary System of Units
Name       Symbol  International System of Units (SI)  Binary usage (deprecated)
kilobyte   KB      10^3                                2^10
megabyte   MB      10^6                                2^20
gigabyte   GB      10^9                                2^30
terabyte   TB      10^12                               2^40
petabyte   PB      10^15                               2^50
exabyte    EB      10^18                               2^60
zettabyte  ZB      10^21                               2^70
yottabyte  YB      10^24                               2^80

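The gap between the decimal (SI) and binary readings of the same prefix grows with each step up the table. The short Python sketch below is not part of the original slides; it only illustrates that arithmetic. The function names (`si_bytes`, `binary_bytes`, `overshoot_percent`) are hypothetical helpers introduced for this example.

```python
# Illustrative sketch: compare the SI (decimal) and binary readings of each
# prefix in the table. All names here are assumptions made for this example.

# Prefixes in the same order as the table, smallest first.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def si_bytes(index: int) -> int:
    """Bytes under the SI reading: 10^3, 10^6, ... for index 0, 1, ..."""
    return 10 ** (3 * (index + 1))


def binary_bytes(index: int) -> int:
    """Bytes under the binary reading: 2^10, 2^20, ... for index 0, 1, ..."""
    return 2 ** (10 * (index + 1))


def overshoot_percent(index: int) -> float:
    """How much larger the binary reading is than the SI reading, in percent."""
    return (binary_bytes(index) / si_bytes(index) - 1) * 100


if __name__ == "__main__":
    for i, name in enumerate(PREFIXES):
        print(f"{name}byte: SI = 10^{3 * (i + 1)}, "
              f"binary = 2^{10 * (i + 1)}, "
              f"binary is {overshoot_percent(i):.1f}% larger")
```

At the kilobyte level the two readings differ by a few percent, while at the yottabyte level the difference is roughly a fifth, which is one reason the binary usage of these names is marked as deprecated in the table.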
BigData @ Scale
2.5 petabytes: Memory capacity of the human brain
13 petabytes: Amount that could be downloaded from the internet in two minutes if every American (300M) got on a computer at the same time
4.75 exabytes: Total genome sequences of all people on Earth
422 exabytes: Total digital data created in 2008
1 zettabyte: World’s current digital storage capacity
1.8 zettabytes: Total digital data expected to be created in 2011

Explosion in data and real-world events
Source: IBM internal: http://www.slideshare.net/jowen_evansdata/keynote-randy-newell-of-ibm

Examples of BigData
Commercial
- Web events / database logs
- Sensor networks
- RFID
- Internet text and documents
- Internet search indexing
- CDR (Call Detail Records)
- Medical records, etc.
Government
- Regular government business and commerce needs
- Military and homeland security surveillance
Science
- Astronomy
- Atmosphere
- Biological
- Genomics
Social
- Social networks
- Social data

BigData @ Organizations
Source: http://www.slideshare.net/albertspijkers/2011-07-27baoclientpresentation

Perception gap surrounding social media
Source: IBM internal

Big Data Characteristics
Source: http://www.linguamatics.com/blog/big-data-real-world-data-where-does-text-analytics-fit

Challenge @ BigData to find new insights:
Source: IBM internal: http://www.slideshare.net/cmeniche/1524-how-ibms-big-data-solution-can-help-you-gain-insight-into-your-data-center-v2

Is there really a need for Big Data?
Source: http://www.slideshare.net/cmeniche/1524-how-ibms-big-data-solution-can-help-you-gain-insight-into-your-data-center-v2

Case Study and Implementation @ Vestas
Vestas Wind Systems has 43,000 wind turbines in 65 countries across 5 continents.
Customer Pain Point:
- Finding the optimal place to install a wind turbine
- Must consider a large number of location-dependent factors such as temperature, precipitation, wind velocity and humidity
- The existing legacy process cannot analyze all of the available data
- Analysis of the data must be completed in hours
Solution Required:
- Leverage all available data, drastically reduce modeling time, and support future expansion of modeling techniques
- Improve the accuracy of decisions on wind turbine placement

Case Study and Implementation @ Vestas
Implementation using InfoSphere BigInsights:
- Vestas has created a “wind and site competence center”
- Engineers will model data and forecast optimal turbine locations
- Initially uses publicly available weather data from national weather data services as well as Vestas’ own recorded weather data
- Data sources considered: global deforestation metrics, satellite images, historical metrics, geospatial data
- InfoSphere BigInsights will be used as the core infrastructure to hold the generated weather data

Big Data presents big opportunities?
Source: IBM internal: http://www.slideshare.net/cmeniche/1524-how-ibms-big-data-solution-can-help-you-gain-insight-into-your-data-center-v2

Traditional vs. BigData approaches:
Source: http://image.slidesharecdn.com/1524howibmsbigdatasolutioncanhelpyougaininsightintoyourdatacenterv2-130306205122-php

Merging the Traditional and Big Data Approaches
Source: IBM internal: http://www.rosebt.com/uploads/8/1/8/1/8181762/3861342_orig.jpg?1

Enterprise information architecture:
- Big Data will be a permanent part of your information architecture
- It cannot be a silo; it must be fully integrated in order to leverage its value
- It must be easy to deploy and integrate
Source: IBM internal: http://www.slideshare.net/albertspijkers/2011-07-27baoclientpresentation

IBM Big Data platform strategy:
- Integrate and manage the full variety, velocity and volume of Big Data
- Apply advanced analytics to information in its native form
- Visualize all available data for ad hoc analysis
- Provide a development environment for building new analytic applications
- Support workload optimization and scheduling
- Provide for security and governance
- Integrate with enterprise software

IBM Big Data platform strategy:
Source: http://www.slideshare.net/cmeniche/1524-how-ibms-big-data-solution-can-help-you-gain-insight-into-your-data-center-v2

Enterprise-class BigData Product @ IBM:
Failure Tolerance:
- High-availability architecture that tolerates hardware or application failure
Scale Economically:
- Runs on scalable hardware, with the ability to add nodes dynamically
Security & Privacy:
- Security protection for granular data access control
Source: IBM internal

Different BigInsights editions for varying needs
Source: IBM internal: http://www.bloter.net/wp-content/uploads/2013/04/ibm_biginsights_2_1.jpg

Characteristics that distinguish BigInsights include its built-in support for analytics, its integration with other enterprise software, and its production readiness.
InfoSphere BigInsights is available in two editions:
- Basic Edition
- Enterprise Edition

InfoSphere Streams:
Source: IBM internal: https://bruceweed.wordpress.com/tag/ibm-infosphere-streams/

To Summarize
• An enterprise-ready Big Data platform
• Innovative, customer-tested products: InfoSphere BigInsights and InfoSphere Streams
• Platform and products enabled for integration with the overall enterprise infrastructure
• Although BigInsights contains open source code, it is licensed like other IBM software offerings

Having completed this Module, you should be able to:
Understand the need for a Big Data solution
List the IBM products that make up IBM’s Big Data strategy
Describe the type of data appropriate for:
- InfoSphere BigInsights
- InfoSphere Streams
List the open source programs that are a part of InfoSphere BigInsights

Introduction to Big Data & Hadoop

  • 2. www.iactglobal.in What is Module 1 about? After completing this module, you should be able to: understand what Big Data is and its characteristics; understand in detail the need for a Big Data solution; understand where Big Data is appropriate; list the IBM products that make up IBM’s Big Data strategy; describe the type of data appropriate for InfoSphere BigInsights and InfoSphere Streams; list the open source programs that are a part of InfoSphere BigInsights.
  • 3. System of Units / Binary System of Units
      Name        Symbol   International System of Units (SI)   Binary usage (deprecated)
      kilobyte    KB       10^3                                 2^10
      megabyte    MB       10^6                                 2^20
      gigabyte    GB       10^9                                 2^30
      terabyte    TB       10^12                                2^40
      petabyte    PB       10^15                                2^50
      exabyte     EB       10^18                                2^60
      zettabyte   ZB       10^21                                2^70
      yottabyte   YB       10^24                                2^80
  • 4. www.iactglobal.in 2.5 petabytes Memory capacity of the human brain 13 petabytes Amount that could be downloaded from the internet in two minutes, if every American (300M) got on a computer at the same time 4.75 exabytes Total genome sequences of all people on the earth 422 exabytes Total digital data created in 2008  1 Zetabyte World’s current digital storage capacity 1.8 Zettabytes Total digital data expected to be created in 2011 4 BigData @ Scale
  • 5. Explosion in data and real world events. Source: IBM internal: http://www.slideshare.net/jowen_evansdata/keynote-randy-newell-of-ibm
  • 6. www.iactglobal.in Commercial  Web Events / Data Base Logs  Sensor Networks  RFID  Internet Text and Documents  Internet Search Indexing  CDR (Call Detail Records)  Medical Records ….. Etc Government  Regular Government Business & Commerce Needs  Military & Homeland Security Surveillance 6 Examples Of BigData
  • 7. www.iactglobal.in Science  Astronomy  Atmosphere  Biological  Genomics Social  Social Networks  Social Data 7 Examples Of BigData
  • 8. BigData @ Organizations. Source: http://www.slideshare.net/albertspijkers/2011-07-27baoclientpresentation
  • 9. Perception gap surrounding social media. Source: IBM internal
  • 10. Big Data Characteristics. Source: http://www.linguamatics.com/blog/big-data-real-world-data-where-does-text-analytics-fit
  • 11. Challenge @ BigData to find new insights. Source: IBM Internal: http://www.slideshare.net/cmeniche/1524-how-ibms-big-data-solution-can-help-you-gain-insight-into-your-data-center-v2
  • 12. Is there really a need for Big Data? Source: http://www.slideshare.net/cmeniche/1524-how-ibms-big-data-solution-can-help-you-gain-insight-into-your-data-center-v2
  • 13. Case Study and Implementation @ Vestas. Vestas Wind Systems has 43,000 wind turbines in 65 countries over 5 continents. Customer pain point: finding the optimal place to install a wind turbine; a large number of location-dependent factors must be considered, such as temperature, precipitation, wind velocity and humidity; the existing legacy process does not support analyzing all of the data; analysis of the data must be completed in hours. Solution required: leverage all available data, drastically reduce modeling time, and support future expansion in modeling techniques; improve the accuracy of decisions for wind turbine placement.
  • 14. Case Study and Implementation @ Vestas. Implementation using InfoSphere BigInsights: Vestas has created a “wind and site competence center”; engineers will model data and forecast optimal turbine locations; initially the center will use publicly available weather data from national weather data services as well as its own recorded weather data; data sources considered: global deforestation metrics, satellite images, historical metrics, geospatial data; InfoSphere BigInsights will be used as the core infrastructure to hold generated weather data.
  • 15. Big Data presents big opportunities? Source: IBM Internal: http://www.slideshare.net/cmeniche/1524-how-ibms-big-data-solution-can-help-you-gain-insight-into-your-data-center-v2
  • 16. Traditional vs BigData approaches. Source: http://image.slidesharecdn.com/1524howibmsbigdatasolutioncanhelpyougaininsightintoyourdatacenterv2-130306205122-php
  • 17. Merging the Traditional and Big Data Approaches. Source: IBM Internal: http://www.rosebt.com/uploads/8/1/8/1/8181762/3861342_orig.jpg?1
  • 18. Enterprise information architecture: Big Data will be a permanent part of your information architecture. It cannot be a silo; it must be fully integrated in order to leverage its value. It must be easy to deploy and integrate. Source: IBM Internal: http://www.slideshare.net/albertspijkers/2011-07-27baoclientpresentation
  • 19. IBM Big Data platform strategy: integrate and manage the full variety, velocity and volume of Big Data; apply advanced analytics to information in its native form; visualize all available data for ad hoc analysis; provide a development environment for building new analytic applications; support workload optimization and scheduling; provide for security and governance; integrate with enterprise software.
  • 20. IBM Big Data platform strategy. Source: http://www.slideshare.net/cmeniche/1524-how-ibms-big-data-solution-can-help-you-gain-insight-into-your-data-center-v2
  • 21. Enterprise class BigData Product @ IBM. Failure tolerance: high availability architecture to support hardware or application failure. Scale economically: runs on scalable hardware with the ability to dynamically add additional nodes. Security and privacy: security protection for granular data access control. Source: IBM internal
  • 22. Different BigInsights editions for varying needs. Source: IBM Internal: http://www.bloter.net/wp-content/uploads/2013/04/ibm_biginsights_2_1.jpg
  • 23. Different BigInsights editions for varying needs. Characteristics that distinguish BigInsights include its built-in support for analytics, its integration with other enterprise software, and its production readiness. InfoSphere BigInsights comes in two releases: Basic Edition and Enterprise Edition.
  • 24. InfoSphere Streams. Source: IBM Internal: https://bruceweed.wordpress.com/tag/ibm-infosphere-streams/
  • 25. To Summarize: an enterprise-ready Big Data platform; innovative, customer-tested products (InfoSphere BigInsights and InfoSphere Streams); platform and products enabled for integration with the overall enterprise infrastructure; even though BigInsights contains open source code, licensing is like that of other IBM software offerings.
  • 26. To Summarize. Having completed this module, you should be able to: understand the need for a Big Data solution; list the IBM products that make up IBM’s Big Data strategy; describe the type of data appropriate for InfoSphere BigInsights and InfoSphere Streams; list the open source programs that are a part of InfoSphere BigInsights.
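
The two columns on slide 3 (SI powers of ten versus binary powers of two) drift further apart at each prefix, which matters when reading the petabyte and zettabyte figures on slide 4. The following is a minimal Python sketch of that arithmetic, not part of the original deck; the function names (`si_bytes`, `binary_bytes`, `divergence`, `per_person_bytes`) are hypothetical helpers introduced only for illustration, and the slide-4 figures are read here as SI (decimal) units, which is an assumption.

```python
# Prefixes in the order they appear on slide 3.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def si_bytes(prefix):
    """Bytes in one <prefix>byte under the SI (decimal) reading: 10^3, 10^6, ..."""
    return 10 ** (3 * (PREFIXES.index(prefix) + 1))


def binary_bytes(prefix):
    """Bytes in one <prefix>byte under the deprecated binary reading: 2^10, 2^20, ..."""
    return 2 ** (10 * (PREFIXES.index(prefix) + 1))


def divergence(prefix):
    """Fraction by which the binary value exceeds the SI value."""
    return binary_bytes(prefix) / si_bytes(prefix) - 1


def per_person_bytes(total_petabytes, people):
    """Even share of a total given in SI petabytes (assumption: decimal units)."""
    return total_petabytes * si_bytes("peta") / people


if __name__ == "__main__":
    for name in PREFIXES:
        print(f"{name}byte: binary exceeds SI by {divergence(name):.1%}")

    # Slide 4: 13 petabytes shared among 300 million people.
    share = per_person_bytes(13, 300_000_000)
    print(f"per-person share: {share / si_bytes('mega'):.1f} MB")
```

The gap is about 2.4% at the kilobyte level and grows to roughly 21% at the yottabyte level, so it is worth stating which convention a figure uses. Under the decimal reading, the 13-petabyte example on slide 4 works out to roughly 43 MB per person.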