Big Data
Principles of Database Design
Textbook Reference:
Oracle The Big Data Handbook
Today we will discuss:
• What is Data?
• Why Big Data?
• How is it Different?
• Characteristics of Big Data
• Applications of Big Data
• Benefits of Big Data
• Future of Big Data
To be continued …
What is Data?
• Data can be any character, text, words,
number, pictures, sound, or video and, if not
put into context, means little or nothing to a
human.
• Information is useful and usually formatted
in a manner that allows it to be understood
by a human.
Big Data vs. Small Data
Big Data
• The large picture
• Encompasses many
different types of data
• Unstructured data
• Unfocused
• Difficult to interpret
Small Data
• The small picture.
• Mostly Homogeneous
• Structured
• Focused
• Easily Interpreted
Why the hype around Big Data?
• An aim to solve new problems or old
problems in a better way
• Big Data generates value from the storage
and processing of very large quantities of
digital information
How is big data different?
• Automatically generated by a machine (e.g.
Sensor embedded in an engine)
• Typically an entirely new source of data (e.g.
Use of the internet)
• Not designed to be friendly (e.g. Text
streams)
• May not have much value
• Need to focus on the important part
Some real examples
How big is big data?
• Analysts predict that by 2020, there will be 5,200
gigabytes of data on every person in the world.
• On average, people send about 500 million tweets per
day.
• The average U.S. customer uses 1.8 gigabytes of data
per month on his or her cell phone plan.
• Walmart processes one million customer transactions
per hour.
• Amazon sells 600 items per second.
• On average, each person who uses email receives 88
emails per day and sends 34. That adds up to more than
200 billion emails each day.
• MasterCard processes 74 billion transactions per year.
• Commercial airlines make about 5,800 flights per day.
Big data is not so much about how big the
data is; it is about the value within
the data
Characteristics of Big Data
Volume: Data Quantity
Variety: Data Types
Velocity: Data Speed
Volume
Refers to the vast amount of data that is generated
every second
Volume
• Today, Facebook ingests 500 terabytes of new
data every day.
• A Boeing 737 will generate 240 terabytes of
flight data during a single flight across the US.
• Smartphones and the data they create and
consume, together with sensors embedded into
everyday objects, will soon result in billions of new,
constantly updated data feeds containing
environmental, location, and other
information, including video.
Velocity
Refers to the speed at which new data is
generated
Velocity
• Clickstreams and ad impressions capture user
behavior at millions of events per second
• High-frequency stock trading algorithms
reflect market changes within microseconds
• Machine to machine processes exchange data
between billions of devices
• Infrastructure and sensors generate massive
log data in realtime
• On-line gaming systems support millions of
concurrent users, each producing multiple
inputs per second.
Variety
Different Types of Data
Variety
• Big Data analysis includes different types of
data
• Geospatial data, 3D data, audio and video,
and unstructured text, including log files and
social media.
• Traditional database systems were designed
to address smaller volumes of structured
data, fewer updates or a predictable,
consistent data structure.
Some common Types
• Activity Data
• Conversation Data
• Photo and Video Image
• Sensor Data
• IoT Data
• Scientific Data
• Geo-spatial Data
• Biological Data
Veracity – The 4th V
Refers to the messiness or trustworthiness of
the data
Big Data Sources
Sources
Users
Systems
Sensors
Application
Storing Big Data
• Data models: key value, graph, document,
column-family
• Hadoop Distributed File System
• HBase
• Hive
Overview of Big Data stores
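The four data models named above can be compared by writing one record in each shape. The following is a minimal sketch using plain Python structures rather than a real data store; the user record, field names, and the `followed_by` helper are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical record: every name and value below is illustrative only.
user_id = "u42"

# Key-value: an opaque value retrieved by a single key.
key_value = {f"user:{user_id}": json.dumps({"name": "Ada", "city": "Pune"})}

# Document: a nested, self-describing record.
document = {
    "_id": user_id,
    "name": "Ada",
    "address": {"city": "Pune"},
    "follows": ["u7"],
}

# Column-family: row key -> column family -> column -> value.
column_family = {
    user_id: {
        "profile": {"name": "Ada", "city": "Pune"},
        "social": {"follows:u7": "1"},
    }
}

# Graph: nodes plus labelled edges between them.
graph = {
    "nodes": {"u42": {"name": "Ada"}, "u7": {"name": "Lin"}},
    "edges": [("u42", "FOLLOWS", "u7")],
}


def followed_by(graph, node):
    """Return the nodes that `node` points to through a FOLLOWS edge."""
    return [dst for src, label, dst in graph["edges"]
            if src == node and label == "FOLLOWS"]
```

The same question ("whom does u42 follow?") is a key lookup plus parsing in the key-value shape, a field read in the document shape, a column scan in the column-family shape, and an edge traversal in the graph shape, which is one way to think about which model suits which workload.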
Storing Big Data
• Selecting data sources for analysis
• Eliminating redundant data
• Establishing the role of NoSQL
Analyzing your data characteristics
Data Analytics
• Examining large amounts of data
• Identification of hidden patterns, unknown
correlations
• Better business decisions: strategic and
operational
• Effective marketing, customer satisfaction,
increased revenue
TYPES OF TOOLS IN BIG DATA
Where is processing hosted?
• Distributed Servers / Cloud (e.g. Amazon EC2)
Where is data stored?
• Distributed Storage (e.g. Amazon S3)
What is the programming model?
• Distributed Processing (e.g. MapReduce)
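To make the MapReduce programming model concrete, here is a single-process sketch of its three steps (map, shuffle, reduce) applied to word counting. It is not Hadoop code and runs on one machine; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are assumptions made for this illustration. In a real cluster the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values)
                for key, values in shuffle(pairs).items())
```

For example, `word_count(["big data", "Big value"])` groups the two occurrences of "big" under one key before summing them.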
How is data stored & indexed?
• High-performance schema-free databases (e.g. MongoDB)
What operations are performed on data?
• Analytic / Semantic Processing
Risks of Big Data
• Organizations can be overwhelmed by the data
• Need the right people solving the right
problems
• Costs can escalate too fast
• It isn’t necessary to capture 100% of the data
• Data privacy
• Self-regulation
• Legal regulation
Applications of Big Data
Search Quality
Better understand and
target customers
Understand and
Optimize Business
Improving Health
Improving Security
and Law Enforcement
Trading Analytics
Multichannel Sales
Basically…. Endless….
Benefits of Big Data
• Ability to make better decisions and take
meaningful actions at the right time.
• Technologies like Hadoop give you the scale
and flexibility to store data before you know
how you are going to process it.
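Storing data before deciding how to process it is often described as "schema on read". The sketch below illustrates the idea with raw JSON lines held in a Python list instead of a real Hadoop store; the log records, field names, and the `read_with_schema` helper are hypothetical.

```python
import json

# Raw events stored as-is; records need not share the same fields.
raw_log = [
    '{"user": "u1", "event": "click", "ms": 120}',
    '{"user": "u2", "event": "view"}',
    '{"user": "u1", "event": "click", "ms": 80, "page": "/home"}',
]


def read_with_schema(lines, fields):
    """Apply a schema at read time: keep only `fields`, filling defaults."""
    for line in lines:
        record = json.loads(line)
        yield {name: record.get(name, default)
               for name, default in fields.items()}


# A question decided after the data was stored: clicks per user.
clicks_per_user = {}
for row in read_with_schema(raw_log, {"user": None, "event": None}):
    if row["event"] == "click":
        clicks_per_user[row["user"]] = clicks_per_user.get(row["user"], 0) + 1
```

A later analysis could pass a different `fields` mapping, for example `{"ms": 0}`, to the same raw lines without rewriting the stored data.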
Benefits of Big Data
• Organizations are using big data to target
customer-centric outcomes, tap into internal
data and build a better information
ecosystem.
• Technologies such as MapReduce, Hive and
Impala enable you to run queries without
changing the data structures underneath.
Future of Big Data
• $15 billion has been spent on software firms
specializing in data management and analytics.
• This industry on its own is worth more than
$100 billion and growing at almost 10% a
year, which is roughly twice as fast as the
software business as a whole.
• In February 2012, the open-source analyst
firm Wikibon released the first market
forecast for Big Data, listing $5.1B revenue in
2012 with growth to $53.4B in 2017.
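As a rough check on the Wikibon figures, the implied yearly growth can be worked out with the standard compound annual growth rate formula. This is a sketch that assumes steady compounding over the five years from 2012 to 2017; the forecast itself does not state that assumption, and the `cagr` function name is chosen here for illustration.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values over `years` years."""
    return (end / start) ** (1 / years) - 1


# Figures from the forecast above, in billions of US dollars.
wikibon_growth = cagr(5.1, 53.4, 2017 - 2012)
print(f"Implied growth per year: {wikibon_growth:.1%}")
```

Under that assumption the forecast implies growth of roughly 60% a year, far above the "almost 10%" quoted for the wider data management and analytics industry.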
Thank you
You can download the presentation at
slideshare.com/enfarose
