Introduction to Big Data
Name: R. Thilakavathi
Class: II M.Sc. Computer Science
Batch: 2017-2019
Staff in charge: Ms. M. Florence Dayana
What’s Big Data?
No single definition; here is one from Wikipedia:
• Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.
• The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization.
• The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found to "spot business trends, determine quality of research, prevent diseases, link legal citations, combat crime, and determine real-time roadway traffic conditions."
Big Data: 3V’s
(Figure: data sources and volumes)
• 12+ TB of tweet data every day
• 25+ TB of log data every day
• ? TB of data every day
• 2+ billion people on the Web by the end of 2011
• 30 billion RFID tags today (1.3 billion in 2005)
• 4.6 billion camera phones worldwide
• 100s of millions of GPS-enabled devices sold annually
• 76 million smart meters in 2009, 200 million by 2014
The Earthscope
• The Earthscope is the world's largest science project. Designed to track North America's geological evolution, this observatory records data over 3.8 million square miles, amassing 67 terabytes of data. It analyzes seismic slips in the San Andreas fault, sure, but also the plume of magma underneath Yellowstone and much, much more.
(http://www.msnbc.msn.com/id/44363598/ns/technology_and_science-future_of_technology/#.TmetOdQ--uI)
Variety (Complexity)
• Relational Data (Tables/Transaction/Legacy Data)
• Text Data (Web)
• Semi-structured Data (XML)
• Graph Data
– Social Network, Semantic Web (RDF), …
• Streaming Data
– You can only scan the data once
• A single application can be generating/collecting many types of data
• Big Public Data (online, weather, finance, etc.)
To extract knowledge, all these types of data need to be linked together
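The "you can only scan the data once" constraint on streaming data can be made concrete with a one-pass algorithm. The sketch below uses reservoir sampling, which is one common technique chosen here for illustration and is not named on the slide; the function name `reservoir_sample` and its parameters are hypothetical.

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream that is read once."""
    rng = rng or random.Random()
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # randint is inclusive on both ends, so j is uniform over 0..i.
            j = rng.randint(0, i)
            if j < k:
                # The new item replaces an old one with probability k / (i + 1).
                sample[j] = item
    return sample

# Usage: the iterator is consumed exactly once and never stored in full.
picked = reservoir_sample(iter(range(1000)), 5, random.Random(0))
```

Memory use stays at k items no matter how long the stream runs, which is the property a single-scan setting needs.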
A Single View to the Customer
(Figure: the Customer at the center, linked to Social Media, Gaming, Entertain, Banking, Finance, Our Known History, and Purchase)
Velocity (Speed)
• Data is being generated fast and needs to be processed fast
• Online Data Analytics
• Late decisions → missing opportunities
• Examples
– E-Promotions: based on your current location, your purchase history, and what you like → send promotions right now for the store next to you
– Healthcare monitoring: sensors monitor your activities and body → any abnormal measurements require immediate reaction
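As a rough illustration of the healthcare monitoring example, the sketch below flags a reading as abnormal when it falls far from the mean of the most recent readings. The rolling-window rule, the function name `detect_abnormal`, and the default `window` and `threshold` values are all assumptions made for this example, not something the slides specify.

```python
from collections import deque
from statistics import mean, pstdev

def detect_abnormal(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean."""
    recent = deque(maxlen=window)  # only the last `window` readings are kept
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # Flag the reading as soon as it arrives, before storing it.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
        recent.append(value)

# Usage with made-up heart-rate readings (beats per minute).
heart_rate = [72, 74, 73, 75, 74, 73, 140, 74]
alerts = list(detect_abnormal(heart_rate))
```

Each reading is checked on arrival using constant memory, so the reaction does not have to wait for a batch job. A production monitor would also need to keep flagged values from skewing later windows.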
Real-time/Fast Data
Social media and networks (all of us are generating data)
Scientific instruments (collecting all sorts of data)
Mobile devices (tracking all objects all the time)
Sensor technology and networks (measuring all kinds of data)
• Progress and innovation are no longer hindered by the ability to collect data
• But by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion
Real-Time Analytics/Decision Requirement
(Figure: "Customer: Influence Behavior" at the center, surrounded by the items below)
• Product recommendations that are relevant and compelling
• Friend invitations to join a game or activity that expands business
• Preventing fraud as it is occurring and preventing more proactively
• Learning why customers switch to competitors and their offers, in time to counter
• Improving the marketing effectiveness of a promotion while it is still in play
Some Make It 4V’s
Harnessing Big Data
• OLTP: Online Transaction Processing (DBMSs)
• OLAP: Online Analytical Processing (Data Warehousing)
• RTAP: Real-Time Analytics Processing (Big Data Architecture & Technology)
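The difference between OLTP and OLAP workloads can be sketched with Python's built-in `sqlite3` module. SQLite is used only because it needs no setup; the `orders` table and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# OLTP-style work: short transactions that insert or update individual rows.
with conn:  # commits on success, rolls back on error
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("alice", 20.0))
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("bob", 35.5))
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("alice", 14.5))

# OLAP-style work: a read-only query that scans and aggregates many rows.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
```

RTAP, as the slide frames it, asks for the analytical kind of answer with the latency of the transactional kind, which a single-node toy like this does not attempt to show.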
The Model Has Changed…
• The model of generating/consuming data has changed
Old Model: A few companies are generating data; all others are consuming data
New Model: All of us are generating data, and all of us are consuming data
What’s driving Big Data
- Ad-hoc querying and reporting
- Data mining techniques
- Structured data, typical sources
- Small to mid-size datasets
- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- More of a real-time
THE EVOLUTION OF BUSINESS INTELLIGENCE
(Figure: timeline across the 1990’s, 2000’s, and 2010’s, with Speed and Scale as the two axes)
• BI Reporting, OLAP & Data Warehouse: Business Objects, SAS, Informatica, Cognos, other SQL reporting tools
• Interactive Business Intelligence & In-memory RDBMS: QlikView, Tableau, HANA
• Big Data: Batch Processing & Distributed Data Store: Hadoop/Spark; HBase/Cassandra
• Big Data: Real Time & Single View: Graph Databases
Big Data Analytics
• Big data is more real-time in nature than traditional DW applications
• Traditional DW architectures (e.g. Exadata, Teradata) are not well-suited for big data apps
• Shared-nothing, massively parallel processing, scale-out architectures are well-suited for big data apps
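The shared-nothing idea can be sketched on a single machine: records are hash-partitioned so that each key belongs to exactly one "node", every node aggregates its own shard without touching the others, and the partial results are merged at the end. The threads below only stand in for separate machines, and all function names are hypothetical.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def partition(records, num_nodes):
    """Hash-partition (key, value) records so each key lives on exactly one node."""
    shards = [[] for _ in range(num_nodes)]
    for key, value in records:
        # crc32 gives a hash that is stable across runs, unlike built-in hash().
        node = zlib.crc32(key.encode("utf-8")) % num_nodes
        shards[node].append((key, value))
    return shards

def local_sum(shard):
    """Work one node performs on its own shard, with no shared state."""
    totals = Counter()
    for key, value in shard:
        totals[key] += value
    return totals

def parallel_sum(records, num_nodes=4):
    shards = partition(records, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(local_sum, shards))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # keys are disjoint across shards
    return dict(merged)

# Usage with made-up records.
result = parallel_sum([("a", 1), ("b", 2), ("a", 3), ("c", 4)], num_nodes=3)
```

Scaling out in this model means raising `num_nodes`: no node needs to see another node's data, which is why such architectures suit large datasets.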
Big Data Technology
Benefits
• Cost & management
– Economies of scale, “out-sourced” resource management
• Reduced time to deployment
– Ease of assembly, works “out of the box”
• Scaling
– On-demand provisioning, co-locate data and compute
• Reliability
– Massive, redundant, shared resources
• Sustainability
– Hardware not owned
Infrastructure as a Service (IaaS)
More Refined Categorization
• Storage-as-a-service
• Database-as-a-service
• Information-as-a-service
• Process-as-a-service
• Application-as-a-service
• Platform-as-a-service
• Integration-as-a-service
• Security-as-a-service
• Management/Governance-as-a-service
• Testing-as-a-service
• Infrastructure-as-a-service
(InfoWorld Cloud Computing Deep Dive)
Enabling Technology: Virtualization
(Figure: traditional stack compared with virtualized stack)
• Traditional: Hardware → Operating System → Apps
• Virtualized: Hardware → Hypervisor → multiple OSes → Apps
What does the Azure platform offer to developers?
Google App Engine vs. Amazon EC2/S3
App Engine:
• Higher-level functionality (e.g., automatic scaling)
• More restrictive (e.g., respond to URL only)
• Proprietary lock-in
EC2/S3:
• Lower-level functionality
• More flexible
• Coarser billing model
(Figure labels: VMs, Flat File Storage, Python, BigTable, Other APIs)
Cloud Resources
• Hadoop on your local machine
• Hadoop in a virtual machine on your local machine (Pseudo-Distributed on Ubuntu)
• Hadoop in the cloud with Amazon EC2
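Before setting up any of these environments, it can help to see the MapReduce programming model that Hadoop popularized. The sketch below is plain Python and does not use any Hadoop API; it only mimics the map, shuffle, and reduce phases for a word count, and every function name in it is hypothetical.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the grouped values for one key."""
    return key, sum(values)

def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

# Usage with two made-up input lines.
counts = word_count(["big data", "Big Data is big"])
```

On a real cluster the map and reduce functions run on many machines and the framework performs the shuffle over the network; here everything runs in one process so the data flow is easy to follow.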
Course Prerequisite
• Prerequisite:
– Java Programming / C++
– Data Structures and Algorithms
– Computer Architecture
– Basic Statistics and Probability
– Database and Data Mining (preferred)
THANK YOU
