AUGUST 2016
Big Data Analytics for BI/BA/QA
Dmitry Tolpeko
2
BIG DATA
Why was it invented?
How is it used now?
How will it be used in the near future?
What do we need to do to stay competitive?
3
FIRST QUESTIONS
At what size does it start?
Is it just another technology vendor?
4
IN REALITY
It is very easy to start using Hadoop
and Cloud now.
So it is true that most people are now
doing traditional things, just with
larger data sets.
And at much lower cost, of course.
So it looks like the size matters, and this
is just another technology
5
BUT IT IS …
Completely new mindset and
approach to analytics
Solution to satisfy new, “mass
market” analytics
And you cannot skip it
6
YOU CAN FEEL THIS AS …
Developers (Java, .NET, etc.), non-BI
and even non-IT people talk about and
work with analytics today.
That was not the case before.
So what happens?
7
TRADITIONAL ANALYTICS
Expensive
Separate and isolated BI world
Analyzing transactions (data you
cannot afford to lose or calculate
with errors)
Historical data and strategic decisions
8
AND TODAY THIS IS …
Very small % of analytics (1-5%?)
Analytics Boom
9
EVERYTHING IS ABOUT DATA
Mindset: Data Analysis
not OLTP, DWH, ETL
Kimball/Inmon
Any application: UX+Analytics
(e.g. Machine Learning)
Competing on analytics, not just
product and service
Analytics becomes operational,
mass market
10
THE NEXT BIG SHIFT?
 Digital Transformation of Economy
IoT, VR, AR, Machine Learning, AI
Personalized UX
Heavily relies on analytics
11
ANALYTICS TODAY
 Fast, Advanced and Predictive
Analytics
o Personalization and customization: from
summary reports to a lot of tailored
data-driven actions (in near real time)
o Fast prototyping, implementation,
deployment and fast performance
o Data lakes
12
EXAMPLE - YESTERDAY
Company sends a promo email to
1M users, paying $1 per email;
50,000 users purchase goods at
$25
Profit: 50,000 * $25 - $1M =
$250,000
This is what traditional analytics
does.
13
EXAMPLE - TODAY
Today
Company selects just 100,000 users
to receive the promo email; now
30,000 users purchase goods at $25
Profit: 30,000 * $25 - $100K =
$650,000
No new customers, no new
contracts – just algorithms and more
data
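
The arithmetic of the two examples can be checked with a short Python sketch. The function and parameter names are hypothetical, chosen for illustration; the formula is the one used on the slides (purchase revenue minus mailing cost), and the conversion rates are derived from the slide figures.

```python
def campaign_profit(emails_sent, cost_per_email, buyers, revenue_per_buyer):
    """Profit as computed on the slides: purchase revenue minus mailing cost."""
    return buyers * revenue_per_buyer - emails_sent * cost_per_email


# Figures taken from the two example slides.
yesterday = campaign_profit(emails_sent=1_000_000, cost_per_email=1,
                            buyers=50_000, revenue_per_buyer=25)
today = campaign_profit(emails_sent=100_000, cost_per_email=1,
                        buyers=30_000, revenue_per_buyer=25)

print(yesterday)  # 250000
print(today)      # 650000

# Share of recipients who purchased in each campaign.
print(f"{50_000 / 1_000_000:.0%}")  # 5%
print(f"{30_000 / 100_000:.0%}")    # 30%
```

Fewer buyers in total, but a much smaller mailing cost: the gain comes from targeting, not from a larger audience.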
14
USE CASES
o Anomaly Detection
o Recommendation Systems
o Loyalty and Retention Programs
o Optimization
o A/B Testing
o Alarms, Scoring, Diagnosis
o Demand Forecasting and so on.
15
NEW CORE SKILLS
Distributed Data Processing and
Streaming Analytics
Programming (Python, R, Spark)
Math, Statistics
Machine Learning
Deep Learning
16
MACHINE LEARNING
Automation of discovery
Automatically adapt to new
circumstances
Detect patterns
In wide use now. “Self-testing”.
Few lines of code
17
BUILDING BLOCKS
Enriching analysis, development and
quality in software development
o Generic algorithms vs hardcoding
endless IF-ELSE
o Discovering hidden, not obvious
patterns
o Finding anomalies, outliers vs test
cases
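
As an illustration of "finding anomalies, outliers vs test cases", here is a minimal Python sketch. It assumes a modified z-score rule based on the median and the median absolute deviation (MAD), which is one common choice and not a method named on the slides. The function name, the 3.5 threshold and the sample values are assumptions made for the example.

```python
from statistics import median


def find_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds `threshold`.

    The rule is generic: nothing about the expected range is hardcoded,
    the data itself defines what counts as unusual.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # No spread around the median: the rule cannot rank anything.
        return []
    return [v for v in values
            if 0.6745 * abs(v - center) / mad > threshold]


# Hypothetical response times in milliseconds, with one obvious spike.
response_times = [102, 98, 101, 97, 103, 99, 100, 250, 101, 98]
print(find_outliers(response_times))  # [250]
```

A hand-written test case would need a fixed limit such as "fail above 200 ms"; the generic rule adapts when the normal level of the data shifts.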
18
BI TOOLS NOW
Self-service (fewer jobs?)
Advanced analytics (requires
understanding stats and machine
learning fundamentals)
19
SOURCE DATA
Non-transactional systems, weak or
no data model
Calculations with probability
Raw, unstructured data from
diverse data sources
Extracting small relevant pieces of
data from huge data sets
20
PEOPLE
Data engineers
Data scientists
Significant workforce, not just
1-5% as in BI
21
GOOD NEWS
BI people are still a good match,
as they love crunching data
But significant shift in skills is
required
22
WHY GET INVOLVED
o Cutting edge
o Challenges
o Cool stuff (predictions, AI,
etc.)
o Growth, margin and revenue
23
HOW TO GET INVOLVED
o Mindset
o Skills
o Experience
o Solutions
24
PLATFORMS
25
TRADITIONAL EDW PLATFORMS
o Too expensive ($10,000 per TB and more)
o Large upfront cost
o Not easy procurement, setup and
maintenance
o Designed for relational data, SQL interface
only, limited schema flexibility
o Data must be loaded first (modeled,
prepared and moved)
o Marketing limitations for Appliances
26
TRADITIONAL OPEN SOURCE PLATFORMS
• Designed for relational data, SQL interface
only, limited schema flexibility
• Data must be loaded first (modeled,
prepared and moved)
• Not easily scalable (scale up and down)
27
TRADITIONAL DATA MINING TOOLS
• Expensive
• Smaller community (one more isolated
world)
• Targeted for enterprise users
• Longer release cycles, no way to mix tools
and try fresh new stuff
• Scalability and integration issues
28
WHY BIG DATA AND CLOUD
o Extremely economically attractive
o Scalable and elastic
o Self service
o Rich and diverse data tools
o Good enough quality (and
constantly improving)
29
BIG DATA AND CLOUD DESIGN PRINCIPLES
Decoupling Data Storage and Computing
o Database engine does not own data anymore
o Simplified load/extract
o Schema on read
o Not just SQL interface
o Any computing engines on top of data
Commodity Hardware
o Fault tolerant
Scale up and down
30
GROWTH PATH
From monolithic suites to diverse and rich tool set
SQL tools on Hadoop, Cloud
Advanced Data Analysis and Analytics
o Spark, MapReduce, NoSQL
o Python, R, Java, Scala
o Statistics
o Batch, Streaming, Real-time
Machine Learning and Deep Learning
o Understand use cases
o Understand specific algorithms and their
application
o Implementation
31
GAME (HOME WORK)
32
LET’S WIN THIS CAR
Suppose you're on a game show, and
you're given the choice of three
doors:
Behind one door is a car; behind the
others, goats.
You pick a door, say No. 3
33
SWITCH OR NOT?
Then the host, who knows what's
behind the doors, opens another
door, say No. 2, which has a goat.
He then says to you, "Do you want
to pick door No. 1?"
Is it to your advantage to switch
your choice?
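
One way to check your answer to the homework is to simulate the game many times. The Python sketch below assumes the usual rules of the puzzle: the host always opens a goat door that is not your pick, choosing at random when two such doors are available. The function names, trial count and seed are arbitrary choices for the example.

```python
import random


def play(switch, rng):
    """Play one round; return True if the player ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is still closed and not yet picked.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won when always switching or always staying."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

Run it and compare the two rates with the answer you worked out on paper.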
