Big Data & Business Analytics:
Understanding the Marketspace
Prof. Bala Iyer
@BalaIyer
1
Agenda
 Big Data Basics
 Understand why Big Data is
increasingly important to the business
 Ecosystem Analysis
 Key Recommendations
 Next Steps

2
“We now uncover as much data in
48 hours – 1.8 zettabytes (that's
1,800,000,000,000,000,000,000
bytes) – as humans gathered from
‘the dawn of civilization to the year
2003.’”
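As a quick sanity check on the scale quoted above, the unit conversion can be worked through in a few lines. This is only an illustrative sketch: it assumes decimal (SI) units, where one zettabyte is 10^21 bytes, and the helper name `to_bytes` is made up for this example.

```python
from fractions import Fraction

# SI (decimal) units are an assumption; the slide does not say
# whether decimal or binary prefixes are meant.
ZETTABYTE = 10**21  # bytes
EXABYTE = 10**18    # bytes


def to_bytes(amount: str, unit_bytes: int) -> int:
    """Convert a decimal string such as '1.8' into an exact byte count."""
    value = Fraction(amount) * unit_bytes
    if value.denominator != 1:
        raise ValueError("amount does not resolve to a whole number of bytes")
    return int(value)


quoted = to_bytes("1.8", ZETTABYTE)
print(f"{quoted:,}")  # 1,800,000,000,000,000,000,000

# The same volume expressed as an hourly rate over the quoted 48 hours.
exabytes_per_hour = float(Fraction(quoted, 48 * EXABYTE))
print(exabytes_per_hour)  # 37.5
```

Using `Fraction` rather than floating point keeps the byte count exact, which matters when the figure is written out digit by digit as it is on the slide.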

3
Trends
 Hyper-competition
 Customer expectations
 Big data around business
 Science Envy?
 Analytical organization

4
[Diagram: Environment, Your Data, Models, Decisions]


5
Enter the Data Scientist
 A data scientist is an engineer who employs the
scientific method and applies data-discovery tools to
find new insights in data. The scientific method—the
formulation of a hypothesis, the testing, the careful
design of experiments, the verification by others—is
something they take from their knowledge of
statistics and their training in scientific disciplines.

Source: “Data Scientists: The Definition of Sexy”, Forbes, 2013.
6
Opportunities
 Shortage of data scientists
 Huge technical challenges
 India advantage
 Use cases emerging

According to Wikibon, the market is expected
to reach USD 53.4 billion in 2016
7
Averages
Tier 1: $323MM
Tier 2: $26MM

8
Categories
 People
 Machine
 Social
 Transactional

9
Offerings
 Consulting
 Implementation/change
 Technology Platforms
 Methodologies and Frameworks
 Assessment
 CoE, Incubators, Labs, Co-innovation
networks

10
Problems
 Structured
 Semi-structured
 Un-structured

11
Skills
 Domain knowledge
 Math
 Statistics
 Heuristics
 Machine learning
 Programming
 Scientific method
12
Technology Impact on Client Business

Impact Framework by Prof. Venkatraman

13
What do we mean by
“Analytical”?
 Analytical Decision-making: the use of
data, analysis, models & systematic
reasoning to make decisions
 Questions to answer:
   To which decisions or business areas should analytics be applied?
   What kind of data do we have now & what do we need?
   What kinds of analysis do we do?

Source: “What it Means to Put Analytics to Work”, Davenport, Harris & Morrison, chapter from
Analytics at Work: Smarter Decisions, Better Results, 2010.

14
Big Data, Predictive Analytics
 Software and/or hardware solutions
that allow firms to improve business
performance or mitigate risk by
analyzing big data sources.
 Predictive analytics uses algorithms to
find patterns in data that might predict
similar outcomes
15
Applied analytics
 Monitoring to help us track real-time behaviors;
 Insight to help us understand what is going wrong;
 Prediction to show us which minor issue today is a
precursor to catastrophic failure tomorrow;
 Optimization to achieve operational targets for yield,
throughput, or energy consumption; and
 Machine learning to help us discover new trends or
patterns of failure or optimization
Analytics in each category requires different volumes
and velocities of data to be effective.
16
Competencies or Stack
Change Management
Insights (Experimentation/Visualization)
Domain Knowledge (best practices)
Model Building (tools and techniques)
Infrastructure (Data, Models/architecture)
Tools

17
How are decisions made today?
 Intuition & Experience
 The HiPPOs (Highest-paid person’s opinion)¹
 Outcomes?
   Decisions that are not optimized or are ill-informed
   We introduce biases, prejudice, unaided intuition, &
    self-justification to the process²

Shift towards analytics: 44% of executives feel analytics
are strongly supporting or driving strategy³

Sources: ¹ “Big Data: The Management Revolution”, McAfee & Brynjolfsson, HBR, Oct. 2012.
² “What it Means to Put Analytics to Work”, Davenport, Harris & Morrison, chapter from Analytics at Work:
Smarter Decisions, Better Results, 2010.
³ “Preparing for Analytics 3.0”, Davenport, CIO Journal in The Wall Street Journal, 2/20/2013.

18
Traditional Analytics (Analytics 1.0)

Structured Data

• Focus on purchase transactions (retail & online)
• Product insights (rich): best-selling products, most profitable
• Customer insights (limited): loyalty programs & online
behaviors lead to segmentation, personalization
• Inputs are structured data & outputs are reports
Image Source: http://ecommercecenter.net/management/what-is-a-data-warehouse.html

19
We’re now doing Analytics 2.0!
Multichannel

Social

Data Integration (unstructured + structured)
Location

Sensor

Image Source: http://dkidiscussion.blogspot.com/2012/03/were-facebook-friends.html; http://www.apple.com/ipod/nike/;
http://www.accontrols.com/ge-sensing.html; http://howto.cnet.com/8301-11310_39-20070819-285/7-fun-ways-to-interact-with-your-foursquare-account/;
http://bigdata.pervasive.com/Products/Hadoop-Data-Integration.aspx

20
Why is Big Data new?
 Volume: 2.5 exabytes of data created each day
 Velocity: real-time or near-real-time data provides agility
over competitors
 Variety: unstructured data (in addition to
structured/transactional) & external (as well as internal)
 Social
 Location-based/mobile
 Personal interests
 Sensor
 Video/audio
 Value

Source: “Big Data: The Management Revolution”, McAfee & Brynjolfsson, HBR, Oct. 2012.

21
What are the sources of data?
 ERP/CRM Transactional Systems
 Point-of-Sale/Scanner at Retail
 Customer Loyalty Programs
 Financial Transactions
 Click-Stream Data
 Social Data
 Mobile
 External Data Aggregators (e.g., AC Nielsen)

22
Key Questions Addressed by Data
Analytics

Time frame | Information | Insight
Past | What happened? (Reporting) | How and why did it happen? (Modeling, experimental design)
Present | What is happening now? (Alerts) | What’s the next best action? (Recommendation)
Future | What will happen? (Extrapolation) | What’s the best/worst that can happen? (Prediction, optimization, simulation)

Source: “What it Means to Put Analytics to Work”, Davenport, Harris & Morrison, chapter from
Analytics at Work: Smarter Decisions, Better Results, 2010.

23
Data analytics can focus on
different areas of the business

Business Area | Benefits
Customer | More focused segmentation; understand customer behavior; etc.
Operations | Optimize inventory levels & delivery routes; build facilities in best locations; etc.
Talent Management | Hire & retain employees based on right skillset; identify valuable skillsets by job role; etc.

Source: “What it Means to Put Analytics to Work”, Davenport, Harris & Morrison, chapter from
Analytics at Work: Smarter Decisions, Better Results, 2010.

24
Target used data mining to predict buying habits
of customers going through major life events
 Target was able to identify 25 products (e.g., vitamin
supplements) that, when analyzed together, helped
determine a “pregnancy prediction” score
 Sent baby-related promotions to women based on this
score
 Outcome:
   Sales of Target’s Mom and Baby products sharply
    increased soon after new advertising campaigns
   Privacy concerns: Target had to adjust how it
    communicated the new promotions

Source: “How Companies Learn Your Secrets”, Duhigg, The New York Times, Feb. 16, 2012.

25
Whirlpool monitors social media to
discover what its customers are
saying…
 Used Attensity360 to discover where people were
discussing Whirlpool online & what they were saying
 Used customer feedback for better product development,
planning & customer service
 Metrics:
 Size of the overall online appliance conversation
 Sentiment analysis by brand (Whirlpool & competitors)
 Time elapsed between complaint/comment &
contacting the customer
 # of customers contacted
 # of satisfied completed interactions
Source: IDC Customer Spotlight, “Whirlpool Corporation’s Digital Detectives:
Attensity Provides the Lens”, IDC 2011 report.
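The metrics listed on this slide could be computed from a simple log of online mentions, roughly as sketched below. Everything here is hypothetical: the `Mention` record, its fields, the -1/0/+1 sentiment scale, the brand name "BrandX" and the sample rows are invented for illustration and are not taken from Attensity360 or the IDC report.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Optional


@dataclass
class Mention:
    """One online comment about a brand (hypothetical record layout)."""
    brand: str
    sentiment: int                         # -1 negative, 0 neutral, +1 positive (assumed scale)
    posted: datetime
    contacted: Optional[datetime] = None   # None if nobody followed up
    satisfied: bool = False                # only meaningful when contacted


def summarize(mentions):
    """Compute the slide's five metrics over a list of mentions."""
    by_brand = {}
    for m in mentions:
        by_brand.setdefault(m.brand, []).append(m.sentiment)
    contacted = [m for m in mentions if m.contacted is not None]
    hours = [(m.contacted - m.posted).total_seconds() / 3600 for m in contacted]
    return {
        "conversation_size": len(mentions),
        "avg_sentiment_by_brand": {b: mean(s) for b, s in by_brand.items()},
        "avg_hours_to_contact": mean(hours) if hours else None,
        "customers_contacted": len(contacted),
        "satisfied_interactions": sum(m.satisfied for m in contacted),
    }


# Invented sample data, for illustration only.
sample = [
    Mention("Whirlpool", -1, datetime(2011, 1, 3, 9), datetime(2011, 1, 3, 13), True),
    Mention("Whirlpool", 1, datetime(2011, 1, 4, 10)),
    Mention("BrandX", 0, datetime(2011, 1, 4, 11)),
    Mention("Whirlpool", -1, datetime(2011, 1, 5, 8), datetime(2011, 1, 5, 10), False),
]
report = summarize(sample)
print(report)
```

In practice the sentiment value would come from a text-analysis tool rather than being entered by hand; the sketch only shows how the reporting layer might sit on top of it.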

26
Many industries are using data analytics
to improve value disciplines
 General Electric is using Big Data to optimize service
contracts & maintenance (the “industrial internet”)¹
 Netflix used Big Data to predict whether a TV show would be
successful: the “House of Cards” series, its director & promotions²
 LinkedIn used Big Data to develop its “People You May Know”
product, with 30% higher click-through rates³

Sources: ¹ “What’s Your Strategic Intent for Big Data?”, Davenport, CIO Journal in The Wall Street Journal,
1/23/2013.
² “The Future of Entertainment is Analytical”, Davenport, CIO Journal in The Wall Street Journal, 3/6/2013.
³ “Data Scientist: The Sexiest Job of the 21st Century”, Davenport & Patil, HBR, Oct 2012.

27
How do companies build an
analytics capability?
 Meet the Data Scientist (need analytical +
social + communication skills)
 Bring structure to large unstructured data &
make analysis possible
 Help decision-makers shift from ad-hoc analysis
to ongoing conversations with data
 Suggest business direction implications
 Include process and technology
Source: “Data Scientist: The Sexiest Job of the 21st Century”, Davenport & Patil, HBR, Oct 2012.

28
Create a culture of data-driven
decisions
 Leadership needs to set direction on how to make
decisions
 Shift mindset from “What do we think?” to “What
do we know?”
 Understand underlying models & their assumptions
 Domain subject-matter experts still critical: they
know what questions to ask

Source: “Big Data: The Management Revolution”, McAfee & Brynjolfsson, HBR, Oct. 2012.

29
There are many analytics
tools/techniques

Technique | Description
Data Mining | Extract new patterns in large data sets
Predictive Modeling | A model is created to best predict the probability of an outcome
A/B Testing | A control is compared to test groups; often used in website design to test for higher conversion rates
Textual Analysis | Topics can be extracted along with their linkages
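To make the A/B testing row concrete, here is a minimal sketch of how a control page and a test page might be compared on conversion rate. It assumes a pooled two-proportion z-test, which is one common choice rather than anything the slides prescribe; the function name and the visitor counts are invented for illustration.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for H0: variants A and B convert equally.

    conv_* are conversion counts, n_* are visitor counts.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: control converts 200 of 5000, the test page 250 of 5000.
z, p_value = two_proportion_z_test(200, 5000, 250, 5000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the difference in conversion rates is unlikely to be chance alone. Whether the lift is worth acting on is still a business judgment.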

30
So, should we all use data-driven
decisions?
 Performance Implications:
   Companies at the top of their industries in data-driven decision making:
     Were more productive & profitable
     Had a higher stock market valuation¹
   External business intelligence (e.g., analyzing social media
    data) boosts innovation & profits²

Sources: ¹ “Big Data: The Management Revolution”, McAfee & Brynjolfsson, HBR, Oct. 2012.
² “Survey: External Business Intelligence Boosts Innovation and Profits”, Schectman, CIO Journal in The Wall Street Journal,
2/26/2013.

31
Why not?
 No time – decision needs to be made now
 Invalid/outdated assumptions
 No precedent (lacking past data)
 Historical data misleading
 Use analytics to rationalize pre-determined decisions
 Don’t have the capabilities!
   Major shortage of data scientists or skills associated with
    data analytics
   Culture not suited (e.g., privacy, regulatory concerns)

Source: “What it Means to Put Analytics to Work”, Davenport, Harris & Morrison, chapter from
Analytics at Work: Smarter Decisions, Better Results, 2010.

32
Ecosystem Analysis

33
Analytics Ecosystem (840 nodes)
[Network diagram; legend: Component, Platform]

34
Platform with high brokerage
High brokerage nodes:
Cloudera, Pentaho, IBM, Fractal, MuSigma, Rapidminer, SAS, Cognizant,
MTECH, Accenture, Tableau, SPSS, Infosys, AbsolutData, Capgemini, Genpact,
KXEN, Oracle, Wipro, Opera, TCS, HCL, LatentView, Guavus
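The slides do not say how brokerage was measured in the ecosystem network. One common proxy is betweenness centrality, which counts how often a node lies on the shortest paths between other nodes. The sketch below uses Brandes' algorithm on a tiny invented graph; the node names are placeholders, not the firms listed above.

```python
from collections import deque


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm) for an
    undirected, unweighted graph given as {node: iterable of neighbors}."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, walking back from the farthest nodes.
        delta = {v: 0.0 for v in graph}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}


# Invented toy network: two tightly knit groups joined only through "platform_p".
toy = {
    "vendor_a": ["vendor_b", "platform_p"],
    "vendor_b": ["vendor_a", "platform_p"],
    "platform_p": ["vendor_a", "vendor_b", "client_x", "client_y"],
    "client_x": ["platform_p", "client_y"],
    "client_y": ["platform_p", "client_x"],
}
scores = betweenness(toy)
print(max(scores, key=scores.get))  # platform_p
```

In this toy graph every shortest path between a vendor and a client runs through `platform_p`, so it is the only node with a non-zero score. That is the sense in which such a node "brokers" between otherwise disconnected groups.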

35
Potential Models
 Augmentation or spot services
 SaaS Decision Environment
 Consultant
 Strategic partner
 Digital thought leadership
   Training for data scientists
   Smart Lab
   CoE
 Infrastructure and libraries
36
Key competencies
 Technical
   Modeling
 Strategic
 Talent management
 Change management

37
Client Recommendations
 Prepare the organization to become analytics-driven
 Understand the problem
 Sourcing intent:
   Complement existing initiatives
   Set up an R&D or experimentation center
   Create your own centers of excellence on analytics
 Build absorptive capacity

38
Vendor recommendations
 Lab environments
 Decision styles and other biases
 Location of Analytics organization
 Data scientists
 Model management capabilities
 Connectors or boundary spanners
 Order takers vs. scientists

39
Overall
 The business analytics market is still in
its infancy. As in many other industries,
much is yet to be learned. As others
have opined, profitability and risk
mitigation await the fast learners.

40
Organization chart
 R&D
 Sales
 In-bound marketing
 Client-partners
 Connectors (SWAT teams)
 Data Scientists
   Programmers
   Analysts
41
Investments
 Training/Recruitment
   Sales
   Data Scientist
 Certification based on competency and project experience
   Techniques
   Domain knowledge
 Product/platforms
 Visualization lab
42
Risks
 Privacy and ethics of data
 “Big brother”
 New skills for production and selling
 Managing a pool of modelers
 Communication between modelers,
programmers and scientists
 Model management
 Installed base of engineers

43
Questions?

44
