SlideShare a Scribd company logo
1 of 24
CHAPTER 1
INTRODUCTION: DATA-
ANALYTIC THINKING
1
Presented By
 The past fifteen years have seen extensive investments in business
infrastructure, which have improved the ability to collect data
throughout the enterprise.
 Virtually every aspect of business is now open to data collection
and often even instrumented for data collection: operations,
manufacturing, supply-chain management, customer behavior,
marketing campaign performance, workflow procedures, and so on.
 At the same time, information is now widely available on external
events such as market trends, industry news, and competitor’s
movements.
 This broad availability of data has led to increasing interest in
methods for extracting useful information and knowledge from
data-the realm of data science. 2
www.kellytechno.com
THE UBIQUITY OF DATA OPPORTUNITIES
 With vast amounts of data now available, companies in almost every
industry are focused on exploiting data for competitive advantage.
 In the past, firms could employ teams of statisticians, modelers, and
analysts to explore datasets manually, but the volume and variety of
data have far outstripped the capacity of manual analysis.
 At the same time, computers have become far more powerful, networking
has become ubiquitous, and algorithms have been developed that can
connect datasets to enable broader and deeper analyses than previously
possible.
 The convergence of these phenomena has given rise to the increasing
widespread business application of data science principles and data
mining techniques.
3
www.kellytechno.com
THE UBIQUITY OF DATA OPPORTUNITIES
 Data mining is used for general customer relationship management to
analyze customer behavior in order to manage attrition and maximize
expected customer value.
 The finance industry uses data mining for credit scoring and trading,
and in operations via fraud detection and workforce management.
 Major retailers from Walmart to Amazon apply data mining throughout
their businesses, from marketing to supply-chain management.
 Many firms have differentiated themselves strategically with data
science, sometimes to the point of evolving into data mining companies.
 The primary goals of this book are to help you view business problems
from a data perspective and understand principles of extracting useful
knowledge from data.
4
www.kellytechno.com
THE UBIQUITY OF DATA OPPORTUNITIES
 The primary goals of this book are to help you view
business problems from a data perspective and understand
principles of extracting useful knowledge from data.
 There is a fundamental structure to data-analytic thinking,
and basic principles that should be understood.
 There are also particular areas where intuition, creativity,
common sense, and domain knowledge must be brought to
bear.
5
www.kellytechno.com
THE UBIQUITY OF DATA OPPORTUNITIES
 Throughout the first two chapters of this books, we will discuss in detail
various topics and techniques related to data science and data mining.
 The terms “data science” and “data mining” often are used
interchangeably, and the former has taken a life of its own as various
individuals and organizations try to capitalize on the current hype
surrounding it.
 At a high level, data science is a set of fundamental principles that guide
the extraction of knowledge from data. Data mining is the extraction of
knowledge from data, via technologies that incorporate these principles.
 As a term, “data science” often is applied more broadly than the
traditional use of “data mining”, but data mining techniques provide
some of the clearest illustrations of the principles of data science.
6
www.kellytechno.com
EXAMPLE: HURRICANE FRANCES
 Consider an example from a New York Time story from 2004:
 Hurricane Frances was on its way, barreling across the Caribbean,
threatening a direct hit on Florida’s Atlantic coast. Residents made for
higher ground, but far away, in Bentonville, Ark., executives at Wal-Mart
Stores decided that the situation offered a great opportunity for one of
their newest data-driven weapons … predictive technology.
 A week ahead of the storm’s landfall, Linda M. Dillman, Wal-Mart’s chief
information officer, pressed her staff to come up with forecasts based on
what had happened when Hurricane Charley struck several weeks
earlier. Backed by the trillions of bytes’ worth of shopper history that is
stored in Wal-Mart’s data warehouse, she felt that the company could
‘start predicting what’s going to happen, instead of waiting for it to
happen,’ as she put it. (Hays, 2004) 7
www.kellytechno.com
EXAMPLE: HURRICANE FRANCES
 Consider why data-driven prediction might be useful in this
scenario.
 It might be useful to predict that people in the path of the
hurricane would buy more bottled water. Maybe, but this point
seems a bit obvious, and why would we need data science to
discover it?
 It might be useful to project the amount of increase in sale due to
the hurricane, to ensure that local Wal-Mart are properly stocked.
 Perhaps mining the data could reveal that a particular DVD sold
out in the hurricane’s path – but maybe it sold out that week at
Wal-Marts across the country, not just where the hurricane
landing was imminent. 8
www.kellytechno.com
EXAMPLE: HURRICANE FRANCES
 The prediction could be somewhat useful, but is probably more
general than Ms. Dillman was intending.
 It would be more valuable to discover patterns due to the
hurricane that were not obvious.
 To do this, analysts might examine the huge volume of Wal-Mart
data from prior, similar situations (such as Hurricane Charley) to
identify unusual local demand for products.
9
www.kellytechno.com
EXAMPLE: HURRICANE FRANCES
 From such patterns, the company might be able to anticipate
unusual demand for products and rush stock to the stores ahead of
the hurricane’s landfall. Indeed, that is what happened.
 The New York Times (Hays, 2004) reported that:”…the experts
mined the data and found that the stores would indeed need
certain products-and not just the usual flashlights. “We didn’t
know in the past that strawberry PopTarts increase in sales, like
seven times their normal sales rate, ahead of a hurricane’, Ms.
Dillman said in a recent interview. “And the pre-hurricane top-
selling item was beer.”
10
www.kellytechno.com
EXAMPLE: PREDICTING CUSTOMER CHURN
 How are such data analyses performed? Consider a second, more
typical business scenario and how it might be treated from a data
perspective.
 Assume you just landed a great analytical job with MegaTelCo, one
of the largest telecommunication firms in the United States.
 They are having major problem with customer retention in their
wireless business. In the mid-Atlantic region, 20% of cell phone
customers leave when their contracts expire, and it is getting
increasingly difficult to acquire new customers.
 Since the cell phone market is now saturated, the huge growth in
the wireless market has tapered off.
11
www.kellytechno.com
EXAMPLE: PREDICTING CUSTOMER CHURN
 Communications companies are now engaged in battles to attract
each other’s customers while retaining their own.
 Customers switching from one company to another is called churn,
and it is expensive all around: one company must spend on
incentives to attract a customer while another company loses
revenue when the customer departs.
 You have been called in to help understand the problem and to
devise a solution.
 Attracting new customers is much more expensive than retaining
existing ones, so a good deal of marketing budget is allocated to
prevent churn.
12
www.kellytechno.com
EXAMPLE: PREDICTING CUSTOMER CHURN
 Marketing has already designed a special retention offer. Your
task is to devise a precise, step-by-step plan for how the data
science team should use MegaTelCo’s vast data resources to decide
which customers should be offered the special retention deal prior
to the expiration of their contract.
 Think carefully about what data you might use and how they
would be used. Specifically, how should MegaTelCo choose a set of
customers to receive their offer in order to best reduce churn for a
particular incentive budget? Answering this question is much more
complicated than it may seem initially.
13
www.kellytechno.com
DATA SCIENCE, ENGINEERING, AND DATA-
DRIVEN DECISION MAKING
 Data science involves principles, processes, and techniques for
understanding phenomena via the (automated) analysis of data.
 In this book, we will view the ultimate goal of data science as
improving decision making, as this generally is of direct interest to
business.
14
www.kellytechno.com
DATA SCIENCE, ENGINEERING, AND DATA-
DRIVEN DECISION MAKING
 Figure 1-1 places data science
in the context of various other
closely related and data
related processes in the
organization.
 It distinguishes data science
from other aspects of data
processing that are gaining
increasing attention in
business. Let’s start at the top.
15
www.kellytechno.com
DATA SCIENCE, ENGINEERING, AND DATA-
DRIVEN DECISION MAKING
 Data-driven decision-making (DDD) refers to the practice of basing
decisions on the analysis of data, rather than purely on intuition.
 For example, a marketer could select advertisements based purely
on her long experience in the field and her eye for what will work.
Or, she could base her selection on the analysis of data regarding
how consumers react to different ads.
 She could also use a combination of these approaches. DDD is not
an all-or-nothing practice, and different firms engage in DDD to
greater or lesser degrees.
16
www.kellytechno.com
DATA SCIENCE, ENGINEERING, AND DATA-
DRIVEN DECISION MAKING
 Economist Erik Brynjolfsson and his colleagues from MIT and
Penn’s Wharton School conducted a study of how DDD affects firm
performance (Brynjolfsson, Hitt, &Kim,2011).
 They developed a measure of DDD that rates firms as to how
strongly they use data to make decisions across the company.
 They show that statistically, the more data driven a firm is, the
more productive it is-even controlling for a wide range of possible
confounding factors.
 And the differences are not small. One standard deviation higher
on the DDD scale is associated with a 4%-6% increase in
productivity. DDD also is correlated with higher return on assets,
return on equity, asset utilization, and market value, and the
relationship seems to be causal. 17
www.kellytechno.com
DATA SCIENCE, ENGINEERING, AND DATA-
DRIVEN DECISION MAKING
 The sort of decisions we will be interested in this book mainly fall
into two type:
 (1) decisions for which “discoveries” need to be made within data, and
 (2) decisions that repeat, especially at massive scale, and so decision-
making can benefit from even small increases in decision-making
accuracy based on data analysis.
 The Walmart example above illustrates a type 1 problem: Linda
Dillman would like to discover knowledge that will help Walmart
prepare for Hurricane Frances’s imminent arrival.
 In 2012, Walmart’s competitor Target was in the news for a data-
driven decision-making case of its own, also a type 1 problem
(Duhigg, 2012). Like most retailers, Target cares about consumers’
shopping habits, what drives them, and what can influence them. 18
www.kellytechno.com
DATA SCIENCE, ENGINEERING, AND DATA-
DRIVEN DECISION MAKING
 Consumers tend to have inertia in their habits and getting them to
change is very difficult. Decision makers at Target knew, however,
that the arrival of a new baby in a family is one point where people
do change their shopping habits significantly.
 In the Target analyst’s word, “As soon as we get them buying
diapers from us, they’re going to start buying everything else too”.
Most retailers know this and so they compete with each other
trying to sell baby-related products to new parents. Since most
birth records are public, retailers obtain information on births and
send out special offers to the new parents.
19
www.kellytechno.com
DATA SCIENCE, ENGINEERING, AND DATA-
DRIVEN DECISION MAKING
 However, Target wanted to get a jump on their competition. They
were interested in whether they could predict that people are
expecting a baby. If they could, they would gain an advantage by
making offers before their competitors. Using techniques of data
science, Target analyzed historical data on customers who later
were revealed to have been pregnant.
 For example, pregnant mothers often change their diets, their
wardrobes, their vitamin regimens, and so on. These indicators
could be extracted from historical data, assembled into predictive
models, and then deployed in marketing campaigns.
20
www.kellytechno.com
DATA SCIENCE, ENGINEERING, AND DATA-
DRIVEN DECISION MAKING
 We will discuss predictive models in much detail as we go through
the book.
 For the time being, it is sufficient to understand that a predictive
model abstracts away most of the complexity of the world, focusing
in on particular set of indicators that correlate in some way with a
quantity of interest.
 Importantly, in both the Walmart and the Target example, the
data analysis was not testing a simple hypothesis. Instead, the
data were explored with the hope that something useful would be
discovered.
21
www.kellytechno.com
DATA SCIENCE, ENGINEERING, AND DATA-
DRIVEN DECISION MAKING
 Our churn example illustrates type 2 DDD problem. MegaTelCo
has hundreds of millions of customers, each a candidate for
defection. Ten of millions of customers have contracts expiring
each month, so each one of them has an increased likelihood of
defection in the near future. If we improve our ability to estimate,
for a given customer, how profitable it would be for us to focus on
her, we can potentially reap large benefits by applying this ability
to the millions of customers in the population.
 This same logic applies to many of the areas where we have seen
the most application of data science and data mining: direct
marketing, online advertising, credit scoring, financial trading,
help-desk management, fraud detection, search ranking, product
recommendation, and so on. 22
www.kellytechno.com
DATA SCIENCE, ENGINEERING, AND DATA-
DRIVEN DECISION MAKING
 The diagram in figure 1-1 shows data science supporting data-
driven decision-making, but also overlapping with data-driven
decision making. This highlights the often overlooked fact that,
increasingly, business decisions are being made automatically by
computer systems. Different industries have adopted automatic
decision-making at different rates. The finance and
telecommunications industries were early adopts, largely because
of their precocious development of data networks and
implementation of massive-scale computing, which allowed the
aggregation and modeling of data at a large scale, as well as the
application of the resultant models to decision-making.
23
www.kellytechno.com
24
www.kellytechno.com

More Related Content

What's hot

The Essential Data Ingredient
The Essential Data IngredientThe Essential Data Ingredient
The Essential Data Ingredient
Rich Cooper
 
From hype to action getting what's needed from big data a
From hype to action getting what's needed from big data aFrom hype to action getting what's needed from big data a
From hype to action getting what's needed from big data a
gwdeodhar
 
Atos_whitepaper_Analytics_HR_interactive
Atos_whitepaper_Analytics_HR_interactiveAtos_whitepaper_Analytics_HR_interactive
Atos_whitepaper_Analytics_HR_interactive
Nicolas Mallison
 
Making sense-of-the-chaos
Making sense-of-the-chaosMaking sense-of-the-chaos
Making sense-of-the-chaos
swaipnew
 
The implications of Big Data for BTS and COS
The implications of Big Data for BTS and COSThe implications of Big Data for BTS and COS
The implications of Big Data for BTS and COS
George Kershoff
 

What's hot (17)

Big Data: Where is the Real Opportunity?
Big Data: Where is the Real Opportunity?Big Data: Where is the Real Opportunity?
Big Data: Where is the Real Opportunity?
 
Real Estate Big Data- Benefits & Challenges
Real Estate Big Data- Benefits & ChallengesReal Estate Big Data- Benefits & Challenges
Real Estate Big Data- Benefits & Challenges
 
The Essential Data Ingredient
The Essential Data IngredientThe Essential Data Ingredient
The Essential Data Ingredient
 
Big data assignment
Big data assignmentBig data assignment
Big data assignment
 
Data Mining: The Top 3 Things You Need to Know to Achieve Business Improvemen...
Data Mining: The Top 3 Things You Need to Know to Achieve Business Improvemen...Data Mining: The Top 3 Things You Need to Know to Achieve Business Improvemen...
Data Mining: The Top 3 Things You Need to Know to Achieve Business Improvemen...
 
From hype to action getting what's needed from big data a
From hype to action getting what's needed from big data aFrom hype to action getting what's needed from big data a
From hype to action getting what's needed from big data a
 
Sales Summit 2 - Minds&More - Cloud & disruptive trends
Sales Summit 2 - Minds&More - Cloud & disruptive trendsSales Summit 2 - Minds&More - Cloud & disruptive trends
Sales Summit 2 - Minds&More - Cloud & disruptive trends
 
Wall streetjournal
Wall streetjournalWall streetjournal
Wall streetjournal
 
Atos_whitepaper_Analytics_HR_interactive
Atos_whitepaper_Analytics_HR_interactiveAtos_whitepaper_Analytics_HR_interactive
Atos_whitepaper_Analytics_HR_interactive
 
Governing Big Data : Principles and practices
Governing Big Data : Principles and practicesGoverning Big Data : Principles and practices
Governing Big Data : Principles and practices
 
Is Your Company Braced Up for handling Big Data
Is Your Company Braced Up for handling Big DataIs Your Company Braced Up for handling Big Data
Is Your Company Braced Up for handling Big Data
 
The Emergence of Alt-Data and its Applications
The Emergence of Alt-Data and its ApplicationsThe Emergence of Alt-Data and its Applications
The Emergence of Alt-Data and its Applications
 
Business analytics a certain something
Business analytics   a certain somethingBusiness analytics   a certain something
Business analytics a certain something
 
Do You Know What The Power Of BULK SMS Is?
Do You Know What The Power Of BULK SMS Is?Do You Know What The Power Of BULK SMS Is?
Do You Know What The Power Of BULK SMS Is?
 
Making sense-of-the-chaos
Making sense-of-the-chaosMaking sense-of-the-chaos
Making sense-of-the-chaos
 
The implications of Big Data for BTS and COS
The implications of Big Data for BTS and COSThe implications of Big Data for BTS and COS
The implications of Big Data for BTS and COS
 
Big Data Analytics and its Application in E-Commerce
Big Data Analytics and its Application in E-CommerceBig Data Analytics and its Application in E-Commerce
Big Data Analytics and its Application in E-Commerce
 

Viewers also liked

Statistical analysis of process data 7 stages oil flow chart power point temp...
Statistical analysis of process data 7 stages oil flow chart power point temp...Statistical analysis of process data 7 stages oil flow chart power point temp...
Statistical analysis of process data 7 stages oil flow chart power point temp...
SlideTeam.net
 
Data Wrangling and Oracle Connectors for Hadoop
Data Wrangling and Oracle Connectors for HadoopData Wrangling and Oracle Connectors for Hadoop
Data Wrangling and Oracle Connectors for Hadoop
Gwen (Chen) Shapira
 
Build a predictive analytics model on a terabyte of data within hours
Build a predictive analytics model on a terabyte of data within hoursBuild a predictive analytics model on a terabyte of data within hours
Build a predictive analytics model on a terabyte of data within hours
DataWorks Summit
 
The data model is dead, long live the data model
The data model is dead, long live the data modelThe data model is dead, long live the data model
The data model is dead, long live the data model
Patrick McFadin
 
Python for Financial Data Analysis with pandas
Python for Financial Data Analysis with pandasPython for Financial Data Analysis with pandas
Python for Financial Data Analysis with pandas
Wes McKinney
 

Viewers also liked (20)

Tune up your data science process
Tune up your data science processTune up your data science process
Tune up your data science process
 
Analysis, data & process modeling
Analysis, data & process modelingAnalysis, data & process modeling
Analysis, data & process modeling
 
Cross border - off-shoring and outsourcing privacy sensitive data
Cross border - off-shoring and outsourcing privacy sensitive dataCross border - off-shoring and outsourcing privacy sensitive data
Cross border - off-shoring and outsourcing privacy sensitive data
 
Statistical analysis of process data 7 stages oil flow chart power point temp...
Statistical analysis of process data 7 stages oil flow chart power point temp...Statistical analysis of process data 7 stages oil flow chart power point temp...
Statistical analysis of process data 7 stages oil flow chart power point temp...
 
Data Science and Goodhart's Law
Data Science and Goodhart's LawData Science and Goodhart's Law
Data Science and Goodhart's Law
 
Data Wrangling and Oracle Connectors for Hadoop
Data Wrangling and Oracle Connectors for HadoopData Wrangling and Oracle Connectors for Hadoop
Data Wrangling and Oracle Connectors for Hadoop
 
A Tour of the Data Science Process, a Case Study Using Movie Industry Data
A Tour of the Data Science Process, a Case Study Using Movie Industry DataA Tour of the Data Science Process, a Case Study Using Movie Industry Data
A Tour of the Data Science Process, a Case Study Using Movie Industry Data
 
Data Analysis - Making Big Data Work
Data Analysis - Making Big Data WorkData Analysis - Making Big Data Work
Data Analysis - Making Big Data Work
 
How to read a data model
How to read a data modelHow to read a data model
How to read a data model
 
Build a predictive analytics model on a terabyte of data within hours
Build a predictive analytics model on a terabyte of data within hoursBuild a predictive analytics model on a terabyte of data within hours
Build a predictive analytics model on a terabyte of data within hours
 
pandas: a Foundational Python Library for Data Analysis and Statistics
pandas: a Foundational Python Library for Data Analysis and Statisticspandas: a Foundational Python Library for Data Analysis and Statistics
pandas: a Foundational Python Library for Data Analysis and Statistics
 
The data model is dead, long live the data model
The data model is dead, long live the data modelThe data model is dead, long live the data model
The data model is dead, long live the data model
 
Python Data Wrangling: Preparing for the Future
Python Data Wrangling: Preparing for the FuturePython Data Wrangling: Preparing for the Future
Python Data Wrangling: Preparing for the Future
 
pandas: Powerful data analysis tools for Python
pandas: Powerful data analysis tools for Pythonpandas: Powerful data analysis tools for Python
pandas: Powerful data analysis tools for Python
 
Building a Predictive Model
Building a Predictive ModelBuilding a Predictive Model
Building a Predictive Model
 
UX x Analytics: Love or hate
UX x Analytics: Love or hateUX x Analytics: Love or hate
UX x Analytics: Love or hate
 
Explore Your Data Using Amazon QuickSight and Build Your First Machine Learni...
Explore Your Data Using Amazon QuickSight and Build Your First Machine Learni...Explore Your Data Using Amazon QuickSight and Build Your First Machine Learni...
Explore Your Data Using Amazon QuickSight and Build Your First Machine Learni...
 
Python for Financial Data Analysis with pandas
Python for Financial Data Analysis with pandasPython for Financial Data Analysis with pandas
Python for Financial Data Analysis with pandas
 
Improving Python and Spark (PySpark) Performance and Interoperability
Improving Python and Spark (PySpark) Performance and InteroperabilityImproving Python and Spark (PySpark) Performance and Interoperability
Improving Python and Spark (PySpark) Performance and Interoperability
 
Analytics>Forward - Design Thinking for Data Science
Analytics>Forward - Design Thinking for Data ScienceAnalytics>Forward - Design Thinking for Data Science
Analytics>Forward - Design Thinking for Data Science
 

Similar to Data science training in hyderabad

Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...
Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...
Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...
AshrafDabbas2
 
Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...
Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...
Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...
AshrafDabbas2
 
360i Report: Big Data
360i Report: Big Data360i Report: Big Data
360i Report: Big Data
360i
 
141900791 big-data
141900791 big-data141900791 big-data
141900791 big-data
glittaz
 
BBDO Proximity: Big-data May 2013
BBDO Proximity: Big-data May 2013BBDO Proximity: Big-data May 2013
BBDO Proximity: Big-data May 2013
Brian Crotty
 
Big Data & Analytics Trends 2016 Vin Malhotra
Big Data & Analytics Trends 2016 Vin MalhotraBig Data & Analytics Trends 2016 Vin Malhotra
Big Data & Analytics Trends 2016 Vin Malhotra
Vin Malhotra
 
Data mining techniques and dss
Data mining techniques and dssData mining techniques and dss
Data mining techniques and dss
Niyitegekabilly
 
Big Data - CRM's Promise Land
Big Data - CRM's Promise LandBig Data - CRM's Promise Land
Big Data - CRM's Promise Land
Danny Camprubi Douglas
 

Similar to Data science training in hyderabad (20)

Data Science and its Relationship to Big Data and Data-Driven Decision Making
Data Science and its Relationship to Big Data and Data-Driven Decision MakingData Science and its Relationship to Big Data and Data-Driven Decision Making
Data Science and its Relationship to Big Data and Data-Driven Decision Making
 
Data Science and its relationship to Big Data and data-driven decision making
Data Science and its relationship to Big Data and data-driven decision makingData Science and its relationship to Big Data and data-driven decision making
Data Science and its relationship to Big Data and data-driven decision making
 
Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...
Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...
Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...
 
Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...
Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...
Paulraj Ponniah - Data Warehousing Fundamentals for IT Professionals-Wiley (2...
 
Big data is a popular term used to describe the exponential growth and availa...
Big data is a popular term used to describe the exponential growth and availa...Big data is a popular term used to describe the exponential growth and availa...
Big data is a popular term used to describe the exponential growth and availa...
 
How To Use Real-Time Data? Key Examples and Use Cases
How To Use Real-Time Data? Key Examples and Use CasesHow To Use Real-Time Data? Key Examples and Use Cases
How To Use Real-Time Data? Key Examples and Use Cases
 
360i Report: Big Data
360i Report: Big Data360i Report: Big Data
360i Report: Big Data
 
Customer intelligence platform - Maximum Capabilities of Your Data
Customer intelligence platform - Maximum Capabilities of Your DataCustomer intelligence platform - Maximum Capabilities of Your Data
Customer intelligence platform - Maximum Capabilities of Your Data
 
Big Data
Big DataBig Data
Big Data
 
141900791 big-data
141900791 big-data141900791 big-data
141900791 big-data
 
BBDO Proximity: Big-data May 2013
BBDO Proximity: Big-data May 2013BBDO Proximity: Big-data May 2013
BBDO Proximity: Big-data May 2013
 
Big Data & Analytics Trends 2016 Vin Malhotra
Big Data & Analytics Trends 2016 Vin MalhotraBig Data & Analytics Trends 2016 Vin Malhotra
Big Data & Analytics Trends 2016 Vin Malhotra
 
Data mining techniques and dss
Data mining techniques and dssData mining techniques and dss
Data mining techniques and dss
 
What is big data ? | Big Data Applications
What is big data ? | Big Data ApplicationsWhat is big data ? | Big Data Applications
What is big data ? | Big Data Applications
 
Big Data - CRM's Promise Land
Big Data - CRM's Promise LandBig Data - CRM's Promise Land
Big Data - CRM's Promise Land
 
How to Ruin your Business with Data Science & Machine Learning by Ingo Mierswa
  How to Ruin your Business with Data Science & Machine Learning by Ingo Mierswa  How to Ruin your Business with Data Science & Machine Learning by Ingo Mierswa
How to Ruin your Business with Data Science & Machine Learning by Ingo Mierswa
 
Cis 500 assignment 4
Cis 500 assignment 4Cis 500 assignment 4
Cis 500 assignment 4
 
The dawn of Big Data
The dawn of Big DataThe dawn of Big Data
The dawn of Big Data
 
What's the big deal about big data 1 30-2015
What's the big deal      about big data 1 30-2015What's the big deal      about big data 1 30-2015
What's the big deal about big data 1 30-2015
 
Analytics Trends 20145 - Deloitte - us-da-analytics-analytics-trends-2015
Analytics Trends 20145 -  Deloitte - us-da-analytics-analytics-trends-2015Analytics Trends 20145 -  Deloitte - us-da-analytics-analytics-trends-2015
Analytics Trends 20145 - Deloitte - us-da-analytics-analytics-trends-2015
 

More from Kelly Technologies

Salesforce crm-training-in-bangalore
Salesforce crm-training-in-bangaloreSalesforce crm-training-in-bangalore
Salesforce crm-training-in-bangalore
Kelly Technologies
 

More from Kelly Technologies (20)

Hadoop trainting in hyderabad@kelly technologies
Hadoop trainting in hyderabad@kelly technologiesHadoop trainting in hyderabad@kelly technologies
Hadoop trainting in hyderabad@kelly technologies
 
Hadoop trainting-in-hyderabad@kelly technologies
Hadoop trainting-in-hyderabad@kelly technologiesHadoop trainting-in-hyderabad@kelly technologies
Hadoop trainting-in-hyderabad@kelly technologies
 
Hadoop training-in-hyderabad
Hadoop training-in-hyderabadHadoop training-in-hyderabad
Hadoop training-in-hyderabad
 
Data science training institute in hyderabad
Data science training institute in hyderabadData science training institute in hyderabad
Data science training institute in hyderabad
 
Data science institutes in hyderabad
Data science institutes in hyderabadData science institutes in hyderabad
Data science institutes in hyderabad
 
Hadoop training institute in hyderabad
Hadoop training institute in hyderabadHadoop training institute in hyderabad
Hadoop training institute in hyderabad
 
Hadoop institutes in hyderabad
Hadoop institutes in hyderabadHadoop institutes in hyderabad
Hadoop institutes in hyderabad
 
Hadoop training in hyderabad-kellytechnologies
Hadoop training in hyderabad-kellytechnologiesHadoop training in hyderabad-kellytechnologies
Hadoop training in hyderabad-kellytechnologies
 
Sas training in hyderabad
Sas training in hyderabadSas training in hyderabad
Sas training in hyderabad
 
Websphere mb training in hyderabad
Websphere mb training in hyderabadWebsphere mb training in hyderabad
Websphere mb training in hyderabad
 
Hadoop institutes-in-bangalore
Hadoop institutes-in-bangaloreHadoop institutes-in-bangalore
Hadoop institutes-in-bangalore
 
Oracle training-institutes-in-hyderabad
Oracle training-institutes-in-hyderabadOracle training-institutes-in-hyderabad
Oracle training-institutes-in-hyderabad
 
Hadoop training institutes in bangalore
Hadoop training institutes in bangaloreHadoop training institutes in bangalore
Hadoop training institutes in bangalore
 
Hadoop training institute in bangalore
Hadoop training institute in bangaloreHadoop training institute in bangalore
Hadoop training institute in bangalore
 
Tableau training in bangalore
Tableau training in bangaloreTableau training in bangalore
Tableau training in bangalore
 
Salesforce crm-training-in-bangalore
Salesforce crm-training-in-bangaloreSalesforce crm-training-in-bangalore
Salesforce crm-training-in-bangalore
 
Oracle training in hyderabad
Oracle training in hyderabadOracle training in hyderabad
Oracle training in hyderabad
 
Qlikview training in hyderabad
Qlikview training in hyderabadQlikview training in hyderabad
Qlikview training in hyderabad
 
Spark training-in-bangalore
Spark training-in-bangaloreSpark training-in-bangalore
Spark training-in-bangalore
 
Project Management Planning training in hyderabad
Project Management Planning training in hyderabadProject Management Planning training in hyderabad
Project Management Planning training in hyderabad
 

Recently uploaded

An Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfAn Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdf
SanaAli374401
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
negromaestrong
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 

Recently uploaded (20)

Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
An Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfAn Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdf
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 

Data science training in hyderabad

  • 1. CHAPTER 1 INTRODUCTION: DATA- ANALYTIC THINKING 1 Presented By
  • 2.  The past fifteen years have seen extensive investments in business infrastructure, which have improved the ability to collect data throughout the enterprise.  Virtually every aspect of business is now open to data collection and often even instrumented for data collection: operations, manufacturing, supply-chain management, customer behavior, marketing campaign performance, workflow procedures, and so on.  At the same time, information is now widely available on external events such as market trends, industry news, and competitor’s movements.  This broad availability of data has led to increasing interest in methods for extracting useful information and knowledge from data-the realm of data science. 2 www.kellytechno.com
  • 3. THE UBIQUITY OF DATA OPPORTUNITIES  With vast amounts of data now available, companies in almost every industry are focused on exploiting data for competitive advantage.  In the past, firms could employ teams of statisticians, modelers, and analysts to explore datasets manually, but the volume and variety of data have far outstripped the capacity of manual analysis.  At the same time, computers have become far more powerful, networking has become ubiquitous, and algorithms have been developed that can connect datasets to enable broader and deeper analyses than previously possible.  The convergence of these phenomena has given rise to the increasing widespread business application of data science principles and data mining techniques. 3 www.kellytechno.com
  • 4. THE UBIQUITY OF DATA OPPORTUNITIES  Data mining is used for general customer relationship management to analyze customer behavior in order to manage attrition and maximize expected customer value.  The finance industry uses data mining for credit scoring and trading, and in operations via fraud detection and workforce management.  Major retailers from Walmart to Amazon apply data mining throughout their businesses, from marketing to supply-chain management.  Many firms have differentiated themselves strategically with data science, sometimes to the point of evolving into data mining companies.  The primary goals of this book are to help you view business problems from a data perspective and understand principles of extracting useful knowledge from data. 4 www.kellytechno.com
  • 5. THE UBIQUITY OF DATA OPPORTUNITIES  The primary goals of this book are to help you view business problems from a data perspective and understand principles of extracting useful knowledge from data.  There is a fundamental structure to data-analytic thinking, and basic principles that should be understood.  There are also particular areas where intuition, creativity, common sense, and domain knowledge must be brought to bear. 5 www.kellytechno.com
  • 6. THE UBIQUITY OF DATA OPPORTUNITIES  Throughout the first two chapters of this books, we will discuss in detail various topics and techniques related to data science and data mining.  The terms “data science” and “data mining” often are used interchangeably, and the former has taken a life of its own as various individuals and organizations try to capitalize on the current hype surrounding it.  At a high level, data science is a set of fundamental principles that guide the extraction of knowledge from data. Data mining is the extraction of knowledge from data, via technologies that incorporate these principles.  As a term, “data science” often is applied more broadly than the traditional use of “data mining”, but data mining techniques provide some of the clearest illustrations of the principles of data science. 6 www.kellytechno.com
  • 7. EXAMPLE: HURRICANE FRANCES  Consider an example from a New York Time story from 2004:  Hurricane Frances was on its way, barreling across the Caribbean, threatening a direct hit on Florida’s Atlantic coast. Residents made for higher ground, but far away, in Bentonville, Ark., executives at Wal-Mart Stores decided that the situation offered a great opportunity for one of their newest data-driven weapons … predictive technology.  A week ahead of the storm’s landfall, Linda M. Dillman, Wal-Mart’s chief information officer, pressed her staff to come up with forecasts based on what had happened when Hurricane Charley struck several weeks earlier. Backed by the trillions of bytes’ worth of shopper history that is stored in Wal-Mart’s data warehouse, she felt that the company could ‘start predicting what’s going to happen, instead of waiting for it to happen,’ as she put it. (Hays, 2004) 7 www.kellytechno.com
  • 8. EXAMPLE: HURRICANE FRANCES  Consider why data-driven prediction might be useful in this scenario.  It might be useful to predict that people in the path of the hurricane would buy more bottled water. Maybe, but this point seems a bit obvious, and why would we need data science to discover it?  It might be useful to project the amount of increase in sale due to the hurricane, to ensure that local Wal-Mart are properly stocked.  Perhaps mining the data could reveal that a particular DVD sold out in the hurricane’s path – but maybe it sold out that week at Wal-Marts across the country, not just where the hurricane landing was imminent. 8 www.kellytechno.com
  • 9. EXAMPLE: HURRICANE FRANCES  The prediction could be somewhat useful, but is probably more general than Ms. Dillman was intending.  It would be more valuable to discover patterns due to the hurricane that were not obvious.  To do this, analysts might examine the huge volume of Wal-Mart data from prior, similar situations (such as Hurricane Charley) to identify unusual local demand for products. 9 www.kellytechno.com
  • 10. EXAMPLE: HURRICANE FRANCES  From such patterns, the company might be able to anticipate unusual demand for products and rush stock to the stores ahead of the hurricane’s landfall. Indeed, that is what happened.  The New York Times (Hays, 2004) reported that:”…the experts mined the data and found that the stores would indeed need certain products-and not just the usual flashlights. “We didn’t know in the past that strawberry PopTarts increase in sales, like seven times their normal sales rate, ahead of a hurricane’, Ms. Dillman said in a recent interview. “And the pre-hurricane top- selling item was beer.” 10 www.kellytechno.com
  • 11. EXAMPLE: PREDICTING CUSTOMER CHURN  How are such data analyses performed? Consider a second, more typical business scenario and how it might be treated from a data perspective.  Assume you just landed a great analytical job with MegaTelCo, one of the largest telecommunication firms in the United States.  They are having major problem with customer retention in their wireless business. In the mid-Atlantic region, 20% of cell phone customers leave when their contracts expire, and it is getting increasingly difficult to acquire new customers.  Since the cell phone market is now saturated, the huge growth in the wireless market has tapered off. 11 www.kellytechno.com
  • 12. EXAMPLE: PREDICTING CUSTOMER CHURN  Communications companies are now engaged in battles to attract each other’s customers while retaining their own.  Customers switching from one company to another is called churn, and it is expensive all around: one company must spend on incentives to attract a customer while another company loses revenue when the customer departs.  You have been called in to help understand the problem and to devise a solution.  Attracting new customers is much more expensive than retaining existing ones, so a good deal of marketing budget is allocated to prevent churn. 12 www.kellytechno.com
  • 13. EXAMPLE: PREDICTING CUSTOMER CHURN  Marketing has already designed a special retention offer. Your task is to devise a precise, step-by-step plan for how the data science team should use MegaTelCo’s vast data resources to decide which customers should be offered the special retention deal prior to the expiration of their contract.  Think carefully about what data you might use and how they would be used. Specifically, how should MegaTelCo choose a set of customers to receive their offer in order to best reduce churn for a particular incentive budget? Answering this question is much more complicated than it may seem initially. 13 www.kellytechno.com
  • 14. DATA SCIENCE, ENGINEERING, AND DATA- DRIVEN DECISION MAKING  Data science involves principles, processes, and techniques for understanding phenomena via the (automated) analysis of data.  In this book, we will view the ultimate goal of data science as improving decision making, as this generally is of direct interest to business. 14 www.kellytechno.com
  • 15. DATA SCIENCE, ENGINEERING, AND DATA- DRIVEN DECISION MAKING  Figure 1-1 places data science in the context of various other closely related and data related processes in the organization.  It distinguishes data science from other aspects of data processing that are gaining increasing attention in business. Let’s start at the top. 15 www.kellytechno.com
  • 16. DATA SCIENCE, ENGINEERING, AND DATA- DRIVEN DECISION MAKING  Data-driven decision-making (DDD) refers to the practice of basing decisions on the analysis of data, rather than purely on intuition.  For example, a marketer could select advertisements based purely on her long experience in the field and her eye for what will work. Or, she could base her selection on the analysis of data regarding how consumers react to different ads.  She could also use a combination of these approaches. DDD is not an all-or-nothing practice, and different firms engage in DDD to greater or lesser degrees. 16 www.kellytechno.com
  • 17. DATA SCIENCE, ENGINEERING, AND DATA- DRIVEN DECISION MAKING  Economist Erik Brynjolfsson and his colleagues from MIT and Penn’s Wharton School conducted a study of how DDD affects firm performance (Brynjolfsson, Hitt, &Kim,2011).  They developed a measure of DDD that rates firms as to how strongly they use data to make decisions across the company.  They show that statistically, the more data driven a firm is, the more productive it is-even controlling for a wide range of possible confounding factors.  And the differences are not small. One standard deviation higher on the DDD scale is associated with a 4%-6% increase in productivity. DDD also is correlated with higher return on assets, return on equity, asset utilization, and market value, and the relationship seems to be causal. 17 www.kellytechno.com
  • 18. DATA SCIENCE, ENGINEERING, AND DATA- DRIVEN DECISION MAKING  The sort of decisions we will be interested in this book mainly fall into two type:  (1) decisions for which “discoveries” need to be made within data, and  (2) decisions that repeat, especially at massive scale, and so decision- making can benefit from even small increases in decision-making accuracy based on data analysis.  The Walmart example above illustrates a type 1 problem: Linda Dillman would like to discover knowledge that will help Walmart prepare for Hurricane Frances’s imminent arrival.  In 2012, Walmart’s competitor Target was in the news for a data- driven decision-making case of its own, also a type 1 problem (Duhigg, 2012). Like most retailers, Target cares about consumers’ shopping habits, what drives them, and what can influence them. 18 www.kellytechno.com
  • 19. DATA SCIENCE, ENGINEERING, AND DATA- DRIVEN DECISION MAKING  Consumers tend to have inertia in their habits and getting them to change is very difficult. Decision makers at Target knew, however, that the arrival of a new baby in a family is one point where people do change their shopping habits significantly.  In the Target analyst’s word, “As soon as we get them buying diapers from us, they’re going to start buying everything else too”. Most retailers know this and so they compete with each other trying to sell baby-related products to new parents. Since most birth records are public, retailers obtain information on births and send out special offers to the new parents. 19 www.kellytechno.com
  • 20. DATA SCIENCE, ENGINEERING, AND DATA- DRIVEN DECISION MAKING  However, Target wanted to get a jump on their competition. They were interested in whether they could predict that people are expecting a baby. If they could, they would gain an advantage by making offers before their competitors. Using techniques of data science, Target analyzed historical data on customers who later were revealed to have been pregnant.  For example, pregnant mothers often change their diets, their wardrobes, their vitamin regimens, and so on. These indicators could be extracted from historical data, assembled into predictive models, and then deployed in marketing campaigns. 20 www.kellytechno.com
  • 21. DATA SCIENCE, ENGINEERING, AND DATA- DRIVEN DECISION MAKING  We will discuss predictive models in much detail as we go through the book.  For the time being, it is sufficient to understand that a predictive model abstracts away most of the complexity of the world, focusing in on particular set of indicators that correlate in some way with a quantity of interest.  Importantly, in both the Walmart and the Target example, the data analysis was not testing a simple hypothesis. Instead, the data were explored with the hope that something useful would be discovered. 21 www.kellytechno.com
  • 22. DATA SCIENCE, ENGINEERING, AND DATA- DRIVEN DECISION MAKING  Our churn example illustrates type 2 DDD problem. MegaTelCo has hundreds of millions of customers, each a candidate for defection. Ten of millions of customers have contracts expiring each month, so each one of them has an increased likelihood of defection in the near future. If we improve our ability to estimate, for a given customer, how profitable it would be for us to focus on her, we can potentially reap large benefits by applying this ability to the millions of customers in the population.  This same logic applies to many of the areas where we have seen the most application of data science and data mining: direct marketing, online advertising, credit scoring, financial trading, help-desk management, fraud detection, search ranking, product recommendation, and so on. 22 www.kellytechno.com
  • 23. DATA SCIENCE, ENGINEERING, AND DATA- DRIVEN DECISION MAKING  The diagram in figure 1-1 shows data science supporting data- driven decision-making, but also overlapping with data-driven decision making. This highlights the often overlooked fact that, increasingly, business decisions are being made automatically by computer systems. Different industries have adopted automatic decision-making at different rates. The finance and telecommunications industries were early adopts, largely because of their precocious development of data networks and implementation of massive-scale computing, which allowed the aggregation and modeling of data at a large scale, as well as the application of the resultant models to decision-making. 23 www.kellytechno.com