This document discusses data mining techniques for customer relationship management (CRM). It defines data mining as the extraction of implicit and novel knowledge from large datasets. The document outlines common data mining applications in retail, banking, telecommunications and other industries. It also discusses how data mining can be used across different stages of the customer lifecycle in CRM, such as up-selling, cross-selling and customer retention. Finally, it provides an overview of common predictive and descriptive data mining techniques like decision trees, rule induction, clustering and association rule mining.
What is Data Mining?
The process of determining useful patterns and relationships in big data through algorithms, to extract knowledge from data warehouses.
Role of Data Mining in CRM:
1. Get a holistic view of customer life-cycle
2. More data will result in accurate models
3. Leverage forecasting and descriptive modeling techniques
4. Have a proactive approach with predictive analytics
How will Data Mining Benefit my Business?
1. Conduct Basket Analysis for better stocking, store layout, and promotion strategies
2. Sales Forecasting to optimize your sales, supply chain and financial management
3. Predictive Lifecycle Management for each customer to segment them accordingly
4. Optimal Allocation of company’s resources for enhanced productivity and better ROI
5. Product Customization by predicting features that meet the customer’s demands
6. Database Marketing for a defined target market, personalized campaigns, and promotional offers
7. Predict Warranty Claims by the customers and their average cost to have efficient management of funds
8. Fraud detection by analyzing past fraudulent transactions and taking corrective measures
Techniques of Data Mining in CRM
1. Clustering – Identify similar data sets and understand both similarities and differences in data to increase conversion rates
2. Classification – Gather all information about a data set and classify it into proper categories.
3. Anomaly Detection – Information that does not match the expected behavior or projected pattern, providing actionable information
4. Association Rule Learning – Uncover hidden patterns from the data to better understand your customer’s habits and predict their decisions
5. Regression – Find dependency between different data items and map out the effect of variables on each other. It helps in determining customer satisfaction levels and its impact on customer loyalty
Read our entire blog here: https://www.rolustech.com/blog/data-mining-crm
Chapter 8 of Marketing 4.0: Moving from Traditional to Digital discusses the need for brands to adopt human qualities to attract customers in the human-centric era.
How to use your CRM for upselling and cross-selling (Redspire Ltd)
In order to really boost your business you need to be upselling and cross-selling to the customers who you know can best increase your margins.
While CRM is a great toolbox for the complete sales cycle - cold prospect to red-hot lead - Customer Relationship Management needs to be looked at from another perspective: cross-selling and up-selling. In today’s market, concentrating on the customers who know you best can increase turnover and margins faster than any headline-grabbing push into new sales territories.
What is data mining?
Why is data mining required?
Data mining Applications
Data mining in Retail Industry
Marketing
Risk Management
Fraud Detection
Customer Acquisition and Retention
Text mining to correct missing CRM information: a practical data science project (Jonathan Sedar)
20min talk given at PyData London 2014
A client in the energy sector wanted to create predictive behavioural models of business customers at the company level, but the CRM data was messy, often containing several sub-accounts for each business, without any grouping identifiers, and so aggregation was impossible. In this talk I describe a short project where we used text mining, a handful of unsupervised learning techniques and pragmatic use of human skill, to identify the true company level structures in the CRM data.
We estimate that nearly one third of news articles contain references to future events. While this information can prove crucial to understanding news stories and how events will develop for a given topic, there is currently no easy way to access this information. We propose a new task to address the problem of retrieving and ranking sentences that contain mentions of future events, which we call ranking related news predictions. In this paper, we formally define this task and propose a learning-to-rank approach based on 4 classes of features: term similarity, entity-based similarity, topic similarity, and temporal similarity. Through extensive evaluations using a corpus consisting of 1.8 million news articles and 6,000 manually judged relevance pairs, we show that our approach is able to retrieve a significant number of relevant predictions related to a given topic.
Recommender Systems and Active Learning (Dain Kaplan)
This presentation presents a high level overview of recommender systems and active learning, including from the viewpoint of startups vs. established companies, the cold-start problem, etc.
Online recommendations at scale using matrix factorisation (Marcus Ljungblad)
This presentation was used for my thesis defense held at Universidad Politecnica de Catalunya, Spain, for a double-degree master programme in Distributed Computing. The other two universities participating in the programme are Royal Institute of Technology, Stockholm, Sweden and Instituto Tecnico Superior, Lisbon, Portugal.
To download please go to: http://www.intelligentmining.com/category/knowledge-base/
Slides as presented by Alex Lin to the NYC Predictive Analytics Meetup group: http://www.meetup.com/NYC-Predictive-Analytics/ on Dec. 10, 2009.
Data Mining: What is Data Mining?
History
How does data mining work?
Data Mining Techniques.
Data Mining Process.
(The Cross-Industry Standard Process)
Data Mining: Applications.
Advantages and Disadvantages of Data Mining.
Conclusion.
Data Mining Concepts with Customer Relationship Management (IJERA Editor)
Data mining is important in creating a great experience at e-business. Data mining is the systematic way of extracting information from data. Many of the companies are developing an online internet presence to sell or promote their products and services. Most of the internet users are aware of on-line shopping concepts and techniques to own a product. The e-commerce landscape is the relation between customer relationship management (sales, marketing & support), internet and suppliers.
Consumer Behavior project. Examine and define best ways for Consumer Research Company (Equitec) to target and reach new customers, along with suggesting new ways for the company to market itself.
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
Opendatabay - Open Data Marketplace.pptx (Opendatabay)
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
1. Data Mining
Techniques for CRM
Seyyed Jamaleddin Pishvayi
Customer Relationship Management
Instructor : Dr. Taghiyare
Tehran University
Spring 1383
2. 2
Outlines
What is Data Mining?
Data Mining Motivation
Data Mining Applications
Applications of Data Mining in CRM
Data Mining Taxonomy
Data Mining Techniques
3. 3
Data Mining
The non-trivial extraction of novel, implicit, and actionable
knowledge from large datasets.
Extremely large datasets
Discovery of the non-obvious
Useful knowledge that can improve processes
Cannot be done manually
Technology to enable data exploration, data analysis, and data
visualization of very large databases at a high level of
abstraction, without a specific hypothesis in mind.
Sophisticated data search capability that uses statistical
algorithms to discover patterns and correlations in data.
5. 5
Data Mining (cont.)
Data Mining is a step of Knowledge Discovery in
Databases (KDD) Process
Data Warehousing
Data Selection
Data Preprocessing
Data Transformation
Data Mining
Interpretation/Evaluation
Data Mining is sometimes referred to as KDD; the terms
DM and KDD tend to be used as synonyms
7. 7
Data Mining is Not …
Data warehousing
SQL / Ad Hoc Queries / Reporting
Software Agents
Online Analytical Processing (OLAP)
Data Visualization
8. 8
Data Mining Motivation
Changes in the Business Environment
Customers becoming more demanding
Markets are saturated
Databases today are huge:
More than 1,000,000 entities/records/rows
From 10 to 10,000 fields/attributes/variables
Gigabytes and terabytes
Databases are growing at an unprecedented rate
Decisions must be made rapidly
Decisions must be made with maximum knowledge
9. 9
“The key in business is to know something that
nobody else knows.”
— Aristotle Onassis
“To understand is to perceive patterns.”
— Sir Isaiah Berlin
Data Mining Motivation
11. 11
Data Mining Applications:
Retail
Performing basket analysis
Which items customers tend to purchase together. This
knowledge can improve stocking, store layout strategies, and
promotions.
Sales forecasting
Examining time-based patterns helps retailers make stocking
decisions. If a customer purchases an item today, when are they
likely to purchase a complementary item?
Database marketing
Retailers can develop profiles of customers with certain
behaviors, for example, those who purchase designer-label
clothing or those who attend sales. This information can be used
to focus cost-effective promotions.
Merchandise planning and allocation
When retailers add new stores, they can improve merchandise
planning and allocation by examining patterns in stores with
similar demographic characteristics. Retailers can also use data
mining to determine the ideal layout for a specific store.
12. 12
Data Mining Applications:
Banking
Card marketing
By identifying customer segments, card issuers and acquirers
can improve profitability with more effective acquisition and
retention programs, targeted product development, and
customized pricing.
Cardholder pricing and profitability
Card issuers can take advantage of data mining technology to
price their products so as to maximize profit and minimize loss of
customers. Includes risk-based pricing.
Fraud detection
Fraud is enormously costly. By analyzing past transactions that
were later determined to be fraudulent, banks can identify
patterns.
Predictive life-cycle management
DM helps banks predict each customer’s lifetime value and
service each segment appropriately (for example, offering
special deals and discounts).
13. 13
Data Mining Applications:
Telecommunication
Call detail record analysis
Telecommunication companies accumulate detailed call
records. By identifying customer segments with similar use
patterns, the companies can develop attractive pricing and
feature promotions.
Customer loyalty
Some customers repeatedly switch providers, or “churn”, to
take advantage of attractive incentives by competing
companies. The companies can use DM to identify the
characteristics of customers who are likely to remain loyal
once they switch, thus enabling the companies to target
their spending on customers who will produce the most
profit.
14. 14
Data Mining Applications:
Other Applications
Customer segmentation
All industries can take advantage of DM to discover discrete
segments in their customer bases by considering additional
variables beyond traditional analysis.
Manufacturing
Through choice boards, manufacturers are beginning to
customize products for customers; therefore they must be able to
predict which features should be bundled to meet customer
demand.
Warranties
Manufacturers need to predict the number of customers who will
submit warranty claims and the average cost of those claims.
Frequent flier incentives
Airlines can identify groups of customers that can be given
incentives to fly more.
15. 15
Data Mining in CRM:
Customer Life Cycle
Customer Life Cycle
The stages in the relationship between a customer and a
business
Key stages in the customer lifecycle
Prospects: people who are not yet customers but are in
the target market
Responders: prospects who show an interest in a product
or service
Active Customers: people who are currently using the
product or service
Former Customers: may be “bad” customers who did not
pay their bills or who incurred high costs
It’s important to know life cycle events (e.g.
retirement)
16. 16
Data Mining in CRM:
Customer Life Cycle
What marketers want: Increasing customer
revenue and customer profitability
Up-sell
Cross-sell
Keeping the customers for a longer period of time
Solution: Applying data mining
17. 17
Data Mining in CRM
DM helps to
Determine the behavior surrounding a particular
lifecycle event
Find other people in similar life stages and
determine which customers are following similar
behavior patterns
18. 18
Data Mining in CRM (cont.)
[Figure: diagram linking Data Warehouse, Data Mining, and Campaign Management; labels shown: Customer Profile, Customer Life Cycle Info.]
19. 19
Data Mining in CRM:
More
Building Data Mining Applications for CRM
by Alex Berson, Stephen Smith, Kurt
Thearling (McGraw Hill, 2000).
21. 21
Two Good Algorithm Books
Intelligent Data
Analysis: An
Introduction
by Berthold and Hand
The Elements of
Statistical Learning:
Data Mining, Inference,
and Prediction
by Hastie, Tibshirani, and
Friedman
24. 24
Decision Trees
Data
height hair eyes class
short blond blue A
tall blond brown B
tall red blue A
short dark blue B
tall dark blue B
tall blond blue A
tall dark brown B
short blond brown B
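One way to see how a tree could be grown from the table above is to score each attribute by how well it separates class A from class B. The sketch below uses entropy-based information gain (the ID3 criterion); the slides do not say which splitting criterion was used, so that choice, and the names `ROWS`, `entropy`, and `information_gain`, are assumptions made for illustration.

```python
from collections import Counter
from math import log2

# The eight records from the slide: (height, hair, eyes, class).
ROWS = [
    ("short", "blond", "blue", "A"),
    ("tall", "blond", "brown", "B"),
    ("tall", "red", "blue", "A"),
    ("short", "dark", "blue", "B"),
    ("tall", "dark", "blue", "B"),
    ("tall", "blond", "blue", "A"),
    ("tall", "dark", "brown", "B"),
    ("short", "blond", "brown", "B"),
]
ATTRS = ("height", "hair", "eyes")


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, col):
    """Entropy reduction obtained by splitting rows on column index col."""
    base = entropy([r[-1] for r in rows])
    remainder = 0.0
    for value in {r[col] for r in rows}:
        subset = [r[-1] for r in rows if r[col] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Score every attribute and pick the one with the highest gain as the root split.
gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRS)}
best = max(gains, key=gains.get)
print(gains, best)
```

Under this criterion hair scores highest, because the red and dark branches are already pure; only the blond branch would need a further split. A different criterion, such as the Gini index, could be swapped in by replacing `entropy`.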
28. 28
Decision Trees:
Another Example
Total list
50% member
0-1 child 2-3 child
20% member
4+ children
$50-75k income
15% member
$75k+ income
70% member
$50-75k income $20-50k income
85% member
Age: 40-60
80% member
Age: 20-40
45% member
29. 29
Rule Induction
Try to find rules of the form
IF <left-hand-side> THEN <right-hand-side>
This is the reverse of a rule-based agent, where the rules are
given and the agent must act. Here the actions are given
and we have to discover the rules!
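Prevalence = probability that LHS and RHS
occur together (usually called “support”; “leverage” and “lift”
are related but distinct measures that compare this probability
with what independence of LHS and RHS would predict)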
Predictability = probability of RHS given LHS
(sometimes called “confidence” or “strength”)
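The two measures defined above can be computed directly by counting transactions. The following is a minimal sketch; the function name `rule_measures` and the four example baskets are hypothetical and not taken from the slides.

```python
def rule_measures(transactions, lhs, rhs):
    """Return (prevalence, predictability) for the rule IF lhs THEN rhs.

    prevalence     = P(LHS and RHS), often called support
    predictability = P(RHS | LHS),   often called confidence
    """
    lhs, rhs = set(lhs), set(rhs)
    n = len(transactions)
    n_lhs = sum(1 for t in transactions if lhs <= t)
    n_both = sum(1 for t in transactions if (lhs | rhs) <= t)
    prevalence = n_both / n
    predictability = n_both / n_lhs if n_lhs else 0.0
    return prevalence, predictability


# Hypothetical baskets, for illustration only.
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk"},
    {"bread"},
]
prev, pred = rule_measures(baskets, {"milk"}, {"bread"})
print(prev, pred)
```

Here milk and bread appear together in 2 of 4 baskets, and bread appears in 2 of the 3 baskets that contain milk.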
31. 31
Use of Rule Associations
Coupons, discounts
Don’t give discounts on 2 items that are frequently
bought together. Use the discount on 1 to “pull” the
other
Product placement
Offer correlated products to the customer at the same
time. Increases sales
Timing of cross-marketing
Send camcorder offer to VCR purchasers 2-3 months
after VCR purchase
Discovery of patterns
People who bought X, Y and Z (but not any pair)
bought W over half the time
32. 32
Finding Rule Associations
Algorithm
Example: grocery shopping
For each item, count # of occurrences (say out of 100,000)
apples 1891, caviar 3, ice cream 1088, …
Drop the ones that are below a minimum support level
apples 1891, ice cream 1088, pet food 2451, …
Make a table of each item against each other item:
            apples   ice cream   pet food
apples      1891     685         24
ice cream   -----    1088        322
pet food    -----    -----       2451
Discard cells below support threshold. Now make a cube for
triples, etc. Add 1 dimension for each product on LHS.
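The counting-and-pruning procedure above can be sketched with the slide's own numbers. The slide does not state the minimum support level, so `MIN_SUPPORT = 100` is an assumed value, and the variable names are illustrative only.

```python
from itertools import combinations

# Counts from the slide (out of 100,000 baskets).
item_counts = {"apples": 1891, "caviar": 3, "ice cream": 1088, "pet food": 2451}
pair_counts = {
    frozenset({"apples", "ice cream"}): 685,
    frozenset({"apples", "pet food"}): 24,
    frozenset({"ice cream", "pet food"}): 322,
}
MIN_SUPPORT = 100  # assumed threshold; not given in the source

# Step 1: drop single items below the minimum support level.
frequent_items = {i: c for i, c in item_counts.items() if c >= MIN_SUPPORT}

# Step 2: build candidate pairs from surviving items only, then prune again.
candidates = [frozenset(p) for p in combinations(sorted(frequent_items), 2)]
frequent_pairs = {
    p: pair_counts[p] for p in candidates if pair_counts.get(p, 0) >= MIN_SUPPORT
}

# Step 3: turn each surviving pair into two rules and score their confidence.
rules = {}
for pair, both in frequent_pairs.items():
    for lhs in pair:
        (rhs,) = pair - {lhs}
        rules[(lhs, rhs)] = both / item_counts[lhs]

for (lhs, rhs), conf in sorted(rules.items()):
    print(f"IF {lhs} THEN {rhs}: confidence {conf:.3f}")
```

With this threshold, caviar is dropped in the first pass and the apples/pet food cell (24) in the second. Extending to triples would repeat step 2 with candidates built from the surviving pairs.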
33. 33
Clustering
The art of finding groups in data
Objective: gather items from a database into
sets according to (unknown) common
characteristics
Much more difficult than classification since
the classes are not known in advance (no labeled
training data)
Technique: unsupervised learning
34. 34
The K-Means Clustering Method
[Figure: five scatter-plot panels, each with both axes running from 0 to 10, showing K-means iterations for K=2]
K=2
Arbitrarily choose K objects as the initial cluster centers
Assign each of the objects to the most similar center
Update the cluster means, reassign the objects, and repeat (the figure shows two update/reassign rounds)
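The choose, assign, update, reassign loop described on this slide can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance as the similarity measure and taking the first K points as the "arbitrary" initial centers; the function name `k_means` and the sample points are hypothetical.

```python
from math import dist


def k_means(points, k, max_iter=100):
    """Plain K-means on a list of coordinate tuples."""
    # "Arbitrarily choose K objects": here simply the first k points (an assumption).
    centers = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assign each object to the most similar (nearest) center.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # no object changed cluster, so the means are stable
        labels = new_labels
        # Update each cluster mean from its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centers, labels


# Hypothetical 2-D points forming two visible groups.
points = [(1, 1), (2, 1), (1, 2), (8, 8), (9, 8), (8, 9)]
centers, labels = k_means(points, k=2)
print(centers, labels)
```

Even though both initial centers start inside the lower-left group, the second round of reassignment separates the two groups. Results in general depend on the initial centers, which is why practical implementations usually try several starts.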