Business Intelligence Presentation - Data Mining (2/2)
The document explores the concept of business intelligence and data mining, highlighting methods for analyzing historical data and making forecasts for future business scenarios. It outlines various data mining tasks, predictive models, and algorithms, alongside their applications in different industries such as marketing, finance, and medicine. The text also addresses potential challenges in data mining, including the risk of deriving incorrect or irrelevant conclusions from analyzed data.
How far can I go?
• By storing and analyzing historical data you can see just
one part of reality (the past and the present)
• Is there a way to answer questions not yet asked?
Can I look into the future?
• Can I predict how my business is going to work?
What about the market? And my customers?
Data Mining
• Is a process to extract patterns from data
• “We’re drowning in data but information
thirsty”
• Data Mining borrows techniques from
statistics, probability, maths, artificial
intelligence and other fields
Descriptive Models
• Association Rules / Affinity
• Looks for correlation indexes among
diverse associated elements
• Market Basket Analysis
• Clustering
• Groups items according to similarity
• “Automatic” classification
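As a rough illustration of Market Basket Analysis, the sketch below computes the three usual association-rule measures (support, confidence and lift) for a single rule. The baskets and item names are hypothetical, invented only for this example, and the helper names `support` and `rule_metrics` are not taken from any particular tool.

```python
# Hypothetical shopping baskets, invented purely for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), baskets)
    confidence = both / support(antecedent, baskets)
    lift = confidence / support(consequent, baskets)
    return both, confidence, lift


sup, conf, lift = rule_metrics({"bread"}, {"butter"}, baskets)
print(f"bread -> butter: support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

A lift above 1 suggests the two items appear together more often than they would if they were independent; a lift near 1 suggests no real association.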
Work Cycle
• Identify Business Opportunities
• Transform Data to Information
• Act with Information
• Measure Results
Data Mining and DWH
• The Data Warehouse unifies diverse data sources
in one common repository
• Before the DM process, you must have reliable data
sources
• Data must be presented in a way that eases analysis
Project Cycle
• Business Problem Formulation
• Data Gathering
• Data transformation and cleansing
• Model Construction
• Model Evaluation
• Reports and Prediction
• Application Integration
• Model Management
What is a Model?
• The model is a set of conclusions reached (in
mathematical format) after data processing
• Is used to extract knowledge and to compare it
to new data to reach new conclusions
• It has some efficiency percentage
• Must be adjusted to make helpful predictions
• It is time-constrained
Cases
Outlook    Temperature (C)   Humidity   Wind   Play Golf?
Sunny      29.4              85%        NO     No
Sunny      26.6              90%        YES    No
Overcast   28.3              78%        NO     Yes
Rainy      21.1              96%        NO     Yes
Rainy      20.0              80%        NO     Yes
Rainy      18.3              70%        YES    No
Overcast   17.7              65%        YES    Yes
Sunny      22.2              95%        NO     No
Sunny      20.5              70%        NO     Yes
Rainy      23.8              80%        NO     Yes
Sunny      23.8              70%        YES    Yes
Overcast   22.2              90%        YES    Yes
Overcast   27.2              75%        NO     Yes
Rainy      21.6              80%        YES    No
Model
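Outlook
• Overcast: Play Golf = YES
• Rainy: check Wind
  • Wind = NO: Play Golf = YES
  • Wind = YES: Play Golf = NO
• Sunny: check Humidity
  • Humidity <= 77.5: Play Golf = YES
  • Humidity > 77.5: Play Golf = NO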
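The decision tree above can be written as a few nested rules and checked against the 14 cases of the golf table. This is a minimal sketch: the function name `predict_play` and the tuple layout are assumptions made for illustration, and the Wind and Play Golf? columns are encoded as booleans.

```python
# The 14 cases from the table:
# (outlook, temperature_c, humidity_pct, wind, play_golf)
CASES = [
    ("Sunny", 29.4, 85, False, False),
    ("Sunny", 26.6, 90, True, False),
    ("Overcast", 28.3, 78, False, True),
    ("Rainy", 21.1, 96, False, True),
    ("Rainy", 20.0, 80, False, True),
    ("Rainy", 18.3, 70, True, False),
    ("Overcast", 17.7, 65, True, True),
    ("Sunny", 22.2, 95, False, False),
    ("Sunny", 20.5, 70, False, True),
    ("Rainy", 23.8, 80, False, True),
    ("Sunny", 23.8, 70, True, True),
    ("Overcast", 22.2, 90, True, True),
    ("Overcast", 27.2, 75, False, True),
    ("Rainy", 21.6, 80, True, False),
]


def predict_play(outlook, humidity_pct, wind):
    """Apply the tree: split on Outlook, then on Wind or Humidity."""
    if outlook == "Overcast":
        return True
    if outlook == "Rainy":
        return not wind
    if outlook == "Sunny":
        return humidity_pct <= 77.5
    raise ValueError(f"unknown outlook: {outlook!r}")


# Temperature is not used by the tree, so it is ignored here.
hits = sum(
    predict_play(outlook, humidity, wind) == play
    for outlook, _temp, humidity, wind, play in CASES
)
accuracy = hits / len(CASES)
print(f"{hits} of {len(CASES)} cases classified correctly ({accuracy:.0%})")
```

Note that this measures the model on the same cases it was built from, so it says little about how well the tree would do on new data.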
Data Mining Algorithms
• Naive Bayes
• Decision Trees
• Autoregressive trees (ARTXP) and ARIMA
• K-Means
• Kohonen Maps
• Neural Networks
• Logistic regression
• Time Series
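To give a feel for the first algorithm in the list, here is a minimal Naive Bayes sketch that uses only the Outlook and Wind columns of the golf cases. The function names `train` and `posterior` are hypothetical, and the add-one (Laplace) smoothing controlled by `alpha` is a choice made for this example so that an unseen combination does not get a probability of zero.

```python
from collections import Counter, defaultdict

# (outlook, wind, play_golf) taken from the 14 golf cases.
ROWS = [
    ("Sunny", "NO", "No"), ("Sunny", "YES", "No"), ("Overcast", "NO", "Yes"),
    ("Rainy", "NO", "Yes"), ("Rainy", "NO", "Yes"), ("Rainy", "YES", "No"),
    ("Overcast", "YES", "Yes"), ("Sunny", "NO", "No"), ("Sunny", "NO", "Yes"),
    ("Rainy", "NO", "Yes"), ("Sunny", "YES", "Yes"), ("Overcast", "YES", "Yes"),
    ("Overcast", "NO", "Yes"), ("Rainy", "YES", "No"),
]


def train(rows):
    """Count classes and, per (feature index, class), the feature values seen."""
    class_counts = Counter(play for *_, play in rows)
    feature_counts = defaultdict(Counter)
    values = defaultdict(set)
    for *features, play in rows:
        for i, value in enumerate(features):
            feature_counts[(i, play)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values


def posterior(features, model, alpha=1.0):
    """Return P(class | features), assuming features are independent per class."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n_cls in class_counts.items():
        score = n_cls / total  # prior P(class)
        for i, value in enumerate(features):
            seen = feature_counts[(i, cls)][value]
            score *= (seen + alpha) / (n_cls + alpha * len(values[i]))
        scores[cls] = score
    norm = sum(scores.values())
    return {cls: s / norm for cls, s in scores.items()}


model = train(ROWS)
sunny_windy = posterior(("Sunny", "YES"), model)
overcast_calm = posterior(("Overcast", "NO"), model)
print("Sunny + wind:", sunny_windy)
print("Overcast + no wind:", overcast_calm)
```

Unlike the decision tree, the result is a probability for each outcome rather than a hard YES/NO, which can be useful when the cost of a wrong prediction matters.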
Where can I use them?
• Marketing: Segmentation, Campaigns, Results,
Loyalty,...
• Sales: Behaviour detection, Sales habits
• Finance: Investments, Portfolio Management
• Banks and Insurance: Credit Check
• Security: Fraud Detection
• Medicine: Possible treatment analysis
• Manufacturing: Quality Control
• Internet: Click analysis, Text Mining
Data Mining and CRM (1)
• Detect the best prospects / customers
• Select the best communication channel for
prospects / customers
• Select an appropriate message for
prospects / customers
• Cross-selling, Up-selling and sales
recommendation engines
Data Mining and CRM (2)
• Improve direct marketing campaign results
• Customer base segmentation
• Reduce credit risk exposure
• Customer Lifetime Value
• Customer retention and loss
Classification
• Customers by purchase behaviour
• Customers by payment behaviour
• Customers by resources devoted/needed
to their service
• Customers by credit profile
• Customers by attention required
Prediction / Forecasting
• Revenue Projection
• Payment Projection
• Products Sold Projection
• Cash Flow Projection
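One of the simplest ways to produce a projection like those listed above is to fit a straight-line trend to past values and extend it one period ahead. The sketch below does this with ordinary least squares; the monthly revenue figures are hypothetical, and a linear trend is only an assumption that real data may not satisfy (seasonality, for instance, is ignored).

```python
def fit_line(ys):
    """Least-squares line y = intercept + slope * x for x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical monthly revenue, invented for illustration only.
revenue = [100.0, 112.0, 118.0, 131.0, 140.0]

intercept, slope = fit_line(revenue)
next_month = intercept + slope * len(revenue)
print(f"trend: {slope:.1f} per month, projection for next month: {next_month:.1f}")
```

The same function could be applied to payments, units sold or cash flow, as long as the series is roughly linear over the window used.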
Some other DM cases
• Key Influencers
• Predictions Calculator
Some Possible Problems (1)
• To learn things that are not true
• The patterns may not represent any underlying rule
• The model may not represent a relevant number of
examples
• Data may be at a level of detail that is insufficient for analysis
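The risk that a pattern does not represent any underlying rule can be shown with pure noise. The sketch below generates a random target and many random candidate variables for a handful of cases, then reports the strongest correlation it can find. All numbers are random, the sizes (10 cases, 200 variables) are arbitrary choices for the example, and the helper `pearson` is written out by hand rather than taken from a library.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n_cases, n_features = 10, 200

# Nothing here is related to anything else: every value is independent noise.
target = [rng.random() for _ in range(n_cases)]
features = [[rng.random() for _ in range(n_cases)] for _ in range(n_features)]

best = max(abs(pearson(feature, target)) for feature in features)
print(f"strongest correlation found in pure noise: {best:.2f}")
```

With few cases and many candidate variables, some variable will look strongly related to the target by chance alone, which is why a model should be evaluated on data it has not seen.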
Possible Problems... (2)
• To learn things that are true, but not
useful
• Learn things that we already knew
• Learn things that cannot be applied