Data Mining by Example: Building a Predictive Model Using Microsoft Decision Trees
by Shaoli Lu

Microsoft Decision Trees
• Developed by Microsoft Research, the Microsoft Decision Trees algorithm is a hybrid decision tree algorithm that supports both classification and regression
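To make the idea of a decision tree split concrete, here is a minimal Python sketch of entropy-based information gain, one common way a tree learner decides which attribute to split on first. It is a generic illustration, not the Microsoft Decision Trees implementation; the toy prospects and the attribute names (`cars`, `region`, `buyer`) are made up for this example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting `rows` on `attribute`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical prospects: in this toy data, car ownership separates
# buyers from non-buyers perfectly, while region carries no signal.
prospects = [
    {"cars": 0, "region": "Europe", "buyer": 1},
    {"cars": 0, "region": "Pacific", "buyer": 1},
    {"cars": 2, "region": "Europe", "buyer": 0},
    {"cars": 2, "region": "Pacific", "buyer": 0},
]

# The attribute with the highest gain would become the first split.
best = max(["cars", "region"], key=lambda a: information_gain(prospects, a, "buyer"))
```

In this sketch `best` is the attribute a simple tree learner would place at the root, which is the sense in which a tree surfaces the dominant attributes.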

Goal
• To predict a prospect’s likelihood of
purchasing a bike

Prerequisites
• A SQL Server instance (2005 or later)
• SQL Server Analysis Services (SSAS) with the Multidimensional feature installed
(this hosts the mining structures and lets you browse them; a cube is not required for data mining)
• AdventureWorksDW database attached
(download from CodePlex; pick the release that matches your SQL Server version)
• Visual Studio 2010 or later with SQL Server Data Tools (SSDT) installed

My Demo Setup
• Visual Studio 2010
• SQL Server 2012

Create Data Mining Project
• Name the project DM Decision Trees
(DM = Data Mining)

Create Data Source and Impersonation

Create Mining Structure
• Choose the Microsoft Decision Trees algorithm
• Select the Data Source View
• Choose the training data
• Select the Input and Predictable columns
• Set the content types
• Set the holdout percentage (the share of cases reserved for testing)
• Name the mining structure and the model
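The holdout step can be pictured with a short Python sketch: a fixed percentage of the cases is set aside at random for testing, and the rest is used for training. This only illustrates the idea. The function name `split_holdout`, the 30 percent value, and the seed are assumptions for this example, and SSAS performs its own partitioning internally.

```python
import random

def split_holdout(cases, holdout_percent=30, seed=42):
    """Return (training, testing); testing gets about holdout_percent % of the cases."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(cases)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * holdout_percent / 100)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical case keys standing in for rows of the training table.
case_ids = list(range(1, 1001))
training, testing = split_holdout(case_ids, holdout_percent=30)
```

Every case lands in exactly one of the two sets, so the testing set can later measure accuracy on data the model never saw during training.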

Deploy the mining structure and model

Process the mining model
• This is also called “training the model”

Mining Model Viewer
• Identify the dominant attributes
• Splits closer to the left (the root of the tree) involve the more important attributes
• The rich visualization is also useful for data exploration

Mining Model Accuracy Chart
• This step is called “testing the model”; it uses the holdout data
• Lift chart
• Profit chart
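As a rough illustration of what a lift chart plots, the Python sketch below ranks holdout cases by predicted probability and reports what share of the actual buyers is captured in the top slice of the population. The function name `cumulative_gains` and the scored cases are invented for this example; the chart in the designer is computed by SSAS, not by code like this.

```python
def cumulative_gains(scored_cases, steps=10):
    """scored_cases: list of (predicted_probability, actually_bought) pairs.

    Returns (percent_of_population, percent_of_buyers_captured) points.
    Assumes at least one case is an actual buyer.
    """
    ranked = sorted(scored_cases, key=lambda c: c[0], reverse=True)
    total_buyers = sum(1 for _, bought in ranked if bought)
    points = []
    for step in range(1, steps + 1):
        cutoff = round(len(ranked) * step / steps)
        captured = sum(1 for _, bought in ranked[:cutoff] if bought)
        points.append((100 * step // steps, 100.0 * captured / total_buyers))
    return points

# Made-up holdout results: (model's probability of buying, did they buy?)
holdout = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, False),
           (0.4, False), (0.3, True), (0.2, False), (0.1, False), (0.05, False)]
curve = cumulative_gains(holdout, steps=5)
```

A model with no predictive power would capture buyers in proportion to the population contacted (20 percent of buyers in the top 20 percent); the gap between that diagonal and the curve is the lift.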

Mining Model Prediction
• Singleton query
• Mass prediction
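A singleton query is typically written in DMX. The Python sketch below builds the text of such a query so it could be sent from an application. The model name `DT Bike Buyer`, the predictable column `Bike Buyer`, and the input columns are assumptions for illustration; substitute the names used in your own mining model.

```python
def singleton_query(model, target, inputs):
    """Build DMX text for a singleton prediction against a mining model."""
    def literal(value):
        # Quote strings DMX-style, doubling embedded single quotes.
        if isinstance(value, str):
            return "'" + value.replace("'", "''") + "'"
        return str(value)

    columns = ", ".join(f"{literal(v)} AS [{name}]" for name, v in inputs.items())
    return (
        f"SELECT Predict([{target}]), PredictProbability([{target}])\n"
        f"FROM [{model}]\n"
        f"NATURAL PREDICTION JOIN\n"
        f"(SELECT {columns}) AS t"
    )

# Hypothetical model and column names.
query = singleton_query(
    "DT Bike Buyer",
    "Bike Buyer",
    {"Age": 35, "Commute Distance": "0-1 Miles"},
)
print(query)
```

A mass prediction follows the same pattern, but the inner SELECT of literal values is usually replaced by a query over a table of prospects, so that every row receives a prediction.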

Browse mining model on SQL Server
• Decision trees
• Dependency network

Summary
• Microsoft Decision Trees is a powerful data mining model, yet it is easy to build, train, and use
• It supports both singleton predictions (e.g. embedded in an app) and mass predictions (e.g. targeted marketing)
• Holdout data can be used to test the trained model
• Rich visualizations such as the Lift/Profit Charts and the Dependency Network facilitate analysis and data exploration
• A relational database can be used for data mining; a cube is not required
