Salford Systems - On the Cutting Edge of Technology
Vladyslav Frolov
Dan Steinberg and Mikhail Golovnya present Data Mining and Machine Learning in practice.
1.
Salford Systems: On the Cutting Edge of Technology
Dan Steinberg and Mikhail Golovnya, May 2016
2.
Salford Systems: Company Overview
• Founded 1983 by Dan Steinberg
  o First products were SAS® procedures: MLOGIT, MPROBIT
• 1993 CART Version 1.0
• 1994-1997 Extensive data mining consulting
• 1999 MARS Version 1.0
• 2000 CART 4.0, MARS 2.0
• 2002 CART 5.0, TreeNet 1.0
• 2004 Random Forests
• 2005 TreeNet 2.0
• 2006 CART 6.0
Salford Systems ©2014 Company and Product Line Overview
3.
Recent Product Line Development
• 2008-2013 Series of Products
  o CART® 6.0 in Standard, EX, and PRO versions
  o TreeNet® 2.0 PRO EX version
  o MARS® 3.0 with Time Series Support
  o RandomForests® 2.0
  o Generalized Path Seeker
• 2014 Data Mining Suite (integrates all tools)
  o SPM 7.0
  o Multithreading and modular design
  o Engine APIs
  o Ongoing work on SPM 8.0 and Big Data approaches
4.
Salford Competitive Awards
• 2010 Direct Marketing Association. 1st place*
• 2009 KDDCup IDAnalytics*, and FEG Japan* 1st Runner Up
• 2008 DMA Direct Marketing Association 1st Runner Up
• 2007 Pacific Asia PAKDD: Credit Card Cross Sell. 1st place
• 2006 DMA Direct Marketing Association: Predictive Modeling*
• 2006 PAKDD Pacific Asia KDD: Telco Customer Type Profiling
• 2005 BI-Cup Latin America: Predictive Modeling E-commerce* 1st place
• 2004 KDDCup: Predictive Modeling “Most Accurate”*
• 2002 NCR/Teradata Duke University: Predictive Modeling-Churn
  o all four separate predictive modeling challenges 1st place
• 2000 KDDCup: Predictive Modeling-Online behavior 1st place
• 2000 KDDCup: CRM Analysis 1st place
*Won either by Salford or by client using Salford tools
5.
Salford Systems: R&D Staff and Partners
• Dan Steinberg, PhD Econometrics, Harvard (CART, MARS, Discrete Choice)
• Nicholas Scott Cardell, PhD Econometrics, Harvard (Data Mining, Discrete Choice)
• Jerome H. Friedman, Stanford University (algorithm coder CART, MARS, TreeNet, HotSpotDetector)
• Leo Breiman, UC Berkeley (algorithm developer, ensembles of trees, randomization techniques to improve trees)
• Richard Olshen, Stanford University (Survival CART, Tree-Based Clustering)
• Charles Stone, UC Berkeley (CART large sample theory)
• Richard Carson, UC San Diego (Visualization Methods, Super Computer methods)
6.
Machine Learning Defined
• Machine Learning is the search for patterns in data using modern, highly automated, computer-intensive methods
  o Data mining may be best defined as the use of a specific class of tools (data mining methods) in the analysis of data
  o The term “search” is key to this definition, as is “automated”
• The literature often refers to finding hidden information in data
• We will focus on patterns that allow us to accomplish two tasks (this is known as “supervised learning”)
  o Classification
  o Regression
• There is also a third common task (this is known as “unsupervised learning”)
  o Finding groups in data (clustering, density estimation)
Introduction to SPM, Salford Systems ©2014
7.
The Essence of Machine Learning
• In a nutshell: Use historical data to gain insights and/or make predictions on the new data
[Diagram labels: Population, Historical Data, DM Engine, Analyst, Model, Scoring, New Data, Insights, Predictions]
8.
Boston Housing Data Set
• Concerns the housing values in Boston area
• Harrison, D. and D. Rubinfeld. Hedonic Prices and the Demand For Clean Air. Journal of Environmental Economics and Management, v5, 81-102, 1978
• Combined information from 10 separate governmental and educational sources to produce this data set
• 506 census tracts in City of Boston for the year 1970
  o Goal: study relationship between quality of life variables and property values
  o MV: median value of owner-occupied homes in tract ($1,000’s)
  o CRIM: per capita crime rates
  o NOX: concentration of nitric oxides (parts per 10 million)
  o AGE: percent built before 1940
  o DIS: weighted distance to centers of employment
  o RM: average number of rooms per house
  o LSTAT: % lower status of the population
  o RAD: accessibility to radial highways
  o CHAS: borders Charles River (0/1)
  o INDUS: percent non-retail business
  o TAX: property tax rate per $10,000
  o PT: pupil teacher ratio
9.
Target: Median House Value (MV)
• The distribution of the target variable (in thousands $)
• Clear manifestation of the inflation over the past 40 years
[Figure: distribution of MV]
10.
Mutual Dependency
• The data violates all conventional modeling assumptions
• Clearly some non-normal distributions and non-linear relationships
11.
OLS Regression
• OLS – ordinary least squares regression
  o Discovered by Legendre (1805) and Gauss (1809) to solve problems in astronomy using pen and paper
  o Solid statistical foundation by Fisher in 1920s
  o 1950s – use of electro-mechanical calculators
• The model is always of the form
  Response = A + B1X1 + B2X2 + B3X3 + …
• The response surface is a hyper-plane!
• A – the intercept term
• B1, B2, B3, … – parameter estimates
• A usually unique combination of values exists which minimizes the mean squared error of predictions on the learn sample
• Step-wise approaches to determine model size
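The least-squares fit described on this slide can be sketched in a few lines. This is a minimal illustration on made-up, noise-free data, not SPM output; the helper names (`ols_fit`, `ols_predict`, `mse`) and the toy learn/test rows are assumptions introduced here. It solves the normal equations for A, B1, B2 and reports the mean squared error on a learn and a test sample.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_fit(rows, y):
    """Return [A, B1, B2, ...] minimizing squared error on the learn sample."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def ols_predict(coefs, row):
    """Response = A + B1*X1 + B2*X2 + ..."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))


def mse(coefs, rows, y):
    """Mean squared error of the fitted hyper-plane on a sample."""
    return sum((ols_predict(coefs, r) - t) ** 2 for r, t in zip(rows, y)) / len(y)


# Hypothetical toy data generated from Response = 1 + 2*X1 - 3*X2 (no noise).
learn_rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
learn_y = [1 + 2 * x1 - 3 * x2 for x1, x2 in learn_rows]
test_rows = [(3, 1), (2, 2)]
test_y = [1 + 2 * x1 - 3 * x2 for x1, x2 in test_rows]

coefs = ols_fit(learn_rows, learn_y)
learn_mse = mse(coefs, learn_rows, learn_y)
test_mse = mse(coefs, test_rows, test_y)
print("A, B1, B2 =", [round(c, 6) for c in coefs])
print("LEARN MSE =", learn_mse, "TEST MSE =", test_mse)
```

Because the toy data lie exactly on a plane, both errors come out at essentially zero; on real data the learn and test MSE would differ, and comparing them is the check for agreement.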
12.
OLS on Boston Data
• 414 records in the learn sample
• 92 records in the test sample
• Good agreement
  o LEARN MSE = 27.455
  o TEST MSE = 26.147
[Figure: 3-variable solution, with coefficients -0.597, +5.247, -0.858]
13.
Unique Personalities – the “Founding Fathers” of CART
• Leo Breiman
• Jerome Friedman
• Richard Olshen
• Charles Stone
14.
1984 CART Monograph
© Copyright Salford Systems 1999-2015
15.
Regression Tree Model
• All cases in the given node are assigned the same predicted response – the node average of the original target
• Nodes are color-coded according to the predicted response
• We have a convenient segmentation of the population according to the average response levels
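The "node average" idea can be sketched with the smallest possible regression tree: one split on one predictor, with each side predicting its own mean. This is an illustrative sketch, not the CART implementation; the function name `best_split` and the toy numbers are assumptions made for the example.

```python
def node_mean(values):
    """Predicted response for a node: the average target of its cases."""
    return sum(values) / len(values)


def within_node_sse(values):
    """Sum of squared deviations from the node average."""
    m = node_mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y):
    """Find the threshold on x that minimizes total within-node squared error.

    Returns (threshold, left_mean, right_mean); cases with x <= threshold
    fall into the left node.
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical x values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        cost = within_node_sse(left) + within_node_sse(right)
        if best is None or cost < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cost, threshold, node_mean(left), node_mean(right))
    if best is None:
        raise ValueError("need at least two distinct x values to split")
    return best[1:]


# Hypothetical toy data: two clearly separated segments.
x = [1, 2, 3, 10, 11, 12]
y = [5, 6, 7, 20, 21, 22]
threshold, left_mean, right_mean = best_split(x, y)
print(threshold, left_mean, right_mean)
```

A full tree repeats this search inside each resulting node; the terminal nodes then form the population segments, each labeled with its average response.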
16.
The Best and the Worst Segments
17.
Gradient Boosting
• Begin with a very small tree as initial model
• Compute “residuals” (prediction errors) for this simple model for every record in data
• Grow a second small tree to predict the residuals from the first tree
• Compute residuals from this new 2-tree model and grow a 3rd tree to predict revised residuals
• Repeat this process to grow a sequence of trees
  Tree 1 + Tree 2 + Tree 3 + … (more trees)
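The residual-fitting loop on this slide can be sketched as follows. This is a simplified illustration for squared-error loss using 2-node trees on a single predictor; it is not the TreeNet algorithm, and it omits refinements such as a learning rate or subsampling. The names `fit_stump`, `stump_predict`, and `boost` and the toy data are assumptions.

```python
def fit_stump(x, residuals):
    """Fit a 2-node tree: one threshold on x, each side predicts its mean residual."""
    pairs = sorted(zip(x, residuals))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue
        left = [r for _, r in pairs[:i]]
        right = [r for _, r in pairs[i:]]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        cost = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or cost < best[0]:
            best = (cost, (pairs[i - 1][0] + pairs[i][0]) / 2, lm, rm)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1:]  # (threshold, left_mean, right_mean)


def stump_predict(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def boost(x, y, n_trees):
    """Grow a sequence of small trees, each fit to the current residuals."""
    trees = []
    predictions = [0.0] * len(y)  # before any tree, the residuals equal y itself
    history = []                  # learn-sample MSE after each added tree
    for _ in range(n_trees):
        residuals = [t - p for t, p in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        trees.append(stump)
        predictions = [p + stump_predict(stump, xi) for p, xi in zip(predictions, x)]
        history.append(sum((t - p) ** 2 for t, p in zip(y, predictions)) / len(y))
    return trees, history


# Hypothetical toy data: a curved relationship that no single stump can capture.
x = [float(i) for i in range(10)]
y = [xi ** 2 for xi in x]
trees, history = boost(x, y, n_trees=20)
print("MSE after 1 tree:", round(history[0], 3))
print("MSE after 20 trees:", round(history[-1], 3))
```

The final model is the sum of all trees, so a prediction for a new `xi` is `sum(stump_predict(t, xi) for t in trees)`.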
18.
Illustration: Saddle Function
• 500 {X1,X2} points randomly drawn from a [-3,+3] box to produce the XOR response surface Y = X1 * X2
• Will use 3-node trees to show the evolution of TreeNet response surface
[Figure panels: response surface after 1, 2, 3, 4, 10, 20, 30, 40, 100, and 195 trees]
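The saddle data are easy to regenerate. The sketch below draws 500 points from the [-3,+3] box and checks the XOR-like pattern: the response is positive where X1 and X2 share a sign and negative where they differ. The seed and the helper name `quadrant_means` are assumptions for illustration; the original sample used in the slides is not available.

```python
import random

rng = random.Random(0)  # fixed seed (an arbitrary choice) for reproducibility
points = [(rng.uniform(-3, 3), rng.uniform(-3, 3)) for _ in range(500)]
response = [x1 * x2 for x1, x2 in points]


def quadrant_means(points, response):
    """Average response in each quadrant, keyed by (X1 > 0, X2 > 0)."""
    sums = {}
    counts = {}
    for (x1, x2), y in zip(points, response):
        key = (x1 > 0, x2 > 0)
        sums[key] = sums.get(key, 0.0) + y
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


means = quadrant_means(points, response)
for key in sorted(means):
    print(key, round(means[key], 3))
```

Because the sign of Y depends on both predictors jointly, neither X1 nor X2 alone says much about Y, which is why a sequence of multi-split trees is needed to recover the surface.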
19.
Delinquency Dataset
VARIABLE        DESCRIPTION
DELINQUENT      Person experienced 90 days past due delinquency or worse
AGE             Age of borrower in years
DEBT_RATIO      Monthly debt payments, alimony, living costs divided by monthly gross income
MONTH_INCOME    Monthly income
N_OPEN_LINES    Number of open loans (mortgages, car loans, credit cards, etc.)
N_MORTGAGES     Number of mortgage and real estate loans
N_DEPENDENTS    Number of dependents in family excluding yourself
Salford Systems ©2015 Advanced Uses of SPM
20.
TreeNet Model for Delinquency
• Here we show the performance of the 6-variable TreeNet model compared to the performance of the equivalent Logistic Regression model
• TreeNet has a clear edge of 5 points over the Logistic Regression in terms of ROC-area!
[Figure panels: TreeNet, Logistic Regression]
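For reference, the ROC area used in this comparison can be computed directly from scores and labels as the fraction of (positive, negative) pairs that the model ranks correctly, with ties counted as half. The sketch below uses made-up labels and scores for two hypothetical models; it does not reproduce the delinquency results.

```python
def roc_area(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    labels: 1 for the event (e.g. delinquent), 0 otherwise.
    scores: model scores, where higher means more likely to be an event.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores from two models on the same four records.
labels = [0, 0, 1, 1]
scores_model_a = [0.1, 0.4, 0.35, 0.8]
scores_model_b = [0.1, 0.2, 0.6, 0.8]
area_a = roc_area(labels, scores_model_a)
area_b = roc_area(labels, scores_model_b)
print(area_a, area_b)
```

The pairwise loop is quadratic in the sample size, which is fine for a sketch; a rank-based formula gives the same value more efficiently on large samples.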
21.
Gathering Up Transformations
• TreeNet provides powerful insights into the inner workings of the model by constructing partial dependence plots
• We can now use the partial dependence plots to construct 1st- and 2nd-order univariate spline transformations
• The resulting transforms are added to the dataset and the code is saved for future use
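One common way to write a univariate spline transformation is a truncated power basis, sketched below. This is an assumption about the general technique, not the code SPM generates; the function name `spline_features`, the knot locations, and the example value are hypothetical. In practice the knots would be placed where the partial dependence plot changes slope.

```python
def spline_features(value, knots, order=1):
    """Truncated power basis for one predictor.

    Returns value**d for d = 1..order, followed by max(0, value - knot)**order
    for each knot. order=1 gives piecewise-linear terms, order=2 piecewise-quadratic.
    """
    features = [value ** d for d in range(1, order + 1)]
    features += [max(0.0, value - k) ** order for k in knots]
    return features


# Hypothetical knots for AGE, as if read off a partial dependence plot.
age_knots = [30, 60]
first_order = spline_features(45, age_knots, order=1)
second_order = spline_features(45, age_knots, order=2)
print(first_order)
print(second_order)
```

The resulting columns can be appended to the dataset and fed to a logistic regression, which then fits a flexible curve in each variable while remaining additive.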
22.
Enhanced Logistic Regression Model
• We can now build a logistic regression model using the transformed features
• Our new model almost completely recovers performance of the original unconstrained TreeNet model!
• This is because the data exhibit virtually no interactions, which can be easily confirmed by building a constrained additive model in TreeNet
23.
Salford Predictive Modeler SPM
• Download a current version from our website http://www.salford-systems.com
• Version will run without a license key for 10 days
• Request a license key from unlock@salford-systems.com
• Request configuration to meet your needs
  o Data handling capacity
  o Data mining engines made available