SlideShare a Scribd company logo
1 of 10
Download to read offline
Salford Systems
General Features SPM 7.0 Feature Matrix
Components Basic Pro ProEx Ultra
Modeling Engine: CART (Decision Trees) o o o o
Modeling Engine: MARS (Nonlinear Regression) o o o o
Modeling Engine: TreeNet (Stochastic Gradient Boosting) o o o o
Modeling Engine: RandomForests for Classification o o o o
Reporting ROC curves during model building and model scoring o o o o
Model performance stats based on Cross Validation o o o o
Model performance stats based on out of bag data during
bootstrapping
o o o o
Reporting performance summaries on learn and test data
partitions
o o o o
Reporting Gains and Lift Charts during model building and model
scoring
o o o o
Automatic creation of Command Logs o o o o
Built-in support to create, edit, and execute command files o o o o
Translating models into SAS-compatible language o o o o
Reading and writing datasets in all current database/statistical
file formats
o o o o
Option to save processed datasets into all current
database/statistical file formats
o o o o
Automation: Build a series of models using every available data
mining engine (Battery MODELS)
o o o o
9685 Via Excelencia, Suite 208, San Diego, CA, 92126
Tel: 619-543-8880 Fax: 619-543-8888 E-Mail: info@salford-systems.com Web Site: www.salford-systems.com
Salford Systems
Components Basic Pro ProEx Ultra
Additional Modeling Engines: Regression, Logistic
Regression, RandomForests for Regession, Regularized
Regression (LASSO/Ridge/LARS/Elastic Net/GPS)
o o o
Automatic creation of missing value indicators o o o
Option to treat missing value in a categorical predictor as a new
level
o o o
License to any level supported by RAM (currently 32MB to 1TB) o o o
License for multi-core capabilities o o o
Using built-in BASIC Programming Language during data
preparation
o o o
Automatic creation of lag variables based on user specifications
during data preparation
o o o
Automatic creation and reporting of key overall and stratified
summary statistics for user supplied list of variables
o o o
Display charts, histograms, and scatter plots for user selected
variables
o o o
Command Line GUI Assistant to simplify creating and editing
command files
o o o
Translating models into SAS/PMML/C/Java/Classic o o o
An alternative to variable importance based on Leo Breiman's
scrambler
o o o
Unsupervised Learning - Breiman's column scrambler o o o
Scoring any Battery (pre-packaged scenario of runs) as an
ensemble model
o o o
Custom selection of a new predictors list from an existing
variable importance report
o o o
User defined bins for Cross Validation o o o
Automated imputation of all missing values o o o
Automation: Build two models reversing the roles of the learn and test
samples (Battery FLIP)
o o o
Automation: Explore model stability by repeated random drawing of the
learn sample from the original dataset (Battery DRAW)
o o o
Automation: For time series applications, build models based on
sliding time window using a large array of user options (Battery
DATASHIFT)
o o o
Automation: Explore mutual multivariate dependencies among
available predictors (Battery TARGET)
o o o
Automation: Explore the effects of the learn sample size on the model
performance (Battery SAMPLE)
o o o
9685 Via Excelencia, Suite 208, San Diego, CA, 92126
Tel: 619-543-8880 Fax: 619-543-8888 E-Mail: info@salford-systems.com Web Site: www.salford-systems.com
Salford Systems
Components Basic Pro ProEx Ultra
Automation: Explore alternative strategies to handling of missing
values (Battery MVI)
o o o
Automation: Check the validity of model performance using Monte
Carlo shuffling of the target (Battery MCT)
o o o
Automation: Build a series of models varying the number of bins for
Cross Validation (Battery CV)
o o o
Automation: Repeat Cross Validation process many times to explore
the variance of estimates (Battery CVR)
o o o
Automation: Build a series of models using a user-suppllied list of
binning variables for cross-validation (Battery CVBIN)
o o o
Automation: Build a series of models by varying the random number
seed (Battery SEED)
o o o
Automation: Explore the marginal contribution of each predictor to the
existing model (Battery LOVO)
o o o
Save out of bag predictions during Cross Validation o o
Automation: Generate detailed univariate stats on every
continuous predictor to spot potential outliers and problematic
records (Battery OUTLIERS)
o o
Automation: Convert (bin) all continuous variables into
categorical (discrete) versions using a large array of user options
(equal width, weights of evidence, Naïve Bayes, supervised)
(Battery BIN)
o o
Automation: Explore model stability by repeated repartitioning of the
data into learn, test, and possibly hold-out samples (Battery
PARTITION)
o o
Automation: Build a series of models using different backward variable
selection strategies (Battery SHAVING)
o o
Automation: Build a series of models using the forward-stepwise
variable selection strategy (Battery STEPWISE)
o o
Automation: Explore nonlinear univariate relationships between the
target and each available predictor (Battery ONEOFF, Battery XONY)
o o
Automation: Build a series of models using randomly sampled
predictors (Battery KEEP)
o o
Automation: Explore the impact of a potential replacement of a given
predictor by another one (Battery SWAP)
o o
9685 Via Excelencia, Suite 208, San Diego, CA, 92126
Tel: 619-543-8880 Fax: 619-543-8888 E-Mail: info@salford-systems.com Web Site: www.salford-systems.com
Salford Systems
Components Basic Pro ProEx Ultra
Automation: Explore the impact of penalty on categorical predictors
(Battery PENALTY=HLC)
o o
Automation: Explore the impact of penalty on missing values (Battery
PENALTY=MISSING)
o o
Automation: Bootstrapping process (sampling with replacement from
the learn sample) with a large array of user options (Random Forests-
style sampling of predictors, saving in-bag and out-of-bag scores,
proximity matrix, and node dummies) (Battery BOOTSTRAP)
o o
Automation: Parametric bootstrap process (Battery PBOOT) o o
Automation: Build a series of models for each strata defined in the
dataset (Battery STRATA)
o o
Automation: Build two linked models, where the first one predicts the
binary event while the second one predicts the amount (Battery
RELATED). For example, predicting whether someone will buy and how
much will be spent
o o
Automation: Build a series of models limiting the number of nodes in a
tree thus controlling the order of interactions (Battery NODES)
o o
Automation: Build a series of models varying the speed of learning
(Battery LEARNRATE)
o o
Automation: Build a series of models by progressively imposing
additivity on individual predictors (Battery ADDITIVE)
o o
Automation: Build a series of models utilizing different regression loss
functions (Battery TNREG)
o o
Automation: Build a series of models by varying subsampling fraction
(Battery TNSUBSAMPLE)
o o
Automation: Build a series of models using varying degree of penalty
on added variables (Battery ADDEDVAR)
o o
Modeling Pipelines: RuleLearner, ISLE o
Build a CART tree utilizing the TreeNet engine to gain speed as
well as alternative reporting
o
RandomForests inspired sampling of predictors at each node
during model building
o
Build a RandomForests model utilizing the TreeNet engine to
gain speed as well as alternative reporting
o
Build a Random Forests model utilizing the CART engine to gain
alternative handling of missing values via surrogate splits
(Battery BOOTSTRAP RSPLIT)
o
9685 Via Excelencia, Suite 208, San Diego, CA, 92126
Tel: 619-543-8880 Fax: 619-543-8888 E-Mail: info@salford-systems.com Web Site: www.salford-systems.com
Salford	
  Systems
	
  9685	
  Via	
  Excelencia,	
  Suite	
  208,	
  San	
  Diego,	
  CA,	
  92126
Tel:	
  619-­‐543-­‐8880	
  Fax:	
  619-­‐543-­‐8888	
  E-­‐Mail:	
  info@salford-­‐systems.com	
  Web	
  Site:	
  www.salford-­‐systems.com
Additional	
  CART	
  Features SPM	
  7.0	
  Feature	
  Matrix
Components Basic Pro ProEx Ultra
Modeling Engine: CART (Decision Trees) o o o o
Linear Combination Splits o o o o
Optimal tree selection based on area under ROC curve o o o o
User defined splits for the root node and its children o o o
Automation: Generate models with alternative handling of
missing values (Battery MVI)
o o o
Automation: Build a series of models using all available splitting
strategies (six for classification, two for regression) (Battery RULES)
o o o
Automation: Build a series of models varying the depth of the tree
(Battery DEPTH)
o o o
Automation: Build a series of models changing the minimum
required size on parent nodes (Battery ATOM)
o o o
Automation: Build a series of models changing the minimum
required size on child nodes (Battery MINCHILD)
o o o
Automation: Explore accuracy versus speed trade-off due to
potential sampling of records at each node in a tree (Battery
SUBSAMPLE)
o o o
Multiple user defined lists for linear combinations o o
Constrained trees o o
Ability to create and save dummy variables for every node in
the tree during scoring
o o
Report basic stats on any variable of user choice at every
node in the tree
o o
Comparison of learn vs. test performance at every node of
every tree in the sequence
o o
Hot-Spot detection to identify the richest nodes across multiple
trees
o o
Automation: Vary the priors for the specified class (Battery
PRIORS)
o o
Automation: Build a series of models limiting the number of nodes in
a tree (Battery NODES)
o o
Automation: Build a series of models trying each available predictor
as the root node splitter (Battery ROOT)
o o
Automation: Explore the impact of favoring equal sized child nodes
(Battery POWER)
o o
Salford	
  Systems
	
  9685	
  Via	
  Excelencia,	
  Suite	
  208,	
  San	
  Diego,	
  CA,	
  92126
Tel:	
  619-­‐543-­‐8880	
  Fax:	
  619-­‐543-­‐8888	
  E-­‐Mail:	
  info@salford-­‐systems.com	
  Web	
  Site:	
  www.salford-­‐systems.com
Components Basic Pro ProEx Ultra
Automation: Build a series of models by progressively removing
misclassified records thus increasing the robustness of trees and
posssibly reducing model complexity (Battery REFINE)
o o
Automation: Bagging and ARCing using the legacy code
(COMBINE)
o o
Build a CART tree utilizing the TreeNet engine to gain speed
as well as alternative reporting
o
Build a Random Forests model utlizing the CART engine to
gain alternative handling of missing values via surrogate splits
(Battery BOOTSTRAP RSPLIT)
o
Salford	
  Systems
	
  9685	
  Via	
  Excelencia,	
  Suite	
  208,	
  San	
  Diego,	
  CA,	
  92126
Tel:	
  619-­‐543-­‐8880	
  Fax:	
  619-­‐543-­‐8888	
  E-­‐Mail:	
  info@salford-­‐systems.com	
  Web	
  Site:	
  www.salford-­‐systems.com
Additional	
  MARS	
  Features SPM	
  7.0	
  Feature	
  Matrix
Components Basic Pro ProEx Ultra
Modeling Engine: MARS (Nonlinear Regression) o o o o
Automation: Build a series of models varying the maximum
number of basis functions (Battery BASIS)
o o o
Automation: Build a series of models varying the smoothness
parameter (Battery MINSPAN)
o o
Automation: Build a series of models varying the order of
interactions (Battery INTERACTIONS)
o o
Automation: Build a series of models varying the modeling speed
(Battery SPEED)
o o
Automation: Build a series of models using varying degree of
penalty on added variables (Battery PENALTY MARS)
o
Salford	
  Systems
	
  9685	
  Via	
  Excelencia,	
  Suite	
  208,	
  San	
  Diego,	
  CA,	
  92126
Tel:	
  619-­‐543-­‐8880	
  Fax:	
  619-­‐543-­‐8888	
  E-­‐Mail:	
  info@salford-­‐systems.com	
  Web	
  Site:	
  www.salford-­‐systems.com
Additional	
  TreeNet	
  Features SPM	
  7.0	
  Feature	
  Matrix
Components Basic Pro ProEx Ultra
Modeling Engine: TreeNet (Stochastic Gradient Boosting) o o o o
Spline-based approximations to the TreeNet dependency plots o o o
Exporting TreeNet dependency plots into XML file o o o
Automation: Build a series of models changing the minimum
required size on child nodes (Battery MINCHILD)
o o o
Flexible control over interactions in a TreeNet model (ICL) o o
Interaction strength reporting o o
Build a CART tree utilizing the TreeNet engine to gain speed
as well as alternative reporting
o
Build a RandomForests model utilizing the TreeNet engine to
gain speed as well as alternative reporting
o
RandomForests inspired sampling of predictors at each node
during model building
o
Automation: Explore the impact of influence trimming (outlier
removal) for logistic and classification models (Battery INFLUENCE)
o
Automation: Exhaustive search and ranking for all interactions of
the specified order (Battery ICL)
o
Salford	
  Systems
	
  9685	
  Via	
  Excelencia,	
  Suite	
  208,	
  San	
  Diego,	
  CA,	
  92126
Tel:	
  619-­‐543-­‐8880	
  Fax:	
  619-­‐543-­‐8888	
  E-­‐Mail:	
  info@salford-­‐systems.com	
  Web	
  Site:	
  www.salford-­‐systems.com
Additional	
  Random	
  Forests	
  Features SPM	
  7.0	
  Feature	
  Matrix
Components Basic Pro ProEx Ultra
Modeling Engine: RandomForests for Classification o o o o
Additional Modeling Engine: Random Forests for Regession o o o
Automation: Vary the number of randomly selected predictors
at the node-level (Battery RFNPREDS)
o o o
Salford Systems
Additional GPS Features SPM 7.0 Feature Matrix
Components Basic Pro ProEx Ultra
Modeling Engines: Regularized Regression
(LASSO/Ridge/LARS/Elastic Net/GPS)
o o o
Automation: Build a series of models by forcing different limit on the
maximum correlation among predictors (Battery MAXCORR)
o o
9685 Via Excelencia, Suite 208, San Diego, CA, 92126
Tel: 619-543-8880 Fax: 619-543-8888 E-Mail: info@salford-systems.com Web Site: www.salford-systems.com

More Related Content

Similar to SPM v7.0 Feature Matrix

Clipper: A Low-Latency Online Prediction Serving System
Clipper: A Low-Latency Online Prediction Serving SystemClipper: A Low-Latency Online Prediction Serving System
Clipper: A Low-Latency Online Prediction Serving SystemDatabricks
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsAntonio Severien
 
Introduction to Structured Streaming
Introduction to Structured StreamingIntroduction to Structured Streaming
Introduction to Structured StreamingKnoldus Inc.
 
Complex Test Pattern Generation for high speed fault diagnosis in FPGA based ...
Complex Test Pattern Generation for high speed fault diagnosis in FPGA based ...Complex Test Pattern Generation for high speed fault diagnosis in FPGA based ...
Complex Test Pattern Generation for high speed fault diagnosis in FPGA based ...IJMERJOURNAL
 
Application of Artificial Intelligence for Automotive Applications
Application of Artificial Intelligence for Automotive ApplicationsApplication of Artificial Intelligence for Automotive Applications
Application of Artificial Intelligence for Automotive ApplicationsKonfHubTechConferenc
 
YABench: A Comprehensive Framework for RDF Stream Processor Correctness and P...
YABench: A Comprehensive Framework for RDF Stream Processor Correctness and P...YABench: A Comprehensive Framework for RDF Stream Processor Correctness and P...
YABench: A Comprehensive Framework for RDF Stream Processor Correctness and P...Maxim Kolchin
 
Artificial Intelligence Database Performance Tuning
Artificial Intelligence Database Performance TuningArtificial Intelligence Database Performance Tuning
Artificial Intelligence Database Performance TuningRoel Van de Paar
 
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...Databricks
 
SERENE 2014 School: Daniel varro serene2014_school
SERENE 2014 School: Daniel varro serene2014_schoolSERENE 2014 School: Daniel varro serene2014_school
SERENE 2014 School: Daniel varro serene2014_schoolHenry Muccini
 
SERENE 2014 School: Incremental Model Queries over the Cloud
SERENE 2014 School: Incremental Model Queries over the CloudSERENE 2014 School: Incremental Model Queries over the Cloud
SERENE 2014 School: Incremental Model Queries over the CloudSERENEWorkshop
 
Extent3 exactpro the_future_of_risk_controls
Extent3 exactpro the_future_of_risk_controlsExtent3 exactpro the_future_of_risk_controls
Extent3 exactpro the_future_of_risk_controlsextentconf Tsoy
 
SBML, SBML Packages, SED-ML, 
 COMBINE Archive, and more
SBML, SBML Packages, SED-ML, 
 COMBINE Archive, and moreSBML, SBML Packages, SED-ML, 
 COMBINE Archive, and more
SBML, SBML Packages, SED-ML, 
 COMBINE Archive, and moreMike Hucka
 
Finding the right Machine Learning method for predictive modeling
Finding the right Machine Learning method for predictive modelingFinding the right Machine Learning method for predictive modeling
Finding the right Machine Learning method for predictive modelingSK Reddy
 
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...Spark Summit
 
IncQuery-D: Distributed Incremental Model Queries over the Cloud: Engineerin...
IncQuery-D: Distributed Incremental Model Queries over the Cloud: Engineerin...IncQuery-D: Distributed Incremental Model Queries over the Cloud: Engineerin...
IncQuery-D: Distributed Incremental Model Queries over the Cloud: Engineerin...Daniel Varro
 
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021Deepak Shankar
 
Java Performance and Using Java Flight Recorder
Java Performance and Using Java Flight RecorderJava Performance and Using Java Flight Recorder
Java Performance and Using Java Flight RecorderIsuru Perera
 
M|18 Ingesting Data with the New Bulk Data Adapters
M|18 Ingesting Data with the New Bulk Data AdaptersM|18 Ingesting Data with the New Bulk Data Adapters
M|18 Ingesting Data with the New Bulk Data AdaptersMariaDB plc
 
lec2 - Modern Processors - SIMD.pptx
lec2 - Modern Processors - SIMD.pptxlec2 - Modern Processors - SIMD.pptx
lec2 - Modern Processors - SIMD.pptxRakesh Pogula
 

Similar to SPM v7.0 Feature Matrix (20)

Clipper: A Low-Latency Online Prediction Serving System
Clipper: A Low-Latency Online Prediction Serving SystemClipper: A Low-Latency Online Prediction Serving System
Clipper: A Low-Latency Online Prediction Serving System
 
Scalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data StreamsScalable Distributed Real-Time Clustering for Big Data Streams
Scalable Distributed Real-Time Clustering for Big Data Streams
 
Introduction to Structured Streaming
Introduction to Structured StreamingIntroduction to Structured Streaming
Introduction to Structured Streaming
 
Complex Test Pattern Generation for high speed fault diagnosis in FPGA based ...
Complex Test Pattern Generation for high speed fault diagnosis in FPGA based ...Complex Test Pattern Generation for high speed fault diagnosis in FPGA based ...
Complex Test Pattern Generation for high speed fault diagnosis in FPGA based ...
 
Application of Artificial Intelligence for Automotive Applications
Application of Artificial Intelligence for Automotive ApplicationsApplication of Artificial Intelligence for Automotive Applications
Application of Artificial Intelligence for Automotive Applications
 
YABench: A Comprehensive Framework for RDF Stream Processor Correctness and P...
YABench: A Comprehensive Framework for RDF Stream Processor Correctness and P...YABench: A Comprehensive Framework for RDF Stream Processor Correctness and P...
YABench: A Comprehensive Framework for RDF Stream Processor Correctness and P...
 
Artificial Intelligence Database Performance Tuning
Artificial Intelligence Database Performance TuningArtificial Intelligence Database Performance Tuning
Artificial Intelligence Database Performance Tuning
 
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
Deploying and Monitoring Heterogeneous Machine Learning Applications with Cli...
 
SERENE 2014 School: Daniel varro serene2014_school
SERENE 2014 School: Daniel varro serene2014_schoolSERENE 2014 School: Daniel varro serene2014_school
SERENE 2014 School: Daniel varro serene2014_school
 
SERENE 2014 School: Incremental Model Queries over the Cloud
SERENE 2014 School: Incremental Model Queries over the CloudSERENE 2014 School: Incremental Model Queries over the Cloud
SERENE 2014 School: Incremental Model Queries over the Cloud
 
Extent3 exactpro the_future_of_risk_controls
Extent3 exactpro the_future_of_risk_controlsExtent3 exactpro the_future_of_risk_controls
Extent3 exactpro the_future_of_risk_controls
 
SBML, SBML Packages, SED-ML, 
 COMBINE Archive, and more
SBML, SBML Packages, SED-ML, 
 COMBINE Archive, and moreSBML, SBML Packages, SED-ML, 
 COMBINE Archive, and more
SBML, SBML Packages, SED-ML, 
 COMBINE Archive, and more
 
Finding the right Machine Learning method for predictive modeling
Finding the right Machine Learning method for predictive modelingFinding the right Machine Learning method for predictive modeling
Finding the right Machine Learning method for predictive modeling
 
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
Clipper: A Low-Latency Online Prediction Serving System: Spark Summit East ta...
 
IncQuery-D: Distributed Incremental Model Queries over the Cloud: Engineerin...
IncQuery-D: Distributed Incremental Model Queries over the Cloud: Engineerin...IncQuery-D: Distributed Incremental Model Queries over the Cloud: Engineerin...
IncQuery-D: Distributed Incremental Model Queries over the Cloud: Engineerin...
 
airvolt poster
airvolt posterairvolt poster
airvolt poster
 
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
 
Java Performance and Using Java Flight Recorder
Java Performance and Using Java Flight RecorderJava Performance and Using Java Flight Recorder
Java Performance and Using Java Flight Recorder
 
M|18 Ingesting Data with the New Bulk Data Adapters
M|18 Ingesting Data with the New Bulk Data AdaptersM|18 Ingesting Data with the New Bulk Data Adapters
M|18 Ingesting Data with the New Bulk Data Adapters
 
lec2 - Modern Processors - SIMD.pptx
lec2 - Modern Processors - SIMD.pptxlec2 - Modern Processors - SIMD.pptx
lec2 - Modern Processors - SIMD.pptx
 

More from Salford Systems

Datascience101presentation4
Datascience101presentation4Datascience101presentation4
Datascience101presentation4Salford Systems
 
Improve Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForestsImprove Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForestsSalford Systems
 
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...Salford Systems
 
Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications Salford Systems
 
The Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningThe Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningSalford Systems
 
Introduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele CutlerIntroduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele CutlerSalford Systems
 
9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like You9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like YouSalford Systems
 
Statistically Significant Quotes To Remember
Statistically Significant Quotes To RememberStatistically Significant Quotes To Remember
Statistically Significant Quotes To RememberSalford Systems
 
Using CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example DatasetUsing CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example DatasetSalford Systems
 
CART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User GuideCART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User GuideSalford Systems
 
Evolution of regression ols to gps to mars
Evolution of regression   ols to gps to marsEvolution of regression   ols to gps to mars
Evolution of regression ols to gps to marsSalford Systems
 
Data Mining for Higher Education
Data Mining for Higher EducationData Mining for Higher Education
Data Mining for Higher EducationSalford Systems
 
Comparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modelingComparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modelingSalford Systems
 
Molecular data mining tool advances in hiv
Molecular data mining tool  advances in hivMolecular data mining tool  advances in hiv
Molecular data mining tool advances in hivSalford Systems
 
TreeNet Tree Ensembles & CART Decision Trees: A Winning Combination
TreeNet Tree Ensembles & CART Decision Trees:  A Winning CombinationTreeNet Tree Ensembles & CART Decision Trees:  A Winning Combination
TreeNet Tree Ensembles & CART Decision Trees: A Winning CombinationSalford Systems
 
SPM User's Guide: Introducing MARS
SPM User's Guide: Introducing MARSSPM User's Guide: Introducing MARS
SPM User's Guide: Introducing MARSSalford Systems
 
Hybrid cart logit model 1998
Hybrid cart logit model 1998Hybrid cart logit model 1998
Hybrid cart logit model 1998Salford Systems
 
Session Logs Tutorial for SPM
Session Logs Tutorial for SPMSession Logs Tutorial for SPM
Session Logs Tutorial for SPMSalford Systems
 
Some of the new features in SPM 7
Some of the new features in SPM 7Some of the new features in SPM 7
Some of the new features in SPM 7Salford Systems
 
TreeNet Overview - Updated October 2012
TreeNet Overview  - Updated October 2012TreeNet Overview  - Updated October 2012
TreeNet Overview - Updated October 2012Salford Systems
 

More from Salford Systems (20)

Datascience101presentation4
Datascience101presentation4Datascience101presentation4
Datascience101presentation4
 
Improve Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForestsImprove Your Regression with CART and RandomForests
Improve Your Regression with CART and RandomForests
 
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
Improved Predictions in Structure Based Drug Design Using Cart and Bayesian M...
 
Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications Churn Modeling-For-Mobile-Telecommunications
Churn Modeling-For-Mobile-Telecommunications
 
The Do's and Don'ts of Data Mining
The Do's and Don'ts of Data MiningThe Do's and Don'ts of Data Mining
The Do's and Don'ts of Data Mining
 
Introduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele CutlerIntroduction to Random Forests by Dr. Adele Cutler
Introduction to Random Forests by Dr. Adele Cutler
 
9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like You9 Data Mining Challenges From Data Scientists Like You
9 Data Mining Challenges From Data Scientists Like You
 
Statistically Significant Quotes To Remember
Statistically Significant Quotes To RememberStatistically Significant Quotes To Remember
Statistically Significant Quotes To Remember
 
Using CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example DatasetUsing CART For Beginners with A Teclo Example Dataset
Using CART For Beginners with A Teclo Example Dataset
 
CART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User GuideCART Classification and Regression Trees Experienced User Guide
CART Classification and Regression Trees Experienced User Guide
 
Evolution of regression ols to gps to mars
Evolution of regression   ols to gps to marsEvolution of regression   ols to gps to mars
Evolution of regression ols to gps to mars
 
Data Mining for Higher Education
Data Mining for Higher EducationData Mining for Higher Education
Data Mining for Higher Education
 
Comparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modelingComparison of statistical methods commonly used in predictive modeling
Comparison of statistical methods commonly used in predictive modeling
 
Molecular data mining tool advances in hiv
Molecular data mining tool  advances in hivMolecular data mining tool  advances in hiv
Molecular data mining tool advances in hiv
 
TreeNet Tree Ensembles & CART Decision Trees: A Winning Combination
TreeNet Tree Ensembles & CART Decision Trees:  A Winning CombinationTreeNet Tree Ensembles & CART Decision Trees:  A Winning Combination
TreeNet Tree Ensembles & CART Decision Trees: A Winning Combination
 
SPM User's Guide: Introducing MARS
SPM User's Guide: Introducing MARSSPM User's Guide: Introducing MARS
SPM User's Guide: Introducing MARS
 
Hybrid cart logit model 1998
Hybrid cart logit model 1998Hybrid cart logit model 1998
Hybrid cart logit model 1998
 
Session Logs Tutorial for SPM
Session Logs Tutorial for SPMSession Logs Tutorial for SPM
Session Logs Tutorial for SPM
 
Some of the new features in SPM 7
Some of the new features in SPM 7Some of the new features in SPM 7
Some of the new features in SPM 7
 
TreeNet Overview - Updated October 2012
TreeNet Overview  - Updated October 2012TreeNet Overview  - Updated October 2012
TreeNet Overview - Updated October 2012
 

Recently uploaded

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Recently uploaded (20)

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

SPM v7.0 Feature Matrix

  • 1. Salford Systems General Features SPM 7.0 Feature Matrix Components Basic Pro ProEx Ultra Modeling Engine: CART (Decision Trees) o o o o Modeling Engine: MARS (Nonlinear Regression) o o o o Modeling Engine: TreeNet (Stochastic Gradient Boosting) o o o o Modeling Engine: RandomForests for Classification o o o o Reporting ROC curves during model building and model scoring o o o o Model performance stats based on Cross Validation o o o o Model performance stats based on out of bag data during bootstrapping o o o o Reporting performance summaries on learn and test data partitions o o o o Reporting Gains and Lift Charts during model building and model scoring o o o o Automatic creation of Command Logs o o o o Built-in support to create, edit, and execute command files o o o o Translating models into SAS-compatible language o o o o Reading and writing datasets in all current database/statistical file formats o o o o Option to save processed datasets into all current database/statistical file formats o o o o Automation: Build a series of models using every available data mining engine (Battery MODELS) o o o o 9685 Via Excelencia, Suite 208, San Diego, CA, 92126 Tel: 619-543-8880 Fax: 619-543-8888 E-Mail: info@salford-systems.com Web Site: www.salford-systems.com
  • 2. Salford Systems Components Basic Pro ProEx Ultra Additional Modeling Engines: Regression, Logistic Regression, RandomForests for Regession, Regularized Regression (LASSO/Ridge/LARS/Elastic Net/GPS) o o o Automatic creation of missing value indicators o o o Option to treat missing value in a categorical predictor as a new level o o o License to any level supported by RAM (currently 32MB to 1TB) o o o License for multi-core capabilities o o o Using built-in BASIC Programming Language during data preparation o o o Automatic creation of lag variables based on user specifications during data preparation o o o Automatic creation and reporting of key overall and stratified summary statistics for user supplied list of variables o o o Display charts, histograms, and scatter plots for user selected variables o o o Command Line GUI Assistant to simplify creating and editing command files o o o Translating models into SAS/PMML/C/Java/Classic o o o An alternative to variable importance based on Leo Breiman's scrambler o o o Unsupervised Learning - Breiman's column scrambler o o o Scoring any Battery (pre-packaged scenario of runs) as an ensemble model o o o Custom selection of a new predictors list from an existing variable importance report o o o User defined bins for Cross Validation o o o Automated imputation of all missing values o o o Automation: Build two models reversing the roles of the learn and test samples (Battery FLIP) o o o Automation: Explore model stability by repeated random drawing of the learn sample from the original dataset (Battery DRAW) o o o Automation: For time series applications, build models based on sliding time window using a large array of user options (Battery DATASHIFT) o o o Automation: Explore mutual multivariate dependencies among available predictors (Battery TARGET) o o o Automation: Explore the effects of the learn sample size on the model performance (Battery SAMPLE) o o o 9685 Via Excelencia, Suite 208, San Diego, CA, 92126 Tel: 619-543-8880 Fax: 619-543-8888 E-Mail: info@salford-systems.com Web Site: www.salford-systems.com
  • 3. Salford Systems Components Basic Pro ProEx Ultra Automation: Explore alternative strategies to handling of missing values (Battery MVI) o o o Automation: Check the validity of model performance using Monte Carlo shuffling of the target (Battery MCT) o o o Automation: Build a series of models varying the number of bins for Cross Validation (Battery CV) o o o Automation: Repeat Cross Validation process many times to explore the variance of estimates (Battery CVR) o o o Automation: Build a series of models using a user-suppllied list of binning variables for cross-validation (Battery CVBIN) o o o Automation: Build a series of models by varying the random number seed (Battery SEED) o o o Automation: Explore the marginal contribution of each predictor to the existing model (Battery LOVO) o o o Save out of bag predictions during Cross Validation o o Automation: Generate detailed univariate stats on every continuous predictor to spot potential outliers and problematic records (Battery OUTLIERS) o o Automation: Convert (bin) all continuous variables into categorical (discrete) versions using a large array of user options (equal width, weights of evidence, Naïve Bayes, supervised) (Battery BIN) o o Automation: Explore model stability by repeated repartitioning of the data into learn, test, and possibly hold-out samples (Battery PARTITION) o o Automation: Build a series of models using different backward variable selection strategies (Battery SHAVING) o o Automation: Build a series of models using the forward-stepwise variable selection strategy (Battery STEPWISE) o o Automation: Explore nonlinear univariate relationships between the target and each available predictor (Battery ONEOFF, Battery XONY) o o Automation: Build a series of models using randomly sampled predictors (Battery KEEP) o o Automation: Explore the impact of a potential replacement of a given predictor by another one (Battery SWAP) o o 9685 Via Excelencia, Suite 208, San Diego, CA, 92126 Tel: 619-543-8880 Fax: 619-543-8888 E-Mail: info@salford-systems.com Web Site: www.salford-systems.com
  • 4. Salford Systems Components Basic Pro ProEx Ultra Automation: Explore the impact of penalty on categorical predictors (Battery PENALTY=HLC) o o Automation: Explore the impact of penalty on missing values (Battery PENALTY=MISSING) o o Automation: Bootstrapping process (sampling with replacement from the learn sample) with a large array of user options (Random Forests- style sampling of predictors, saving in-bag and out-of-bag scores, proximity matrix, and node dummies) (Battery BOOTSTRAP) o o Automation: Parametric bootstrap process (Battery PBOOT) o o Automation: Build a series of models for each strata defined in the dataset (Battery STRATA) o o Automation: Build two linked models, where the first one predicts the binary event while the second one predicts the amount (Battery RELATED). For example, predicting whether someone will buy and how much will be spent o o Automation: Build a series of models limiting the number of nodes in a tree thus controlling the order of interactions (Battery NODES) o o Automation: Build a series of models varying the speed of learning (Battery LEARNRATE) o o Automation: Build a series of models by progressively imposing additivity on individual predictors (Battery ADDITIVE) o o Automation: Build a series of models utilizing different regression loss functions (Battery TNREG) o o Automation: Build a series of models by varying subsampling fraction (Battery TNSUBSAMPLE) o o Automation: Build a series of models using varying degree of penalty on added variables (Battery ADDEDVAR) o o Modeling Pipelines: RuleLearner, ISLE o Build a CART tree utilizing the TreeNet engine to gain speed as well as alternative reporting o RandomForests inspired sampling of predictors at each node during model building o Build a RandomForests model utilizing the TreeNet engine to gain speed as well as alternative reporting o Build a Random Forests model utilizing the CART engine to gain alternative handling of missing values via surrogate splits (Battery BOOTSTRAP RSPLIT) o 9685 Via Excelencia, Suite 208, San Diego, CA, 92126 Tel: 619-543-8880 Fax: 619-543-8888 E-Mail: info@salford-systems.com Web Site: www.salford-systems.com
  • 5. Salford  Systems  9685  Via  Excelencia,  Suite  208,  San  Diego,  CA,  92126 Tel:  619-­‐543-­‐8880  Fax:  619-­‐543-­‐8888  E-­‐Mail:  info@salford-­‐systems.com  Web  Site:  www.salford-­‐systems.com Additional  CART  Features SPM  7.0  Feature  Matrix Components Basic Pro ProEx Ultra Modeling Engine: CART (Decision Trees) o o o o Linear Combination Splits o o o o Optimal tree selection based on area under ROC curve o o o o User defined splits for the root node and its children o o o Automation: Generate models with alternative handling of missing values (Battery MVI) o o o Automation: Build a series of models using all available splitting strategies (six for classification, two for regression) (Battery RULES) o o o Automation: Build a series of models varying the depth of the tree (Battery DEPTH) o o o Automation: Build a series of models changing the minimum required size on parent nodes (Battery ATOM) o o o Automation: Build a series of models changing the minimum required size on child nodes (Battery MINCHILD) o o o Automation: Explore accuracy versus speed trade-off due to potential sampling of records at each node in a tree (Battery SUBSAMPLE) o o o Multiple user defined lists for linear combinations o o Constrained trees o o Ability to create and save dummy variables for every node in the tree during scoring o o Report basic stats on any variable of user choice at every node in the tree o o Comparison of learn vs. test performance at every node of every tree in the sequence o o Hot-Spot detection to identify the richest nodes across multiple trees o o Automation: Vary the priors for the specified class (Battery PRIORS) o o Automation: Build a series of models limiting the number of nodes in a tree (Battery NODES) o o Automation: Build a series of models trying each available predictor as the root node splitter (Battery ROOT) o o Automation: Explore the impact of favoring equal sized child nodes (Battery POWER) o o
  • 6. Salford  Systems  9685  Via  Excelencia,  Suite  208,  San  Diego,  CA,  92126 Tel:  619-­‐543-­‐8880  Fax:  619-­‐543-­‐8888  E-­‐Mail:  info@salford-­‐systems.com  Web  Site:  www.salford-­‐systems.com Components Basic Pro ProEx Ultra Automation: Build a series of models by progressively removing misclassified records thus increasing the robustness of trees and posssibly reducing model complexity (Battery REFINE) o o Automation: Bagging and ARCing using the legacy code (COMBINE) o o Build a CART tree utilizing the TreeNet engine to gain speed as well as alternative reporting o Build a Random Forests model utlizing the CART engine to gain alternative handling of missing values via surrogate splits (Battery BOOTSTRAP RSPLIT) o
  • 7. Salford  Systems  9685  Via  Excelencia,  Suite  208,  San  Diego,  CA,  92126 Tel:  619-­‐543-­‐8880  Fax:  619-­‐543-­‐8888  E-­‐Mail:  info@salford-­‐systems.com  Web  Site:  www.salford-­‐systems.com Additional  MARS  Features SPM  7.0  Feature  Matrix Components Basic Pro ProEx Ultra Modeling Engine: MARS (Nonlinear Regression) o o o o Automation: Build a series of models varying the maximum number of basis functions (Battery BASIS) o o o Automation: Build a series of models varying the smoothness parameter (Battery MINSPAN) o o Automation: Build a series of models varying the order of interactions (Battery INTERACTIONS) o o Automation: Build a series of models varying the modeling speed (Battery SPEED) o o Automation: Build a series of models using varying degree of penalty on added variables (Battery PENALTY MARS) o
  • 8. Salford  Systems  9685  Via  Excelencia,  Suite  208,  San  Diego,  CA,  92126 Tel:  619-­‐543-­‐8880  Fax:  619-­‐543-­‐8888  E-­‐Mail:  info@salford-­‐systems.com  Web  Site:  www.salford-­‐systems.com Additional  TreeNet  Features SPM  7.0  Feature  Matrix Components Basic Pro ProEx Ultra Modeling Engine: TreeNet (Stochastic Gradient Boosting) o o o o Spline-based approximations to the TreeNet dependency plots o o o Exporting TreeNet dependency plots into XML file o o o Automation: Build a series of models changing the minimum required size on child nodes (Battery MINCHILD) o o o Flexible control over interactions in a TreeNet model (ICL) o o Interaction strength reporting o o Build a CART tree utilizing the TreeNet engine to gain speed as well as alternative reporting o Build a RandomForests model utilizing the TreeNet engine to gain speed as well as alternative reporting o RandomForests inspired sampling of predictors at each node during model building o Automation: Explore the impact of influence trimming (outlier removal) for logistic and classification models (Battery INFLUENCE) o Automation: Exhaustive search and ranking for all interactions of the specified order (Battery ICL) o
  • 9. Salford  Systems  9685  Via  Excelencia,  Suite  208,  San  Diego,  CA,  92126 Tel:  619-­‐543-­‐8880  Fax:  619-­‐543-­‐8888  E-­‐Mail:  info@salford-­‐systems.com  Web  Site:  www.salford-­‐systems.com Additional  Random  Forests  Features SPM  7.0  Feature  Matrix Components Basic Pro ProEx Ultra Modeling Engine: RandomForests for Classification o o o o Additional Modeling Engine: Random Forests for Regession o o o Automation: Vary the number of randomly selected predictors at the node-level (Battery RFNPREDS) o o o
  • 10. Salford Systems Additional GPS Features SPM 7.0 Feature Matrix Components Basic Pro ProEx Ultra Modeling Engines: Regularized Regression (LASSO/Ridge/LARS/Elastic Net/GPS) o o o Automation: Build a series of models by forcing different limit on the maximum correlation among predictors (Battery MAXCORR) o o 9685 Via Excelencia, Suite 208, San Diego, CA, 92126 Tel: 619-543-8880 Fax: 619-543-8888 E-Mail: info@salford-systems.com Web Site: www.salford-systems.com