Colleen M. Farrelly
Small Sample Size Examples
• Rare Diseases (ex. chordoma)
• Rare Exposures (ex. Fukushima workers…)
• Restricted Sample Trials (ex. Phase I trials)
Clustering
• Find subgroups that exist in data
• Understand subgroup differences
Predictor Identification
• Identify associations between predictors and an outcome
Forecasting/Classifying
• Predict outcome/group assignment of new observations based on a set of predictors
• Violation of typical model assumptions (inaccurate model estimates)
• Computational issues with large dimensionality (lots of variables)
• Low power to detect significant associations
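The low-power point can be made concrete with a short calculation. The sketch below is an illustration, not part of the original slides: it assumes a two-sided, two-sample comparison of means, uses a normal approximation to the test statistic, and the function name `approx_power`, the effect size of 0.5, and the group sizes are all hypothetical choices.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test of means.

    Uses the normal approximation; effect_size is the standardized
    mean difference (difference in means divided by the common SD).
    """
    z = NormalDist()  # standard normal
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical scenario: a medium effect at several group sizes
for n in (10, 30, 100):
    print(f"n per group = {n:3d}  approx. power = {approx_power(0.5, n):.2f}")
```

Under these assumptions, power rises steadily with group size, and the small groups typical of the settings listed earlier leave it well below conventional targets.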
• Bad prediction
• Little insight
• Computability issues
• Inaccurate estimates
Topology provides a feature engineering and unsupervised learning method that is robust to sample size and that produces good input features for tree-based predictive models.
Data Collection → Unsupervised Learning → Feature Engineering → Predictive Modeling → Insight
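As a rough illustration of the unsupervised learning and feature engineering steps, the sketch below computes 0-dimensional persistent homology on a tiny point cloud. The slides do not name a specific topological method, so this choice is an assumption, as are the toy data and the helper names (`UnionFind`, `h0_deaths`, `cluster_labels`). The merge scales could serve as engineered features, and cutting at the largest gap gives cluster labels.

```python
from itertools import combinations
from math import dist


class UnionFind:
    """Tracks connected components as points are linked."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def union(self, i, j):
        """Merge two components; return True only if they were distinct."""
        ri, rj = self.find(i), self.find(j)
        if ri == rj:
            return False
        self.parent[ri] = rj
        return True


def sorted_edges(points):
    """All pairwise Euclidean distances, shortest first."""
    return sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )


def h0_deaths(points):
    """Scales at which components merge (H0 death times), ascending."""
    uf = UnionFind(len(points))
    return [d for d, i, j in sorted_edges(points) if uf.union(i, j)]


def cluster_labels(points, scale):
    """Label the components obtained by linking points within `scale`."""
    uf = UnionFind(len(points))
    for d, i, j in sorted_edges(points):
        if d > scale:
            break
        uf.union(i, j)
    roots = {}
    return [roots.setdefault(uf.find(i), len(roots)) for i in range(len(points))]


# Hypothetical toy data: two well-separated groups of three points
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
deaths = h0_deaths(points)

# Cut halfway across the largest gap between consecutive merge scales
gaps = [b - a for a, b in zip(deaths, deaths[1:])]
cut = gaps.index(max(gaps))
scale = (deaths[cut] + deaths[cut + 1]) / 2
labels = cluster_labels(points, scale)
print(deaths, labels)
```

In a fuller pipeline, summaries of `deaths` or the resulting `labels` would be passed to a tree-based model as predictors. That step is omitted here to keep the sketch dependency-free.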
Good Prediction/Classification Models
Recoverable Insight into Problem
Avoidance of Statistical Model Power Issues
Robust analysis with sample sizes
