Survival Analysis Dimension Reduction Techniques:
A Comparison of Select Methods
Iván Rodríguez†
and Claressa L. Ullmayer∗
† The University of Arizona * The University of Alaska, Fairbanks
Researchers often obtain copious and/or
incomplete data whose variables can be redundant
or collinear for explaining particular outcomes;
even sophisticated analysis software and machinery
struggle under these conditions.
Purpose: compare Principal Component Analysis
(PCA), Partial Least Squares (PLS), and
Johnson-Lindenstrauss inspired Random
Matrices (RM) in terms of reducing dataset
dimensionality while retaining generality.
Survival Analysis: overcomes limitations of
standard regression approaches, models strictly
positive responses (survival times), and can
handle censoring.
PCA: obtains components/eigenvalues from the
data’s variance-covariance matrix, maximizes the
variance of linear combinations of the predictor
variables, and produces new uncorrelated
variables by orthogonal transformations of the
covariates.
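The PCA description above can be illustrated with a minimal sketch, assuming NumPy is available. The function name `pca_components` and the simulated matrix `X` are hypothetical and not taken from the poster, whose analysis was done in R.

```python
import numpy as np

def pca_components(X, k):
    """Return the top-k principal component scores and their eigenvalues."""
    Xc = X - X.mean(axis=0)                  # center each covariate
    cov = np.cov(Xc, rowvar=False)           # p x p variance-covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh handles symmetric matrices
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest eigenvalues
    W = eigvecs[:, order]                    # p x k orthonormal loadings
    return Xc @ W, eigvals[order]

# Hypothetical data: two of the five covariates are nearly collinear.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=200)
scores, variances = pca_components(X, k=2)
```

The resulting score columns are mutually uncorrelated, and their sample variances equal the selected eigenvalues.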
PLS: similar to PCA, except it maximizes the
covariance between linear combinations of the
predictor variables and the response variables,
projecting both into a new space to model their
covariance structure.
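As a rough illustration of the PLS idea, the sketch below computes only the first PLS component for a single response, assuming NumPy. The names `pls_first_component`, `X`, and `y` are hypothetical; in a survival setting `y` would typically be a log survival time, and the poster does not state which PLS algorithm was used.

```python
import numpy as np

def pls_first_component(X, y):
    """First PLS score: the unit-weight combination of predictors with maximal covariance with y."""
    Xc = X - X.mean(axis=0)        # center predictors
    yc = y - y.mean()              # center response
    w = Xc.T @ yc                  # direction proportional to cov(X_j, y) for each column j
    w = w / np.linalg.norm(w)      # normalize to unit length
    return Xc @ w, w

# Hypothetical data: the response depends on two of ten predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = X[:, 0] - 2.0 * X[:, 3] + rng.normal(size=100)
t1, w = pls_first_component(X, y)
```

By the Cauchy-Schwarz inequality, no other unit-length weight vector yields a score with larger absolute covariance with the centered response. This is the contrast with PCA, which ignores the response entirely.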
RM: matrix with predetermined qualities is
randomly generated and applied to predictor
matrix.
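The poster does not spell out which random matrices were used, so the following is only a sketch of two common Johnson-Lindenstrauss-style constructions, assuming NumPy: a scaled Gaussian matrix and the sparse matrix from Achlioptas (2003). The function names and dimensions are hypothetical.

```python
import numpy as np

def gaussian_rm(p, k, rng):
    """p x k matrix with i.i.d. N(0, 1/k) entries."""
    return rng.normal(size=(p, k)) / np.sqrt(k)

def achlioptas_rm(p, k, rng):
    """p x k sparse matrix: entries are +1, 0, -1 with probabilities 1/6, 2/3, 1/6, scaled by sqrt(3/k)."""
    entries = rng.choice([1.0, 0.0, -1.0], size=(p, k), p=[1/6, 2/3, 1/6])
    return np.sqrt(3.0 / k) * entries

# Hypothetical predictor matrix with far more covariates than observations.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 1000))
Z_gauss = X @ gaussian_rm(1000, 200, rng)      # reduced 50 x 200 predictor matrix
Z_sparse = X @ achlioptas_rm(1000, 200, rng)   # reduced 50 x 200 predictor matrix
```

Both scalings are chosen so that squared row norms are preserved in expectation, which is the property the Johnson-Lindenstrauss lemma builds on.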
Accelerated Failure Time (AFT) Model: provides
intuitive interpretation of predictor and response
variables via survivor curves, directly models
survival times.
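To make the link between the AFT model and survivor curves concrete, here is a small sketch under the assumption of a log-normal AFT model, log T = x·β + σε with ε standard normal. The poster does not state the error distribution, so that choice and the function name are assumptions.

```python
import math

def aft_lognormal_survivor(t, x, beta, sigma=1.0):
    """S(t | x) = P(T > t) when log T = x.beta + sigma * eps and eps ~ N(0, 1)."""
    mu = sum(xi * bi for xi, bi in zip(x, beta))    # linear predictor x.beta
    z = (math.log(t) - mu) / sigma                  # standardized log time
    return 0.5 * math.erfc(z / math.sqrt(2.0))      # equals 1 - Phi(z)

# Hypothetical one-covariate example: the median survival time is exp(x.beta).
median_time = math.exp(0.5)
s_at_median = aft_lognormal_survivor(median_time, [1.0], [0.5])
```

Increasing the linear predictor by c rescales time by a factor of exp(c), which is why covariates are said to accelerate or decelerate failure.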
1. simulate datasets using R statistical software
2. generate data and associated true survivor curve
3. implement all dimension reduction techniques on data
4. obtain survivor curve estimates from each procedure
5. calculate bias and mean-squared error on fixed times
6. repeat steps 1-5 for the desired number of iterations
7. produce final plots to analyze performance
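The steps above can be sketched as a small simulation loop, written here in Python with NumPy rather than R. Everything specific is an assumption made for illustration: uncensored log-normal AFT data, a least-squares fit on the reduced predictors, evaluation of the survivor curve for the first subject only, and the sizes n, p, k. Only PCA and a Gaussian RM are compared, and plotting (step 7) is omitted.

```python
import math
import numpy as np

def survivor(t, mu, sigma):
    """Log-normal AFT survivor function 1 - Phi((log t - mu) / sigma)."""
    return 0.5 * math.erfc((math.log(t) - mu) / (sigma * math.sqrt(2.0)))

def pca_reduce(X, k):
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)   # rows of Vt are principal directions
    return Xc @ Vt[:k].T

def rm_reduce(X, k, rng):
    return X @ (rng.normal(size=(X.shape[1], k)) / np.sqrt(k))

def one_iteration(rng, n=100, p=50, k=5, times=(0.5, 1.0, 2.0)):
    # Steps 1-2: simulate data and the true survivor curve (sigma = 1) for subject 0.
    X = rng.normal(size=(n, p))
    beta = rng.uniform(-0.2, 0.2, size=p)
    log_t = X @ beta + rng.normal(size=n)
    true_curve = np.array([survivor(t, X[0] @ beta, 1.0) for t in times])
    errors = {}
    # Steps 3-4: reduce dimension, fit by least squares, estimate the curve.
    for name, Z in (("PCA", pca_reduce(X, k)), ("RM", rm_reduce(X, k, rng))):
        D = np.column_stack([np.ones(n), Z])             # add an intercept column
        coef, *_ = np.linalg.lstsq(D, log_t, rcond=None)
        resid = log_t - D @ coef
        sigma_hat = math.sqrt(resid @ resid / (n - k - 1))
        est = np.array([survivor(t, D[0] @ coef, sigma_hat) for t in times])
        errors[name] = est - true_curve
    return errors

def simulate(n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    runs = [one_iteration(rng) for _ in range(n_iter)]   # step 6: repeat
    summary = {}
    for name in runs[0]:
        err = np.array([r[name] for r in runs])
        # Step 5: bias and mean-squared error at each fixed time.
        summary[name] = {"bias": err.mean(axis=0), "mse": (err ** 2).mean(axis=0)}
    return summary

summary = simulate()
```

`summary["PCA"]["mse"]` and `summary["RM"]["mse"]` then hold one value per fixed time point. This toy setup is not expected to reproduce the poster's findings.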
Achlioptas, D. Database-friendly random projections: Johnson-Lindenstrauss with
binary coins. Journal of Computer and System Sciences 66(4): 671-687, 2003.
Dasgupta, S. and A. Gupta. An elementary proof of a theorem of Johnson and
Lindenstrauss. Random Structures and Algorithms 22(1): 60-65, 2003.
Johnson, W.B. and J. Lindenstrauss. Extensions of Lipschitz maps into a Hilbert
space. Contemp Math 26: 189-206, 1984.
Nguyen, Tuan S. and Javier Rojo. Dimension Reduction of Microarray Data in the
Presence of a Censored Survival Response: A Simulation Study. Statistical
Applications in Genetics and Molecular Biology 8(1): 2009.
Nguyen, Tuan S. and Javier Rojo. Dimension Reduction of Microarray Gene
Expression Data: The Accelerated Failure Time Model. Journal of Bioinformatics
and Computational Biology 7(6): 939-954, 2009.
According to the results, PCA outperforms PLS,
all three RM variants are comparable, and all
RMs are superior to PCA and PLS.
1. integrate censored data into investigation
2. apply findings to real datasets such as
microarray gene cancer datasets
3. use software more powerful than R, along with
more advanced technology
4. observe effects of altering regression model—
e.g., instead of AFT, implement Cox
Proportional Hazards Model
This project was realized in part thanks to the generous guidance and support from
PI Javier Rojo Jiménez along with Kyle Bradford, Nathan C. Wiseman,
Raul Cruz-Cano, and Rashidul Hasan.
This research was supported by the National Security Agency through Grant
H98230-15-1-0048 to the University of Nevada at Reno, Javier Rojo PI.
Analysis and Discussion
RMs unexpectedly outdo PCA and PLS—could
be connected to the limits of R’s accuracy when
generating datasets and/or not incorporating
censored data.
PLS performs in-depth analysis of predictor and
response variables, yet is bested by PCA—could
also have to do with dataset generation.
Further Inquiry
Acknowledgements
Literature Cited
Results
Assessment
Sample Curve
Introduction
Methods
