SlideShare a Scribd company logo
1 of 8
Download to read offline
Generalized Low Rank Models
Anqi Fu
Machine Learning Scientist, H2O.ai
anqi@h2o.ai
Based on work by Madeleine Udell, Corinne Horn, Reza Zadeh and Stephen Boyd
November 6, 2015
Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 1 / 8
What is a Low Rank Model?
Given: Data table A with m rows and n columns
Find: Compressed representation as numeric tables X and Y , where
# cols in X = # rows in Y = small user-specified k max(m, n)
# cols in Y is d = (total dimension of embedded features in A) ≥ n
m




 A


n
≈ m




X


k
Y k
n
Row of Y = archetypal feature created from columns of A
Row of X = row of A in reduced feature space
Can approximately reconstruct A from product XY
Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 2 / 8
Why use Low Rank Models?
Reduce storage space, e.g. 10 GB compressed to 100 MB
Increase prediction speed, e.g. 10x speed-up with no accuracy loss
Identify and visualize important features
Impute missing data entries
Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 3 / 8
Example 1: Visualizing Walking Stances
time forehead (x) forehead (y) · · · right toe (y) right toe (z)
t1 1.4 2.7 · · · -0.5 -0.1
t2 2.7 3.5 · · · 1.3 0.9
t3 3.3 -.9 · · · 4.2 1.8
...
...
...
...
...
...
A contains 151 rows (observations over time) by 124 columns
(location of body parts)
Build a low rank model X, Y with rank k = 10
Rows of Y are principal stances person takes while walking
Rows of X decompose each bodily position into combination of
principal stances
Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 4 / 8
Example 2: Compressing Zip Codes
repeat violator ZCTA · · · violations penalty EE’s affected
N/A 70525 · · · 9 8100 0
R 75189 · · · 6 935 5
RW 95621 · · · 4 1155 3
...
...
...
...
...
...
Train: U.S. Wage & Hour Division (WHD) compliance actions
contains 208,806 rows (cases) and 252 columns (violation info)
Response: Was firm a repeat and/or willful violator?
Predictors: Zip code tabulation area (ZCTA), number of violations,
civil penalties, employees (EE’s) affected, etc
Naive approach replaces ZCTA with indicator variables, which 1) is
slow, 2) overfits, 3) cannot transfer knowledge between similar ZCTAs
Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 5 / 8
Example 2: Compressing Zip Codes
ZCTA associate degree bachelor’s degree · · · welsh west indian
01001 1584 1953 · · · 34 57
01002 510 3098 · · · 332 181
01003 27 49 · · · 40 134
...
...
...
...
...
...
American Community Survey (ACS) data contains 32,989 rows
(unique ZCTAs) by 150 columns (population info)
Build a low rank model X, Y with rank k = 10 and regularization to
sparsify features
Rows of Y are demographic archetypes
Rows of X map ZCTAs into combination of demographic archetypes
Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 6 / 8
Example 2: Compressing Zip Codes
Train =



y ZCTA ···
N/A 70525 · · ·
...
... · · ·
R 01002 · · ·


 X =





ZCTA archetypes
01001 —x1—
01002 —x2—
...
...
70525 —xp—





Replace ZCTA col of training data with low rank model (X) of ACS
Predict if firm will be a repeat violator using modified training data
repeat violator archetypes · · · violations penalty EE’s affected
N/A —xp— · · · 9 8100 0
...
...
...
...
...
...
R —x2— · · · 4 225 3
Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 7 / 8
References
M. Udell, S. Boyd, et al (2014), Generalized Low Rank Models
Example 1: Visualizing Walking Stance
Walking Gait Data
Walking Gait Data with Missing Values
Example 2: Compressing Zip Codes
Wage and Hour Division Data
American Community Survey Data
Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 8 / 8

More Related Content

What's hot

Pre-Cal 40S Slides February 29, 2008
Pre-Cal 40S Slides February 29, 2008Pre-Cal 40S Slides February 29, 2008
Pre-Cal 40S Slides February 29, 2008Darren Kuropatwa
 
Spline Interpolation
Spline InterpolationSpline Interpolation
Spline InterpolationaiQUANT
 
AP Calculus Slides September 18, 2007
AP Calculus Slides September 18, 2007AP Calculus Slides September 18, 2007
AP Calculus Slides September 18, 2007Darren Kuropatwa
 
Curve fitting and Optimization
Curve fitting and OptimizationCurve fitting and Optimization
Curve fitting and OptimizationSyahrul Senin
 
Contrastive Divergence Learning
Contrastive Divergence LearningContrastive Divergence Learning
Contrastive Divergence Learningpenny 梁斌
 
Functions for Grade 10
Functions for Grade 10Functions for Grade 10
Functions for Grade 10Boipelo Radebe
 
Logistic Modeling with Applications to Marketing and Credit Risk in the Autom...
Logistic Modeling with Applications to Marketing and Credit Risk in the Autom...Logistic Modeling with Applications to Marketing and Credit Risk in the Autom...
Logistic Modeling with Applications to Marketing and Credit Risk in the Autom...Magnify Analytic Solutions
 
rational expression
rational expressionrational expression
rational expressionrey castro
 
Lesson 25: The Definite Integral
Lesson 25: The Definite IntegralLesson 25: The Definite Integral
Lesson 25: The Definite IntegralMatthew Leingang
 
AP Calculus Jauary 13, 2009
AP Calculus Jauary 13, 2009AP Calculus Jauary 13, 2009
AP Calculus Jauary 13, 2009Darren Kuropatwa
 
Tutorials--Factoring Quadratics
Tutorials--Factoring QuadraticsTutorials--Factoring Quadratics
Tutorials--Factoring QuadraticsMedia4math
 
Presentation of my master thesis - Image Processing
Presentation of my master thesis - Image ProcessingPresentation of my master thesis - Image Processing
Presentation of my master thesis - Image ProcessingMichaelRra
 
Day 1 examples u1f13
Day 1 examples u1f13Day 1 examples u1f13
Day 1 examples u1f13jchartiersjsd
 
AP Calculus Slides December 10, 2007
AP Calculus Slides December 10, 2007AP Calculus Slides December 10, 2007
AP Calculus Slides December 10, 2007Darren Kuropatwa
 
Day 1 examples u1f13
Day 1 examples u1f13Day 1 examples u1f13
Day 1 examples u1f13jchartiersjsd
 
Rational function representation
Rational function representationRational function representation
Rational function representationrey castro
 

What's hot (19)

Pre-Cal 40S Slides February 29, 2008
Pre-Cal 40S Slides February 29, 2008Pre-Cal 40S Slides February 29, 2008
Pre-Cal 40S Slides February 29, 2008
 
CLIM Fall 2017 Course: Statistics for Climate Research, Spatial Data: Models ...
CLIM Fall 2017 Course: Statistics for Climate Research, Spatial Data: Models ...CLIM Fall 2017 Course: Statistics for Climate Research, Spatial Data: Models ...
CLIM Fall 2017 Course: Statistics for Climate Research, Spatial Data: Models ...
 
Spline Interpolation
Spline InterpolationSpline Interpolation
Spline Interpolation
 
AP Calculus Slides September 18, 2007
AP Calculus Slides September 18, 2007AP Calculus Slides September 18, 2007
AP Calculus Slides September 18, 2007
 
Curve fitting and Optimization
Curve fitting and OptimizationCurve fitting and Optimization
Curve fitting and Optimization
 
Contrastive Divergence Learning
Contrastive Divergence LearningContrastive Divergence Learning
Contrastive Divergence Learning
 
Functions for Grade 10
Functions for Grade 10Functions for Grade 10
Functions for Grade 10
 
Logistic Modeling with Applications to Marketing and Credit Risk in the Autom...
Logistic Modeling with Applications to Marketing and Credit Risk in the Autom...Logistic Modeling with Applications to Marketing and Credit Risk in the Autom...
Logistic Modeling with Applications to Marketing and Credit Risk in the Autom...
 
rational expression
rational expressionrational expression
rational expression
 
Functions
FunctionsFunctions
Functions
 
Lesson 25: The Definite Integral
Lesson 25: The Definite IntegralLesson 25: The Definite Integral
Lesson 25: The Definite Integral
 
AP Calculus Jauary 13, 2009
AP Calculus Jauary 13, 2009AP Calculus Jauary 13, 2009
AP Calculus Jauary 13, 2009
 
Tutorials--Factoring Quadratics
Tutorials--Factoring QuadraticsTutorials--Factoring Quadratics
Tutorials--Factoring Quadratics
 
Satalk
SatalkSatalk
Satalk
 
Presentation of my master thesis - Image Processing
Presentation of my master thesis - Image ProcessingPresentation of my master thesis - Image Processing
Presentation of my master thesis - Image Processing
 
Day 1 examples u1f13
Day 1 examples u1f13Day 1 examples u1f13
Day 1 examples u1f13
 
AP Calculus Slides December 10, 2007
AP Calculus Slides December 10, 2007AP Calculus Slides December 10, 2007
AP Calculus Slides December 10, 2007
 
Day 1 examples u1f13
Day 1 examples u1f13Day 1 examples u1f13
Day 1 examples u1f13
 
Rational function representation
Rational function representationRational function representation
Rational function representation
 

Similar to H2O World - GLRM - Anqi Fu

Generalized Low Rank Models
Generalized Low Rank ModelsGeneralized Low Rank Models
Generalized Low Rank ModelsSri Ambati
 
Traffic flow modeling on road networks using Hamilton-Jacobi equations
Traffic flow modeling on road networks using Hamilton-Jacobi equationsTraffic flow modeling on road networks using Hamilton-Jacobi equations
Traffic flow modeling on road networks using Hamilton-Jacobi equationsGuillaume Costeseque
 
Tensor Train decomposition in machine learning
Tensor Train decomposition in machine learningTensor Train decomposition in machine learning
Tensor Train decomposition in machine learningAlexander Novikov
 
Digital Signal Processing Lab Manual
Digital Signal Processing Lab Manual Digital Signal Processing Lab Manual
Digital Signal Processing Lab Manual Amairullah Khan Lodhi
 
机器学习Adaboost
机器学习Adaboost机器学习Adaboost
机器学习AdaboostShocky1
 
Parallelisation of the PC Algorithm (CAEPIA2015)
Parallelisation of the PC Algorithm (CAEPIA2015)Parallelisation of the PC Algorithm (CAEPIA2015)
Parallelisation of the PC Algorithm (CAEPIA2015)AMIDST Toolbox
 
Vlsi model question paper 3 (june 2021)
Vlsi model question paper 3 (june 2021)Vlsi model question paper 3 (june 2021)
Vlsi model question paper 3 (june 2021)PUSHPALATHAV1
 
Panel101R princeton.pdf
Panel101R princeton.pdfPanel101R princeton.pdf
Panel101R princeton.pdfJeanTaipeChvez
 
NYAI #9: Concepts and Questions As Programs by Brenden Lake
NYAI #9: Concepts and Questions As Programs by Brenden LakeNYAI #9: Concepts and Questions As Programs by Brenden Lake
NYAI #9: Concepts and Questions As Programs by Brenden LakeRizwan Habib
 
Projection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsProjection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsUniversity of Glasgow
 
Workload-aware materialization for efficient variable elimination on Bayesian...
Workload-aware materialization for efficient variable elimination on Bayesian...Workload-aware materialization for efficient variable elimination on Bayesian...
Workload-aware materialization for efficient variable elimination on Bayesian...Cigdem Aslay
 
Actuator Constrained Optimal Control of Formations Near the Libration Points
Actuator Constrained Optimal Control of Formations Near the Libration PointsActuator Constrained Optimal Control of Formations Near the Libration Points
Actuator Constrained Optimal Control of Formations Near the Libration PointsBelinda Marchand
 
Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...
Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...
Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...Guillaume Costeseque
 
FINDING FREQUENT SUBPATHS IN A GRAPH
FINDING FREQUENT SUBPATHS IN A GRAPHFINDING FREQUENT SUBPATHS IN A GRAPH
FINDING FREQUENT SUBPATHS IN A GRAPHIJDKP
 
Robust Low-rank and Sparse Decomposition for Moving Object Detection
Robust Low-rank and Sparse Decomposition for Moving Object DetectionRobust Low-rank and Sparse Decomposition for Moving Object Detection
Robust Low-rank and Sparse Decomposition for Moving Object DetectionActiveEon
 
Prpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesPrpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesMohammad
 

Similar to H2O World - GLRM - Anqi Fu (20)

Generalized Low Rank Models
Generalized Low Rank ModelsGeneralized Low Rank Models
Generalized Low Rank Models
 
Traffic flow modeling on road networks using Hamilton-Jacobi equations
Traffic flow modeling on road networks using Hamilton-Jacobi equationsTraffic flow modeling on road networks using Hamilton-Jacobi equations
Traffic flow modeling on road networks using Hamilton-Jacobi equations
 
Tensor Train decomposition in machine learning
Tensor Train decomposition in machine learningTensor Train decomposition in machine learning
Tensor Train decomposition in machine learning
 
Phd Defense 2007
Phd Defense 2007Phd Defense 2007
Phd Defense 2007
 
Real Time Geodemographics
Real Time GeodemographicsReal Time Geodemographics
Real Time Geodemographics
 
Digital Signal Processing Lab Manual
Digital Signal Processing Lab Manual Digital Signal Processing Lab Manual
Digital Signal Processing Lab Manual
 
机器学习Adaboost
机器学习Adaboost机器学习Adaboost
机器学习Adaboost
 
Parallelisation of the PC Algorithm (CAEPIA2015)
Parallelisation of the PC Algorithm (CAEPIA2015)Parallelisation of the PC Algorithm (CAEPIA2015)
Parallelisation of the PC Algorithm (CAEPIA2015)
 
Vlsi model question paper 3 (june 2021)
Vlsi model question paper 3 (june 2021)Vlsi model question paper 3 (june 2021)
Vlsi model question paper 3 (june 2021)
 
Panel101R princeton.pdf
Panel101R princeton.pdfPanel101R princeton.pdf
Panel101R princeton.pdf
 
NYAI #9: Concepts and Questions As Programs by Brenden Lake
NYAI #9: Concepts and Questions As Programs by Brenden LakeNYAI #9: Concepts and Questions As Programs by Brenden Lake
NYAI #9: Concepts and Questions As Programs by Brenden Lake
 
8th Semester (June; July-2014) Computer Science and Information Science Engin...
8th Semester (June; July-2014) Computer Science and Information Science Engin...8th Semester (June; July-2014) Computer Science and Information Science Engin...
8th Semester (June; July-2014) Computer Science and Information Science Engin...
 
3rd Semester (June; July-2014) Civil Engineering Question Papers
3rd Semester (June; July-2014) Civil Engineering Question Papers3rd Semester (June; July-2014) Civil Engineering Question Papers
3rd Semester (June; July-2014) Civil Engineering Question Papers
 
Projection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsProjection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamics
 
Workload-aware materialization for efficient variable elimination on Bayesian...
Workload-aware materialization for efficient variable elimination on Bayesian...Workload-aware materialization for efficient variable elimination on Bayesian...
Workload-aware materialization for efficient variable elimination on Bayesian...
 
Actuator Constrained Optimal Control of Formations Near the Libration Points
Actuator Constrained Optimal Control of Formations Near the Libration PointsActuator Constrained Optimal Control of Formations Near the Libration Points
Actuator Constrained Optimal Control of Formations Near the Libration Points
 
Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...
Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...
Contribution à l'étude du trafic routier sur réseaux à l'aide des équations d...
 
FINDING FREQUENT SUBPATHS IN A GRAPH
FINDING FREQUENT SUBPATHS IN A GRAPHFINDING FREQUENT SUBPATHS IN A GRAPH
FINDING FREQUENT SUBPATHS IN A GRAPH
 
Robust Low-rank and Sparse Decomposition for Moving Object Detection
Robust Low-rank and Sparse Decomposition for Moving Object DetectionRobust Low-rank and Sparse Decomposition for Moving Object Detection
Robust Low-rank and Sparse Decomposition for Moving Object Detection
 
Prpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfacesPrpagation of Error Bounds Across reduction interfaces
Prpagation of Error Bounds Across reduction interfaces
 

More from Sri Ambati

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Generative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxGenerative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxSri Ambati
 
AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek Sri Ambati
 
LLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thLLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thSri Ambati
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionSri Ambati
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Sri Ambati
 
Risk Management for LLMs
Risk Management for LLMsRisk Management for LLMs
Risk Management for LLMsSri Ambati
 
Open-Source AI: Community is the Way
Open-Source AI: Community is the WayOpen-Source AI: Community is the Way
Open-Source AI: Community is the WaySri Ambati
 
Building Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OBuilding Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OSri Ambati
 
Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Sri Ambati
 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersSri Ambati
 
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Sri Ambati
 
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Sri Ambati
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...Sri Ambati
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability Sri Ambati
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email AgainSri Ambati
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Sri Ambati
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...Sri Ambati
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...Sri Ambati
 
AI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneyAI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneySri Ambati
 

More from Sri Ambati (20)

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Generative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptxGenerative AI Masterclass - Model Risk Management.pptx
Generative AI Masterclass - Model Risk Management.pptx
 
AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek AI and the Future of Software Development: A Sneak Peek
AI and the Future of Software Development: A Sneak Peek
 
LLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5thLLMOps: Match report from the top of the 5th
LLMOps: Match report from the top of the 5th
 
Building, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for Production
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
 
Risk Management for LLMs
Risk Management for LLMsRisk Management for LLMs
Risk Management for LLMs
 
Open-Source AI: Community is the Way
Open-Source AI: Community is the WayOpen-Source AI: Community is the Way
Open-Source AI: Community is the Way
 
Building Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OBuilding Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2O
 
Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical
 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM Papers
 
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
 
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email Again
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
 
AI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation JourneyAI Foundations Course Module 1 - An AI Transformation Journey
AI Foundations Course Module 1 - An AI Transformation Journey
 

Recently uploaded

Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningVitsRangannavar
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...aditisharan08
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 

Recently uploaded (20)

Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software Solutions
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 

H2O World - GLRM - Anqi Fu

  • 1. Generalized Low Rank Models Anqi Fu Machine Learning Scientist, H2O.ai anqi@h2o.ai Based on work by Madeleine Udell, Corinne Horn, Reza Zadeh and Stephen Boyd November 6, 2015 Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 1 / 8
  • 2. What is a Low Rank Model? Given: Data table A with m rows and n columns Find: Compressed representation as numeric tables X and Y , where # cols in X = # rows in Y = small user-specified k max(m, n) # cols in Y is d = (total dimension of embedded features in A) ≥ n m      A   n ≈ m     X   k Y k n Row of Y = archetypal feature created from columns of A Row of X = row of A in reduced feature space Can approximately reconstruct A from product XY Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 2 / 8
  • 3. Why use Low Rank Models? Reduce storage space, e.g. 10 GB compressed to 100 MB Increase prediction speed, e.g. 10x speed-up with no accuracy loss Identify and visualize important features Impute missing data entries Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 3 / 8
  • 4. Example 1: Visualizing Walking Stances time forehead (x) forehead (y) · · · right toe (y) right toe (z) t1 1.4 2.7 · · · -0.5 -0.1 t2 2.7 3.5 · · · 1.3 0.9 t3 3.3 -.9 · · · 4.2 1.8 ... ... ... ... ... ... A contains 151 rows (observations over time) by 124 columns (location of body parts) Build a low rank model X, Y with rank k = 10 Rows of Y are principal stances person takes while walking Rows of X decompose each bodily position into combination of principal stances Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 4 / 8
  • 5. Example 2: Compressing Zip Codes repeat violator ZCTA · · · violations penalty EE’s affected N/A 70525 · · · 9 8100 0 R 75189 · · · 6 935 5 RW 95621 · · · 4 1155 3 ... ... ... ... ... ... Train: U.S. Wage & Hour Division (WHD) compliance actions contains 208,806 rows (cases) and 252 columns (violation info) Response: Was firm a repeat and/or willful violator? Predictors: Zip code tabulation area (ZCTA), number of violations, civil penalties, employees (EE’s) affected, etc Naive approach replaces ZCTA with indicator variables, which 1) is slow, 2) overfits, 3) cannot transfer knowledge between similar ZCTAs Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 5 / 8
  • 6. Example 2: Compressing Zip Codes ZCTA associate degree bachelor’s degree · · · welsh west indian 01001 1584 1953 · · · 34 57 01002 510 3098 · · · 332 181 01003 27 49 · · · 40 134 ... ... ... ... ... ... American Community Survey (ACS) data contains 32,989 rows (unique ZCTAs) by 150 columns (population info) Build a low rank model X, Y with rank k = 10 and regularization to sparsify features Rows of Y are demographic archetypes Rows of X map ZCTAs into combination of demographic archetypes Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 6 / 8
  • 7. Example 2: Compressing Zip Codes Train =    y ZCTA ··· N/A 70525 · · · ... ... · · · R 01002 · · ·    X =      ZCTA archetypes 01001 —x1— 01002 —x2— ... ... 70525 —xp—      Replace ZCTA col of training data with low rank model (X) of ACS Predict if firm will be a repeat violator using modified training data repeat violator archetypes · · · violations penalty EE’s affected N/A —xp— · · · 9 8100 0 ... ... ... ... ... ... R —x2— · · · 4 225 3 Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 7 / 8
  • 8. References M. Udell, S. Boyd, et al (2014), Generalized Low Rank Models Example 1: Visualizing Walking Stance Walking Gait Data Walking Gait Data with Missing Values Example 2: Compressing Zip Codes Wage and Hour Division Data American Community Survey Data Anqi Fu (H2O.ai) Generalized Low Rank Models November 6, 2015 8 / 8