The rapid growth in the size and scope of datasets in science and technology has created a need for novel foundational perspectives on data analysis that blend computer science and statistics. That classical perspectives from these fields are not adequate to address emerging challenges with massive datasets is apparent from their sharply divergent nature at an elementary level: in computer science, the growth of the number of data points is a source of "complexity" that must be tamed via algorithms or hardware, whereas in statistics, the growth of the number of data points is a source of "simplicity" in that inferences are generally stronger and asymptotic results can be invoked. In classical statistics, one usually considers the increase in inferential accuracy as the number of data points grows (with little formal consideration of computational complexity), while in classical numerical computation, one typically analyzes the improvement in accuracy as more computational resources such as space or time are employed (with the size of a dataset not formally viewed as a resource). In this talk we describe some of our research efforts towards addressing the question of trading off the amount of data and the amount of computation required to achieve a desired inferential accuracy.
CLIM Program: Remote Sensing Workshop, Computational and Statistical Trade-offs in Data Analysis - Venkat Chandrasekaran, Feb 12, 2018
1. Computational and Statistical Tradeoffs
Venkat Chandrasekaran
Caltech
Joint with Michael Jordan (Berkeley), Yong Sheng Soh (Caltech), Quentin Berthet (Cambridge)
3. Classical Analysis in Statistics
o Previously, key bottleneck: amount of data
o Consider "best" estimator without much regard for computational considerations
o Minimax analysis
o More recently, time/memory/other infrastructure is the bottleneck
o Data is plentiful in several domains
o Need to incorporate these additional constraints formally
5. A Thought Experiment
o The standard inferential tradeoff
o Process 5MB of data in 1 hour, error = 0.1
o Process 5TB in 1 month, error = 0.0001
o Given 5TB of data, happy with error = 0.1
o Don’t care about small improvements in error
o Statistical models are only approximations of reality
o More data useful for less computation?
6. Computer Science vs. Statistics
[Figure: schematic with three axes (Error; Amount of data; Runtime / Space / …): error vs. amount of data is the classical statistics tradeoff, error vs. runtime/space is the classical computer science tradeoff, and the joint picture gives computation/statistics tradeoffs]
7. Time-Data Tradeoffs
o Consider an inference problem with fixed (desired) error
o Inference procedures viewed as points in plot
[Figure: runtime vs. number of samples, with inference procedures A–F shown as points; the minimax risk lower bound on the number of samples is well understood, while the computational lower bound on runtime is poorly understood and depends on the computational model]
8. Time-Data Tradeoffs
o Consider an inference problem with fixed (desired) error
o Need "weaker" algorithms for larger datasets to reduce runtime
o At some stage, throw away data
[Figure: same runtime vs. number of samples plot, with procedures A–F]
9. An Estimation Problem
o Signal from known (bounded) set
o Noise
o Observation model
o Observe i.i.d. samples
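(The formulas on this slide were not captured. One standard reading of the bullets, stated here as an assumption rather than as the slide's exact notation, is the Gaussian denoising model:)

    \[
    x^\star \in S \subset \mathbb{R}^p \ \ (S \text{ known and bounded}), \qquad
    z_i \sim \mathcal{N}(0, \sigma^2 I_{p \times p}), \qquad
    y_i = x^\star + z_i, \quad i = 1, \dots, n \ \ \text{(i.i.d.)}.
    \]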
10. Convex Programming Estimator
o Sample mean is sufficient statistic
o Natural M-estimator
o Convex programming M-estimator
o Constraint is a convex set such that
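(Since the constraint set and the optimization problem on this slide were not captured, here is a minimal numerical sketch, in Python, of the general recipe: average the samples, then project the sample mean onto a convex set C containing the signal set. The Euclidean ball below is a stand-in chosen only because its projection has a closed form; it is not the relaxation used in the talk, and all sizes are illustrative.)

    import numpy as np

    rng = np.random.default_rng(0)
    p, n, sigma = 100, 50, 1.0                     # illustrative sizes, not values from the talk
    x_star = np.sign(rng.standard_normal(p))       # a signal from a bounded set (here {-1, +1}^p)
    y = x_star + sigma * rng.standard_normal((n, p))

    y_bar = y.mean(axis=0)                         # sample mean: the sufficient statistic

    def project_l2_ball(v, radius):
        """Euclidean projection onto {x : ||x||_2 <= radius}, a stand-in for the convex set C."""
        norm = np.linalg.norm(v)
        return v if norm <= radius else (radius / norm) * v

    x_hat = project_l2_ball(y_bar, radius=np.sqrt(p))  # C must contain the signal set
    print("squared estimation error:", np.linalg.norm(x_hat - x_star) ** 2)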
11. Convex Programming Estimator
o Long history of shrinkage estimation in statistics
o James, Stein (1961)
o Donoho, Johnstone (early 1990s)
o Shrinkage onto convex sets for tractability
o Many surprises in high dimensions, i.e., large p
o More recently
o L1 norm, trace norm, max norm, …
13. Statistical Performance of Estimator
o Defn 1: The cone of feasible directions into a convex set C is defined as
o Defn 2: The Gaussian (squared) complexity of a cone T is defined as
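(The displayed formulas for both definitions were lost in extraction. Reconstructions consistent with standard usage in this line of work, stated as assumptions since the slide's exact notation is unavailable:)

    \[
    T_C(x^\star) \;=\; \{\, d \;:\; x^\star + \varepsilon d \in C \ \text{for some } \varepsilon > 0 \,\},
    \qquad
    g(T) \;=\; \mathbb{E}_{w \sim \mathcal{N}(0, I)}\Big[\, \sup_{d \in T,\ \|d\|_2 \le 1} \langle w, d \rangle^2 \,\Big].
    \]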
14. Statistical Performance of Estimator
o Thm: The risk of the estimator is
o Proof: Apply optimality conditions
o Intuition: Only consider error in feasible cone
o Many improvements possible to this theorem
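(The bound itself did not survive extraction. Assuming the model and definitions sketched above, and suppressing universal constants, the standard form of such a risk bound is:)

    \[
    \mathbb{E}\,\big\| \hat{x}_n(C) - x^\star \big\|_2^2 \;\lesssim\; \frac{\sigma^2}{n}\; g\big(T_C(x^\star)\big),
    \]

where \hat{x}_n(C) denotes the convex programming estimator over the constraint set C computed from n samples. The "feasible cone" intuition is that the error \hat{x}_n(C) - x^\star always lies in T_C(x^\star), so only the noise energy along directions in that cone matters.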
16. Weakening via Convex Relaxation
o Corr: To obtain risk of at most 1,
Monotonic in C
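(Reading the truncated corollary together with the "Monotonic in C" annotation, and again suppressing constants, the sample-size requirement would be:)

    \[
    n \;\gtrsim\; \sigma^2\, g\big(T_C(x^\star)\big) \quad\Longrightarrow\quad
    \mathbb{E}\,\big\|\hat{x}_n(C) - x^\star\big\|_2^2 \;\le\; 1 .
    \]

Monotonicity here means: if C \subseteq C' (both containing x^\star), then T_C(x^\star) \subseteq T_{C'}(x^\star) and hence g(T_C(x^\star)) \le g(T_{C'}(x^\star)), so a looser relaxation needs at least as many samples; the payoff is that projecting onto the looser set is typically cheaper.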
17. Weakening via Convex Relaxation
o If we have access to larger n, can use larger C
o Obtain "weaker" estimation algorithm
18. Hierarchy of Convex Relaxations
o If the signal set is “algebraic”, then one can obtain a family of outer convex approximations
o Polyhedral, semidefinite, hyperbolic, relative entropy relaxations
(Sherali-Adams, Boyd, Parrilo, Lasserre, Renegar, C., Shah)
o Sets ordered by computational complexity
o Central role played by lift-and-project
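(A concrete instance of such an outer approximation, offered as an illustration rather than as the example on the slide: the convex hull of the cut matrices appearing in Example 1 below is the cut polytope, which is intractable to optimize over, but it sits inside the elliptope, a tractable semidefinite relaxation:)

    \[
    \mathrm{conv}\{\, z z^{\mathsf{T}} \;:\; z \in \{-1,+1\}^m \,\} \;\subseteq\;
    \{\, X \succeq 0 \;:\; X_{ii} = 1 \ \text{for all } i \,\},
    \]

and a suitably scaled nuclear-norm ball gives a still weaker, still cheaper outer approximation, so the sets are naturally ordered by computational complexity.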
19. Before we get to examples …
o How do we calculate runtime?
o Total runtime = np + # ops for projection
‒ The np term (computing the sample mean) increases with more data
‒ The projection term (subsequent processing) decreases with more data
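(A toy bookkeeping sketch, in Python, of this accounting. Only the structure total = n·p + projection cost comes from the slide; the procedure names and all costs below are made-up placeholders.)

    # Toy accounting of total runtime = n*p (sample mean) + cost of the projection step.
    # All numbers are hypothetical, purely to illustrate the bookkeeping.
    p = 10_000                                   # ambient dimension (assumed)
    procedures = {
        # name: (samples needed to reach the target risk, cost of one projection)
        "tight relaxation":  (2_000,   5e9),     # few samples, expensive projection
        "looser relaxation": (20_000,  1e7),     # more samples, cheap projection
        "sample mean only":  (500_000, 0),       # no projection at all
    }
    for name, (n, proj_cost) in procedures.items():
        total = n * p + proj_cost                # averaging cost + subsequent processing
        print(f"{name:18s} n = {n:>7d}   total ops ~ {total:.2e}")

With these placeholder numbers the looser relaxation has the smallest total, which is the qualitative point of the examples that follow.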
20. Example 1
o Signal set consists of cut matrices
o E.g., collaborative filtering, clustering
21. Example 2
o Signal set consists of all perfect matchings in complete graph
o E.g., network inference
22. Example 3
o Banding estimators for covariance matrices
o Bickel-Levina [2007]; Cai, Zhang, Zhou [2010]
o Assume known variable ordering
o Stylized problem: let M be a known tridiagonal matrix
o Signal set
23. Example 4
o Signal set consists of all adjacency matrices of graphs with only a clique on a square root of the number of nodes
o E.g., sparse PCA, gene expression patterns
o Kolar et al. (2010)
25. Example 4
o What if we use an even weaker relaxation?
o E.g., just take the sample mean as the estimate
o In this case, it makes sense to throw away data …
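(For this extreme case the calculation is standard and exact, independent of the formula lost from the slide: with no constraint at all, i.e. C = \mathbb{R}^p, the estimator is just the sample mean, and)

    \[
    \mathbb{E}\,\big\| \bar{y} - x^\star \big\|_2^2 \;=\; \frac{\sigma^2 p}{n},
    \]

so risk at most 1 requires n \ge \sigma^2 p samples. Beyond that point, extra samples only add to the n·p averaging cost without being needed to meet the target accuracy, which is why discarding data is sensible here.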
26. Recall Plot …
o At some stage, throw away data
o Several caveats
‒ Need computational lower bounds
‒ Depends on computational architecture
‒ Parallel vs. serial, …
[Figure: runtime vs. number of samples plot with procedures A–F, repeated]
27. Beyond Denoising Problems …
o Consider a high-dimensional time series
o Goal: reliably estimate change-points
28. Change-Point Estimation
o Suppose we sample more frequently, but the number of changes remains the same
o Can again use convex relaxation to obtain time-data tradeoffs
o Joint work with Y.S. Soh (Caltech)
Soh, C., “High-Dimensional Changepoint Estimation: Combining Filtering and Convex Optimization,” ACHA, 2016
29. Heterogeneous Data
o Preceding setups assume unrealistically that data are “stationary” (same type, quality, etc.)
o How do we optimally deploy computational and other resources to process data from disparate sources?
o Preliminary efforts (with Q. Berthet), but a lot remains to be done
[Figure: geo-monitoring stations around California + cellphone seismometers in Pasadena = ?]
Berthet, C., “Resource Allocation for Statistical Estimation,” Proc. of the IEEE, 2016
30. Summary
o Many action items…
o Large-scale practical demonstrations
o Other types of inference problems, e.g., covariance computation
o Mathematically codify network / communication / computing constraints?
o Ultimately inform data-gathering exercises?
users.cms.caltech.edu/~venkatc