Tensor Abuse in the Workplace
How to reuse machine learning frameworks
Agenda
• What are tensors?
• How modern tensor systems work
• It’s not just machine learning
• It’s also not magic
• Machine learning as optimization
• Animation of optimization algorithm
Spoilers
• Trigger warning: Greek letters ahead
– AND lots of animations for developing intuitions
• Tensors have a history in physics
• We don’t use them that way in machine learning
• Tensors are cool because they make it easier to code numerical algorithms
• But auto-derivatives are as big a deal
• And we can abuse this machinery
Linear Operators To the N-th Degree
• Tensors were originally invented for differential geometry
• That gave us relativity
But This Isn’t Physics
• In computing, we use some of the same words
• But they don’t mean the same thing
• We don’t need the Greek letters
• The point is that we have important patterns of computation
– We mostly don’t care about changing coordinate systems
Basic Operations
• Element-wise operations
• Outer products
• Reductions
• Matrix and vector products are special cases (sketched below)
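A minimal sketch of these operations in TensorFlow (the values are invented for illustration; NumPy would look nearly identical):

import tensorflow as tf

a = tf.constant([1.0, 2.0, 3.0])
b = tf.constant([4.0, 5.0, 6.0])

prod  = a * b                        # element-wise product: [4, 10, 18]
outer = tf.tensordot(a, b, axes=0)   # outer product: a 3x3 matrix
total = tf.reduce_sum(a)             # reduction: 6.0
dot   = tf.reduce_sum(a * b)         # dot product = element-wise product + reduction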
This is news?!?
Sounds like APL in sheep’s clothing
But it really is important, because loop structuring is critical
Why This Matters for Machine Learning
• Sums of products and element-wise evaluations are ubiquitous in machine learning
• And these are often wrapped in an outer loop
• That structure is amenable to pipelining on GPUs, but only if the large-scale patterns are visible in the code (the two forms are contrasted below)
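To make that concrete, here is a hypothetical before-and-after (shapes and data invented): the explicit loops bury the sum-of-products pattern in Python control flow, while the single tensor expression exposes the whole pattern to the runtime, which can then pipeline it:

import numpy as np

x = np.random.randn(1000, 64)   # 1000 examples, 64 features
w = np.random.randn(64, 10)     # a 64 x 10 weight matrix

# Loop form: the pattern is invisible to any vectorizing runtime.
y_slow = np.zeros((1000, 10))
for i in range(1000):
    for j in range(10):
        y_slow[i, j] = sum(x[i, k] * w[k, j] for k in range(64))

# Tensor form: the same computation, visible as one large matrix product.
y_fast = x @ w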
But tensors are only half of the story
Machine Learning Has Changed
• Machine learning used to be really hard
– Numeric performance was zilch
– Training data was poor and small
– Learning many layers of an MLP was impossible
– Productivity for new approaches was hampered by code complexity
• Recent advances have changed things (a lot)
– Important new regularization techniques
– Per-coefficient learning-rate techniques
– New gradient-based optimization algorithms
– Automatic differentiation (the sketch below shows each of these as roughly a one-liner)
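A rough illustration (not the deck’s own code) in the TensorFlow 1.x-era API contemporary with these slides; the toy loss is invented:

import tensorflow as tf

w = tf.Variable(tf.ones([3]))
loss = tf.reduce_sum(w ** 2)

# Regularization: dropout as a single op.
dropped = tf.nn.dropout(w, keep_prob=0.5)

# Per-coefficient learning rates and modern gradient methods: Adam keeps
# separate moment estimates for every individual coefficient.
train = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)

# Automatic differentiation: the gradient of any loss w.r.t. any variables.
grad_w = tf.gradients(loss, [w])[0]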
Gradients are a Big Deal
• In low dimensions, simple search techniques work well
– Line search, evolutionary processes, polyhedron warping all work
• It’s different in high dimensions
– You need some guidance about which direction to go
– There are too many ways to get sideways (a quick numeric check below makes this vivid)
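The check (dimensions chosen arbitrarily): a random search direction is nearly orthogonal to the true downhill direction once the dimension is large, so undirected search wastes almost every step:

import numpy as np

rng = np.random.default_rng(0)
for n in [2, 100, 10_000]:
    g = rng.standard_normal(n)   # stand-in for the gradient
    d = rng.standard_normal(n)   # a random search direction
    cos = g @ d / (np.linalg.norm(g) * np.linalg.norm(d))
    print(n, abs(cos))           # shrinks roughly like 1 / sqrt(n)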
Yeah, But …
• Gradients are also a pain in the ***
– Traditionally you had to derive them by hand
– For some problems, like robot kinematics or astrodynamics, the equations would cover many pages
• The big change in recent years is automatic differentiation
– There are some limits on how problems can be formulated
– But quite complicated forms are handled easily (sketched below)
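A minimal sketch of what that buys you, again in the 1.x-era API (the function is made up, but it is exactly the kind of expression nobody wants to differentiate by hand):

import tensorflow as tf

x = tf.Variable(2.0)
y = tf.sin(x ** 2) / (1.0 + tf.exp(-x))   # some awkward composite function

grad = tf.gradients(y, x)[0]              # dy/dx, derived automatically

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(grad))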
import tensorflow as tf   # assumes x, y, weights, biases are defined elsewhere

hidden = tf.nn.elu(x - biases)             # shift inputs, apply ELU element-wise
y_pred = tf.matmul(hidden, weights)        # linear read-out layer
loss = tf.reduce_mean((y - y_pred) ** 2)   # mean squared error
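For context, a self-contained (entirely hypothetical) completion of that fragment, with invented shapes, placeholder data, and a training loop driven by autodiff and Adam:

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4])
y = tf.placeholder(tf.float32, [None, 1])
biases = tf.Variable(tf.zeros([4]))
weights = tf.Variable(tf.random_normal([4, 1]))

hidden = tf.nn.elu(x - biases)
y_pred = tf.matmul(hidden, weights)
loss = tf.reduce_mean((y - y_pred) ** 2)
train = tf.train.AdamOptimizer(0.01).minimize(loss)

data_x = np.random.randn(256, 4).astype(np.float32)
data_y = np.random.randn(256, 1).astype(np.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(200):
        sess.run(train, {x: data_x, y: data_y})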
The Lessons to Learn
• These new systems are astounding
– Tensors allow us to encode very complex algorithms
– Auto-differentiation allows us to use gradients
– New optimizers solve very nasty problems
– The same code can drive CPUs, GPUs, or clusters
– The code is strange and abstract, but not that hard
• Not just for breakfast any more
– Machine learning isn’t the only thing you can do with these systems
– Go for it! (a parting example follows)
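In that spirit, one parting sketch of “tensor abuse” with nothing to do with machine learning: trilateration, i.e. recovering a position from noisy range measurements, using the same tensors, autodiff, and off-the-shelf optimizer (all numbers invented):

import numpy as np
import tensorflow as tf

anchors = np.array([[0., 0.], [4., 0.], [0., 3.]], dtype=np.float32)
ranges = np.array([2.9, 2.4, 2.2], dtype=np.float32)   # noisy measurements

p = tf.Variable([1.0, 1.0])                    # unknown position
dists = tf.norm(anchors - p, axis=1)           # predicted range to each anchor
loss = tf.reduce_sum((dists - ranges) ** 2)    # plain least squares
train = tf.train.AdamOptimizer(0.05).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(500):
        sess.run(train)
    print(sess.run(p))   # the estimated position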
Q&A
ENGAGE WITH US
@mapr
maprtechnologies
yourname@mapr.com
