• Share
  • Email
  • Embed
  • Like
  • Save
  • Private Content
[Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)
 

[Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research)

on

  • 3,291 views

Abstract:...

Abstract:

Machine learning researchers and practitioners develop computer
algorithms that "improve performance automatically through
experience". At Google, machine learning is applied to solve many
problems, such as prioritizing emails in Gmail, recommending tags for
YouTube videos, and identifying different aspects from online user
reviews. Machine learning on big data, however, is challenging. Some
"simple" machine learning algorithms with quadratic time complexity,
while running fine with hundreds of records, are almost impractical to
use on billions of records.

In this talk, I will describe lessons drawn from various Google
projects on developing large scale machine learning systems. These
systems build on top of Google's computing infrastructure such as GFS
and MapReduce, and attack the scalability problem through massively
parallel algorithms. I will present the design decisions made in
these systems, strategies of scaling and speeding up machine learning
systems on web scale data.


Speaker biography:

Max Lin is a software engineer with Google Research in New York City
office. He is the tech lead of the Google Prediction API, a machine
learning web service in the cloud. Prior to Google, he published
research work on video content analysis, sentiment analysis, machine
learning, and cross-lingual information retrieval. He had a PhD in
Computer Science from Carnegie Mellon University.

Statistics

Views

Total Views
3,291
Views on SlideShare
3,000
Embed Views
291

Actions

Likes
5
Downloads
96
Comments
1

6 Embeds 291

http://hungyilo.blogspot.com 189
http://skaasj.wordpress.com 81
http://hungyilo.blogspot.tw 12
http://hungyilo.blogspot.ca 7
http://hungyilo.blogspot.ru 1
http://hungyilo.blogspot.hk 1

Accessibility

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel

11 of 1 previous next

  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
  • Add fine-grained thread level parallelism for machine learning to Hadoop with zero MapReduce. http://bigdata.pervasive.com/Products/Analytic-Engine-Pervasive-DataRush.aspx
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research) [Harvard CS264] 09 - Machine Learning on Big Data: Lessons Learned from Google Projects (Max Lin, Google Research) Presentation Transcript