Ikhlaq Sidhu, content author
IKHLAQ SIDHU
Chief Scientist and Founding Director
Sutardja Center for Entrepreneurship & Technology
IEOR Emerging Area Professor
Department of Industrial Engineering & Operations Research, UC Berkeley
DATA AND AI
APPLICATIONS, TOOLS, TECHNOLOGY DIRECTIONS
Ikhlaq Sidhu, content author
SUTARDJA CENTER
AT BERKELEY
M ETR I C S AT A G LANC E
CHALLENGE
LAB
GLOBAL
PROFESSORS
NZTV
SELF-DRIVING
COLLIDER
DATA-X SCET IN TAIWAN
JOHN BATTELLE
FOUNDER WIRED
MAGAZINE
NEWS
COVERAGE
Undergraduate:
• 12-14 Courses, 1500+
Undergraduates
X at Berkeley: Graduate, Labs and
Professional
• 80+ Grad students
• 100+ Executives
• Labs: Data-X, Blockchain,
Sustainable Food
Ecosystem:
• 14+ Global Partners
• 500+ Executives
• 50+ Investors
MARRISA
MAYER
Ikhlaq Sidhu, content author
IEOR135/290AppliedDataScience
withVentureApplications
SupportedbytheData-XLab
Ikhlaq Sidhu, content author
• Detection of fake news
• Prediction of long-term energy prices
• Automatic recycling through image recognition
• AI for crime detection, traffic guidance, medical
diagnostics, etc.
• A version of Zillow that is recalculated with the
effects of AirBnB income
• Signal processing and pattern analysis to
improve earthquake warning systems
• Early Autism Detection
• Secure Health Records stored on a Blockchain
find many, many more at:
www.data-x.blog/projects
Data-X Project Examples
Ikhlaq Sidhu, content author
• 350 alumni students
• 50% avg enrollment increase / semester
• 80+ great projects completed
• 8+ published research papers
• 100+ industry experts in network
• 20+ students got employment as data
scientists only because they took Data-X
Amazing testimonials:
I think this class is so awesome because it teaches the tools and concepts that are most
commonly used in workplace teams that are involved with data science and applied
machine learning.
We are building on expertise from Data-X, our highly applied
Data Science & AI/ML Lab and Course
Ikhlaq Sidhu, content author
Berkeley X-Labs
A New Model for Applied Research Labs
Future of X:
Data, AI,
Blockchain
Expert team
formation
Innovation
Mindset
X
Better University to Industry Connection Models
ü Projects with Business Results
ü Global Teams, Visiting Scholars
ü Company Collaborations
ü Executive Education and Train Trainer Models
ü Self Evaluation Tools
ü Startups Acceleration
Ikhlaq Sidhu, content author
Data and AI Application
Ikhlaq Sidhu, content author
Basic Concept of Working with Data
• Data Wrangling
• In Production
Ikhlaq Sidhu, content author
Real-life Example: ZestCash
• “All data is credit data”
Online Loan Application
Name: JOE SMITH
Online Loan Application
Name: Joe Smith
The data says: greater credit risk! The data says: lesser credit risk!
Reference: Shomit Ghose
Example: Data and Information is a competitive advantage
Ikhlaq Sidhu, content author
Harrah’s Casino: Knowing your customer
• Service provider
of Gambling and
Casinos
• Entry Card
• Pain points
• Intervention
Reference: Supercrunchers
Ikhlaq Sidhu, content author
Customer
Insight/
Engagement
Operations:
Reliable &
Predictable
Security &
Fraud
Financial Firms
Network Security
Common Applications as of today …
Ikhlaq Sidhu, content author
Who Will Control the Automobile?
Does car need to buy gas? Gas
stations will want to know.
You can sell a new
car once, but you
can sell the data
every minute of
the day
The world’s two largest
sources of data
• Google? or Ford?
– Whoever has the better software and data science
team
– Winner will get the vast (and incredibly valuable)
streams of auto data
Who caused the accident?
Insurance companies will
want to know.
Shomit Ghose
Ikhlaq Sidhu, content author
Where Does Data Come From?
Ikhlaq Sidhu, content author
Where Does Data Come From?
Real-life Example: ZestCash
• “All data is credit data”
Online Loan Application
Name: JOE SMITH
Online Loan Application
Name: Joe Smith
The data says: greater credit risk! The data says: lesser credit risk!
Your Own Web Site Public Data Sets
Stock market, etc.
IOT/Sensors Other Web Sites
Ikhlaq Sidhu, content author
Web Scraping
https://github.com/ikhlaqsidhu/data-x/tree/master/03-tools-webscraping-crawling_api_afo
https://github.com/ikhlaqsidhu/data-x
Ikhlaq Sidhu, content author
Formatting Data
Ikhlaq Sidhu, content author
An ML High Level Framework
• Objects
• Events /
Experiments
• People /
Customers
• Products
• Stocks
• …
In Real Life
Features, but also
loss of information
In Sample
Out of Sample
Person 1
Person 2
Person 3
.
.
.
Person
N
- Characteristics
- Patterns
- Models
- Predictions
- Similarities
- Differences
- Distance
Some data
has observed results
Ikhlaq Sidhu, content author
CS: Table
Math: Matrix X, with N rows – each person
m columns, each feature (age, salary, ..)
X =
• Objects
• Events /
Experiments
• People /
Customers
• Products
• Stocks
• …
In Real Life
Features, but also
loss of information
In Sample
Out of Sample
Person 1
Person 2
Person 3
.
.
.
Person
N
- Characteristics
- Patterns
- Models
- Predictions
- Similarities
- Differences
- Distance
Some data
has observed results
An ML High Level Framework
Ikhlaq Sidhu, content author
Traditionally 2 Tasks: Classification & Predictive Scoring
The most famous
application has been
recommendation:
“which other user is
most like you”
Extracted Data
often in
Table
Format
Classification:
Cats and Dogs, Speech Recognition
Movie Recommendation
Scoring:
Credit Score, Movie Rating
Heath Score, Any Isoquant…
Ikhlaq Sidhu, content author
X Y
X Y
X YML Algorithms Guess
this function F(x)
We have now switched
to Neural Networks as
Function Approximators
Ikhlaq Sidhu, content author
Neural net results are close t human results
Ikhlaq Sidhu, content author
Data and AI Future Directions
Ikhlaq Sidhu, content author
Peter Abbeel – Deep
Reinforcement Learning
Peter Abbeel
Professor at UC Berkeley
Ikhlaq Sidhu, content author
Ikhlaq Sidhu, content author
Recent AI News
Source: Ken Goldberg, CPAR, People and Robotics Initiative
Ikhlaq Sidhu, content author
Does this mean AI Can Do
Everything Better than Humans
Ikhlaq Sidhu, content author
Perfect Information vs. Real World
fully observed uncertain
discrete multi-agentsingle agent
infinite time horizon
continuous
finite
Ken Goldberg UC Berkeley
Even then, AI Cannot Solve Real Life Problems Better Than Humans
And in fact, AI Can not even Work without Humans
Ken Goldberg
Leading AI
Researcher at
Berkeley
Professor and
Department Chair,
IEOR
William S. Floyd Jr.
Distinguished Chair
Ikhlaq Sidhu, content author
Acknowledgement to Ken Goldberg UC Berkeley
AI Systems Only Work because of Human are Part of the System
Google Operations
People Write Web Pages People at Google Tune
the Results
People Click on What
They Want
Result
Feedback
By clicks
Massive
Data
There is no “Intelligence”, “Desire”, or “Existence” in AI without People
There are only people who “invest in, design and operate the machines”
Ikhlaq Sidhu, content author
37
faculty
At Berkeley, we have a lot of research on
“How Machines Will Work as Part of Larger Systems
that Work with People”
Ikhlaq Sidhu, content author
My
Drive
Home
From
Berkeley
Ikhlaq Sidhu, content author
Autonomous Driving and Driver-Assist
•Communicating intent
•Driver-in-the-loop modeling
•Two-way learning: knowledge
transfer between vehicle and
driver
•Safety in autonomous and
assisted driving
Principal investigators:
Ken Goldberg
UC Berkeley
Anca Dragan
UC Berkeley
Trevor Darrell
UC Berkeley
Francesco Borrelli
UC Berkeley
Ruzena Bajcsy
UC Berkeley
Source: Ken Goldberg, CPAR, People and Robotics Initiative
Ikhlaq Sidhu, content authorSource: Ken Goldberg, CPAR, People and Robotics Initiative
Safety in Human-Robot Interaction:
Guarantees and Verification
Safety-constrained motion
planning for efficiency in factory
human-robot interaction
Learning and prediction for safety
in HRI
Provably safe human-centric
autonomy
Masayoshi Tomizuka
UC Berkeley
Principal investigators:
Claire Tomlin
UC Berkeley
Francesco Borrelli
UC Berkeley
Ikhlaq Sidhu, content author
• Large-scale machine learning - amounts of data
• Deep learning - recognition, classification
• Reinforcement learning - time sequence, aided by Neural Networks
• Robotics - beyond navigation, to safe interaction
• Computer vision - most prominent perception, better than human
• Natural Language Processing - interacting with people/dialog
• Collaborative systems - autonomous systems w/people + machines using
complimentary functions
• Crowdsourcing and human computation – harness human intelligence, uses
other AI, vision, ML, NLP, …
• Algorithmic game theory and computational social choice – systems using
social computing, incentives, prediction markets, game theory, peer
prediction, scoring rules, no regret learning
• Internet of Things (IoT) – using AI to unravel sensory information, interfaces,
and protocols
• Neuromorphic Computing – new computing fabrics based on biological
models
http://ai100.stanford.edu/2016-report
New
Data/AI
Systems
Most Common Data/AI Research Trends in 2017
Ikhlaq Sidhu, content author
Unsupervised Image to Image
[CycleGAN: Zhu, Park, Isola & Efros, 2017] Pieter Abbeel -- UC Berkeley | Gradescope | Covariant.AI
Generative Models
n “What I cannot create, I do not understand.”
n Ability to generate data that look real entails some form of
understanding
[Radford, Metz & Chintala, ICLR 2016]
Typical CNN converts image to output vector of features
Ikhlaq Sidhu, content author
Ikhlaq Sidhu, content author
Large Investment and Valuations
Some Very High Valuations
Pieter Abbeel -- UC Berkeley | Gradescope | Covariant.AI
Big Investments in China
Pieter Abbeel -- UC Berkeley | Gradescope | Covariant.AI
Big Investments in China
Pieter Abbeel -- UC Berkeley | Gradescope
Recent Headlines
Pieter Abbeel -- UC B
Recent Headlines
[Macron (France), March 2018] Pieter Abbeel -- UC Berkeley | Gradescope | Covariant.AI
Ikhlaq Sidhu, content author
Contact:
Ikhlaq Sidhu
Founding Faculty Director, Center for Entrepreneurship & Technology
IEOR Emerging Area Professor, UC Berkeley
sidhu @berkeley.edu, scet.berkeley.edu

DATA AND AI APPLICATIONS, TOOLS, TECHNOLOGY DIRECTIONS

  • 1.
    Ikhlaq Sidhu, contentauthor IKHLAQ SIDHU Chief Scientist and Founding Director Sutardja Center for Entrepreneurship & Technology IEOR Emerging Area Professor Department of Industrial Engineering & Operations Research, UC Berkeley DATA AND AI APPLICATIONS, TOOLS, TECHNOLOGY DIRECTIONS
  • 2.
    Ikhlaq Sidhu, contentauthor SUTARDJA CENTER AT BERKELEY M ETR I C S AT A G LANC E CHALLENGE LAB GLOBAL PROFESSORS NZTV SELF-DRIVING COLLIDER DATA-X SCET IN TAIWAN JOHN BATTELLE FOUNDER WIRED MAGAZINE NEWS COVERAGE Undergraduate: • 12-14 Courses, 1500+ Undergraduates X at Berkeley: Graduate, Labs and Professional • 80+ Grad students • 100+ Executives • Labs: Data-X, Blockchain, Sustainable Food Ecosystem: • 14+ Global Partners • 500+ Executives • 50+ Investors MARRISA MAYER
  • 3.
    Ikhlaq Sidhu, contentauthor IEOR135/290AppliedDataScience withVentureApplications SupportedbytheData-XLab
  • 4.
    Ikhlaq Sidhu, contentauthor • Detection of fake news • Prediction of long-term energy prices • Automatic recycling through image recognition • AI for crime detection, traffic guidance, medical diagnostics, etc. • A version of Zillow that is recalculated with the effects of AirBnB income • Signal processing and pattern analysis to improve earthquake warning systems • Early Autism Detection • Secure Health Records stored on a Blockchain find many, many more at: www.data-x.blog/projects Data-X Project Examples
  • 5.
    Ikhlaq Sidhu, contentauthor • 350 alumni students • 50% avg enrollment increase / semester • 80+ great projects completed • 8+ published research papers • 100+ industry experts in network • 20+ students got employment as data scientists only because they took Data-X Amazing testimonials: I think this class is so awesome because it teaches the tools and concepts that are most commonly used in workplace teams that are involved with data science and applied machine learning. We are building on expertise from Data-X, our highly applied Data Science & AI/ML Lab and Course
  • 6.
    Ikhlaq Sidhu, contentauthor Berkeley X-Labs A New Model for Applied Research Labs Future of X: Data, AI, Blockchain Expert team formation Innovation Mindset X Better University to Industry Connection Models ü Projects with Business Results ü Global Teams, Visiting Scholars ü Company Collaborations ü Executive Education and Train Trainer Models ü Self Evaluation Tools ü Startups Acceleration
  • 7.
    Ikhlaq Sidhu, contentauthor Data and AI Application
  • 8.
    Ikhlaq Sidhu, contentauthor Basic Concept of Working with Data • Data Wrangling • In Production
  • 9.
    Ikhlaq Sidhu, contentauthor Real-life Example: ZestCash • “All data is credit data” Online Loan Application Name: JOE SMITH Online Loan Application Name: Joe Smith The data says: greater credit risk! The data says: lesser credit risk! Reference: Shomit Ghose Example: Data and Information is a competitive advantage
  • 10.
    Ikhlaq Sidhu, contentauthor Harrah’s Casino: Knowing your customer • Service provider of Gambling and Casinos • Entry Card • Pain points • Intervention Reference: Supercrunchers
  • 11.
    Ikhlaq Sidhu, contentauthor Customer Insight/ Engagement Operations: Reliable & Predictable Security & Fraud Financial Firms Network Security Common Applications as of today …
  • 12.
    Ikhlaq Sidhu, contentauthor Who Will Control the Automobile? Does car need to buy gas? Gas stations will want to know. You can sell a new car once, but you can sell the data every minute of the day The world’s two largest sources of data • Google? or Ford? – Whoever has the better software and data science team – Winner will get the vast (and incredibly valuable) streams of auto data Who caused the accident? Insurance companies will want to know. Shomit Ghose
  • 13.
    Ikhlaq Sidhu, contentauthor Where Does Data Come From?
  • 14.
    Ikhlaq Sidhu, contentauthor Where Does Data Come From? Real-life Example: ZestCash • “All data is credit data” Online Loan Application Name: JOE SMITH Online Loan Application Name: Joe Smith The data says: greater credit risk! The data says: lesser credit risk! Your Own Web Site Public Data Sets Stock market, etc. IOT/Sensors Other Web Sites
  • 15.
    Ikhlaq Sidhu, contentauthor Web Scraping https://github.com/ikhlaqsidhu/data-x/tree/master/03-tools-webscraping-crawling_api_afo https://github.com/ikhlaqsidhu/data-x
  • 16.
    Ikhlaq Sidhu, contentauthor Formatting Data
  • 17.
    Ikhlaq Sidhu, contentauthor An ML High Level Framework • Objects • Events / Experiments • People / Customers • Products • Stocks • … In Real Life Features, but also loss of information In Sample Out of Sample Person 1 Person 2 Person 3 . . . Person N - Characteristics - Patterns - Models - Predictions - Similarities - Differences - Distance Some data has observed results
  • 18.
    Ikhlaq Sidhu, contentauthor CS: Table Math: Matrix X, with N rows – each person m columns, each feature (age, salary, ..) X = • Objects • Events / Experiments • People / Customers • Products • Stocks • … In Real Life Features, but also loss of information In Sample Out of Sample Person 1 Person 2 Person 3 . . . Person N - Characteristics - Patterns - Models - Predictions - Similarities - Differences - Distance Some data has observed results An ML High Level Framework
  • 19.
    Ikhlaq Sidhu, contentauthor Traditionally 2 Tasks: Classification & Predictive Scoring The most famous application has been recommendation: “which other user is most like you” Extracted Data often in Table Format Classification: Cats and Dogs, Speech Recognition Movie Recommendation Scoring: Credit Score, Movie Rating Heath Score, Any Isoquant…
  • 20.
    Ikhlaq Sidhu, contentauthor X Y X Y X YML Algorithms Guess this function F(x) We have now switched to Neural Networks as Function Approximators
  • 21.
    Ikhlaq Sidhu, contentauthor Neural net results are close t human results
  • 22.
    Ikhlaq Sidhu, contentauthor Data and AI Future Directions
  • 23.
    Ikhlaq Sidhu, contentauthor Peter Abbeel – Deep Reinforcement Learning Peter Abbeel Professor at UC Berkeley
  • 24.
  • 25.
    Ikhlaq Sidhu, contentauthor Recent AI News Source: Ken Goldberg, CPAR, People and Robotics Initiative
  • 26.
    Ikhlaq Sidhu, contentauthor Does this mean AI Can Do Everything Better than Humans
  • 27.
    Ikhlaq Sidhu, contentauthor Perfect Information vs. Real World fully observed uncertain discrete multi-agentsingle agent infinite time horizon continuous finite Ken Goldberg UC Berkeley Even then, AI Cannot Solve Real Life Problems Better Than Humans And in fact, AI Can not even Work without Humans Ken Goldberg Leading AI Researcher at Berkeley Professor and Department Chair, IEOR William S. Floyd Jr. Distinguished Chair
  • 28.
    Ikhlaq Sidhu, contentauthor Acknowledgement to Ken Goldberg UC Berkeley AI Systems Only Work because of Human are Part of the System Google Operations People Write Web Pages People at Google Tune the Results People Click on What They Want Result Feedback By clicks Massive Data There is no “Intelligence”, “Desire”, or “Existence” in AI without People There are only people who “invest in, design and operate the machines”
  • 29.
    Ikhlaq Sidhu, contentauthor 37 faculty At Berkeley, we have a lot of research on “How Machines Will Work as Part of Larger Systems that Work with People”
  • 30.
    Ikhlaq Sidhu, contentauthor My Drive Home From Berkeley
  • 31.
    Ikhlaq Sidhu, contentauthor Autonomous Driving and Driver-Assist •Communicating intent •Driver-in-the-loop modeling •Two-way learning: knowledge transfer between vehicle and driver •Safety in autonomous and assisted driving Principal investigators: Ken Goldberg UC Berkeley Anca Dragan UC Berkeley Trevor Darrell UC Berkeley Francesco Borrelli UC Berkeley Ruzena Bajcsy UC Berkeley Source: Ken Goldberg, CPAR, People and Robotics Initiative
  • 32.
    Ikhlaq Sidhu, contentauthorSource: Ken Goldberg, CPAR, People and Robotics Initiative Safety in Human-Robot Interaction: Guarantees and Verification Safety-constrained motion planning for efficiency in factory human-robot interaction Learning and prediction for safety in HRI Provably safe human-centric autonomy Masayoshi Tomizuka UC Berkeley Principal investigators: Claire Tomlin UC Berkeley Francesco Borrelli UC Berkeley
  • 33.
    Ikhlaq Sidhu, contentauthor • Large-scale machine learning - amounts of data • Deep learning - recognition, classification • Reinforcement learning - time sequence, aided by Neural Networks • Robotics - beyond navigation, to safe interaction • Computer vision - most prominent perception, better than human • Natural Language Processing - interacting with people/dialog • Collaborative systems - autonomous systems w/people + machines using complimentary functions • Crowdsourcing and human computation – harness human intelligence, uses other AI, vision, ML, NLP, … • Algorithmic game theory and computational social choice – systems using social computing, incentives, prediction markets, game theory, peer prediction, scoring rules, no regret learning • Internet of Things (IoT) – using AI to unravel sensory information, interfaces, and protocols • Neuromorphic Computing – new computing fabrics based on biological models http://ai100.stanford.edu/2016-report New Data/AI Systems Most Common Data/AI Research Trends in 2017
  • 34.
    Ikhlaq Sidhu, contentauthor Unsupervised Image to Image [CycleGAN: Zhu, Park, Isola & Efros, 2017] Pieter Abbeel -- UC Berkeley | Gradescope | Covariant.AI Generative Models n “What I cannot create, I do not understand.” n Ability to generate data that look real entails some form of understanding [Radford, Metz & Chintala, ICLR 2016] Typical CNN converts image to output vector of features
  • 35.
  • 36.
    Ikhlaq Sidhu, contentauthor Large Investment and Valuations Some Very High Valuations Pieter Abbeel -- UC Berkeley | Gradescope | Covariant.AI Big Investments in China Pieter Abbeel -- UC Berkeley | Gradescope | Covariant.AI Big Investments in China Pieter Abbeel -- UC Berkeley | Gradescope Recent Headlines Pieter Abbeel -- UC B Recent Headlines [Macron (France), March 2018] Pieter Abbeel -- UC Berkeley | Gradescope | Covariant.AI
  • 37.
    Ikhlaq Sidhu, contentauthor Contact: Ikhlaq Sidhu Founding Faculty Director, Center for Entrepreneurship & Technology IEOR Emerging Area Professor, UC Berkeley sidhu @berkeley.edu, scet.berkeley.edu