McKinsey predicts that AI and robotics will create $50 trillion of value over the next 10 years. Many predict that the recent technology of “deep learning” will be a big part of the transformation. Over 250 deep learning startup companies have attracted more than $1 billion of venture investment in the past year. Deep learning systems have recently broken records in speech recognition, image recognition, image captioning, translation, drug discovery and other tasks. Why is this happening now and how is it likely to play out? We review the development of AI and the pendulum swings between the “neats” and the “scruffies”. We describe traditional approaches to semantics through logics and grammars and the new deep learning vector semantics. We relate this vector semantics to Roger Shepard’s cognitive geometry and the structure of biological networks. We also describe limitations of deep learning for safety and regulation. We show how deep learning fits into the rational agent framework and discuss what the next steps may be.
Exosphere Chile Talk: Semantics, Deep Learning, and the Transformation of Business
Steve Omohundro, Ph.D.
Neats vs. Scruffies
Multi-Billion Dollar Investments
• 2013 Facebook – AI lab, DeepFace
• 2013 Yahoo – LookFlow
• 2013 eBay – AI lab
• 2013 Allen Institute for AI
• 2013 Google – DNNresearch, SCHAFT, Industrial Perception, Redwood Robotics, Meka Robotics, Holomni, Bot & Dolly, Boston
• 2014 IBM – $1 billion in Watson
• 2014 Google – DeepMind $500 million
• 2014 Vicarious – $70 million
• 2014 Microsoft – Project Adam, Cortana
• 2015 Fanuc – Machine Learning for Robotics
• 2015 Toyota – $1 billion AI and Robotics Lab, Silicon Valley
McKinsey: $50 Trillion to 2025
AI Knowledge Work: $25 Trillion to 2025
Marketing, ERP, Big Data, Smart Assistants
Internet of Things: $15 Trillion to 2025
100 Billion devices by 2025
Cars, Appliances, Cameras, Meters, Wearables, etc.
Robot Manufacturing: $10 Trillion to 2025
Work 24 hours/day
No breaks, food, medical
Don’t quit, get bored, get depressed
Don’t leak secrets
Work well with others
Easy to replicate
Foxconn Technology Group
• World’s largest contract
• Assembles 40% of all consumer
• iPhone, iPad, Kindle, Xbox, PlayStation 4, etc.
• 1.3 million employees, $8K
• Employee suicides
• “Foxbot” robots, cost $25K, 2nd
• Building 30K robots/year
March 2015: China Brain
Robin Li Yanhong, CEO of Baidu, proposed a state-level Chinese initiative to develop AI “comparable to the Apollo space programme”.
Health Care: $10 Trillion to 2025
Robot surgery, medical records, AI diagnosis
Self-Driving Vehicles: $10 Trillion by 2025
Disrupt Dealers, Insurance, Parking, Finance, Trucking, Taxis
10 million jobs
World’s largest job creator: 50,000 per month
Uber valuation: $51 billion, 20% of fares
Center for research on self-driving cars
36 second wait, $.50/mile, 100% of fares
3D Printing: $2 Trillion by 2025
April 2014: Chinese WinSun 3D printed 10 houses, 2100 sq ft, $4800
WinSun 3D printed 12,000 sq ft villa
US Building construction: $1 Trillion/yr
5.8 million employees
“Neats” vs. “Scruffies”
1963: John McCarthy
Stanford AI Lab
Thinking = Logical Inference
1963: Marvin Minsky
MIT MAC AI Group
1957: Rosenblatt’s “Perceptron”
“The embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.”
1957 Chomsky Grammar
1970 Montague Semantics
Richard Montague, "English as a Formal Language". In: Bruno Visentini (ed.): Linguaggi nella società e nella tecnica. Milan 1970, 189–223.
Linguistic Rules are Complicated!
2006: Simple n-gram models with lots of data beat complicated hand-built linguistic models!
2009: And data is cheap
Much cheaper than linguists or programmers!
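As a rough illustration of what a "simple n-gram model" is, the sketch below builds a bigram model (n = 2) by counting word pairs. The tiny corpus and the function names `train_bigrams` and `next_word_probability` are assumptions made for this example, not anything from the talk; real systems use far more data and add smoothing.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of a sentence.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probability(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if prev was never seen."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

# Toy corpus (an assumption for illustration only).
corpus = [
    "the king ate his lunch",
    "the queen ate her lunch",
    "the man ate his lunch",
]
model = train_bigrams(corpus)
print(next_word_probability(model, "ate", "his"))
```

There are no linguistic rules here at all: the model is just a table of counts, which is why adding data is the main way to improve it.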
1962: Roger Shepard Cognitive Geometry
Word2Vec – Mikolov 2013
• Distributional Semantics – Firth 1957
• Represent words by vectors
• Close vectors represent similar contexts
• Certain relations represented by translation:
King – Man + Woman = Queen
• Also tense, temperature, location, plurals,…
Why? Same context shift for all male -> female
The man ate his lunch.
The king ate his lunch.
The woman ate her lunch.
The queen ate her lunch.
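The "same context shift" idea can be checked directly on the four sentences above. This is a minimal sketch using raw context counts, not the Word2Vec training procedure; the helper names `context_counts` and `shift` and the window size are assumptions for illustration.

```python
from collections import Counter

sentences = [
    "the man ate his lunch",
    "the king ate his lunch",
    "the woman ate her lunch",
    "the queen ate her lunch",
]

def context_counts(target, sentences, window=2):
    """Count the words that appear within `window` positions of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            if token == target:
                lo, hi = max(0, i - window), i + window + 1
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
                )
    return counts

def shift(a, b):
    """Difference in context counts going from word a to word b (nonzero entries only)."""
    ca, cb = context_counts(a, sentences), context_counts(b, sentences)
    return {w: cb[w] - ca[w] for w in set(ca) | set(cb) if cb[w] != ca[w]}

print(shift("man", "woman"))
print(shift("king", "queen"))
```

Both shifts come out the same: one fewer "his" and one more "her". That shared difference is what shows up as a common translation vector in a learned embedding.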
More Semantic Relations
• Paris – France + Italy = Rome
• Human – Animal = Ethics
• Obama – USA + Russia = Putin
• Library – Books = Hall
• Biggest – Big + Small = Smallest
• Ethical – Possibly + Impossibly = Unethical
• Picasso – Einstein + Scientist = Painter
• Forearm – Leg + Knee = Elbow
• Architect – Building + Software = Programmer
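Relations like these are usually found by computing a − b + c and returning the nearest remaining word by cosine similarity. The sketch below shows that lookup on hand-made three-dimensional toy vectors; the vectors, the vocabulary, and the `analogy` helper are assumptions for illustration, not learned Word2Vec embeddings.

```python
import math

# Hand-made toy vectors (dimensions: royalty, maleness, femaleness).
# These are NOT learned embeddings; they only illustrate the arithmetic.
vectors = {
    "king":  [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man":   [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "lunch": [0.1, 0.1, 0.1],
}

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c):
    """Return the word closest to a - b + c, excluding the three input words."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

print(analogy("king", "man", "woman"))  # prints "queen" for these toy vectors
```

Excluding the input words from the candidates matters in practice, since the nearest vector to a − b + c is often one of the inputs themselves.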
Deep Neural Net Face Recognition
Google FaceNet, June 2015
Record accuracy 99.63% on Labeled Faces in the Wild dataset
Cuts best previous error rate by 30%
22 layer feedforward net, 140M weights, 1.6 GFLOP/image, conv/pool/norm
Trained on triples pushing same faces together, different apart
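Training on triples is commonly written as a triplet loss: for an anchor face, a matching face, and a non-matching face, the loss is zero once the non-match is farther away than the match by some margin. A minimal sketch on plain Python lists follows; the two-dimensional embeddings and the margin value are assumptions for illustration, not the trained network.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin).

    The loss is zero when the negative is at least `margin` farther
    from the anchor than the positive is (in squared distance).
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D embeddings (assumed values, for illustration only).
anchor = [0.0, 0.0]
same_person = [0.1, 0.0]
far_stranger = [1.0, 0.0]
near_stranger = [0.2, 0.0]

print(triplet_loss(anchor, same_person, far_stranger))   # already separated: zero loss
print(triplet_loss(anchor, same_person, near_stranger))  # too close: positive loss
```

Minimizing this loss over many triples pushes embeddings of the same face together and embeddings of different faces apart.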
CMU OpenFace, Oct. 2015
Open Source version of FaceNet
84.83% accuracy, <.1 training faces
Brin’s “Transparent Society”
$3.20 on Alibaba, $2.95 on eBay
Recurrent Net Hallucinates C Code
Karpathy: 464MB of C code, 3 layer LSTM, 10 million parameters
The rat escaped.
The rat the cat attacked escaped.
The rat the cat the dog chased attacked escaped.
NeuralTalk and Walk Demo
DeepMind Deep-Q Networks
http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html
49 Atari 2600 Games
Same net all games
Beat previous AIs
Beat humans on half
Beat AIs from pixels
100s of games
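The learning rule underneath a Deep-Q Network is the Q-learning update. The sketch below is the tabular version, with a dictionary standing in for the network; the deep variant instead estimates the values with a neural net that reads screen pixels. The state names, action names, and the `alpha` and `gamma` values are assumptions for illustration.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """One Q-learning step.

    Moves Q(state, action) toward reward + gamma * max_a Q(next_state, a).
    Unseen (state, action) pairs are treated as having value 0.0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]

# Hypothetical two-action game.
actions = ["left", "right"]
q = {}
print(q_update(q, "s0", "right", 1.0, "s1", actions))  # reward moves the estimate up
print(q_update(q, "s2", "right", 0.0, "s0", actions))  # value propagates back from s0
```

The second call gets a nonzero value with zero reward, because the estimate for "s0" is already positive. That backward propagation of value is how the agent learns which early moves lead to later points.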
Deep Learning Has Blindspots
• Typically have problems to solve rather than
• Want confidence that system solves problem
• Want confidence in no unintended behaviors
• Systems often have to obey legal, corporate,
or design constraints
Technology Needs Semantics!
• Analyzing camera, sensor, weather data
• Better search, question answering, info
• Analysis and optimization of business processes
• Health monitoring, medical diagnosis
• Financial markets trading, stabilization
• Autonomous cars, trucks, boats, subs, planes
• Pollution monitoring and cleanup
• Improved robotic manufacturing
• Software and Hardware design
Approaches to Semantics
• Montague – map into Typed Lambda Calculus
• Denotational – map into CS Domains
• Mathematical – map into Set Theory
• Categorical – map into Category Theory
• Distributional – Statistics of Contexts
Representation, Encoding, Learning,
New Possibilities Coming Soon!