Andre Freitas
Department of Computer Science
AI Beyond Deep Learning
AI Systems Lab
Talk@Peak Ensemble, October 2019
This Talk
• Articulate a more comprehensive representation-
based perspective of AI (NLP-centered).
• Complements a Deep Learning based
perspective.
• Targeting the AI Engineer.
• How:
– Pointers to existing works from the AI Systems lab
linking to major trends.
Who are we?
• Established in September 2018.
• Applied, systems-oriented research.
– Fuzziness between science and engineering.
• Service to areas outside computer science.
– Bio, engineering, law.
• NLP as epistemological engineering.
Current Limits of Deep Learning
Deep learning thus far:
1. Is data hungry
2. Is shallow and has limited capacity for transfer
3. Has no natural way to deal with hierarchical structure
4. Is not sufficiently transparent
5. Has not been well integrated with prior knowledge
6. Cannot inherently distinguish causation from correlation
7. Presumes a largely stable world, in ways that may be problematic
8. Works well as an approximation, but its answers often cannot be fully trusted
9. Is difficult to engineer with
Marcus, Gary. "Deep Learning: A Critical Appraisal." arXiv preprint arXiv:1801.00631 (2018).
AI Users
Common Requirements
(End-users & AI Developers)
• Data is available but it is unstructured/messy. How can I use it to
solve my problem (answering, predicting, discovering)?
• How can I develop systems which generalise from fewer
examples?
• How can I be sure that the system got it right? How do I verify it?
• I built a model for data/population X. How does it transport to
data/population Y?
• Which architecture, representation and tools should I use to build
my AI system?
The same requirements, mapped to research themes:
• Extracting and Representing Meaning
• Few-shot / Small Data Learning
• Explainability
• Transportability
• AI Engineering, Multi-component AI Systems
Outline
• Extracting and Representing Meaning
– (from Text, at Scale)
– Few-shot Learning
– Explainability
– Transportability
• Explainable Reasoning (From Deep Learning to Deep
Semantics)
– Neuro-symbolic models
• NLP Engineering
– Building Multi-Component NLP Systems
Extracting and Representing
Meaning (from Text, at Scale)
Message #1:
A little linguistics goes a
long way.
The Challenge of Text Interpretation
Where to start?
“A fluoroscopic study known as an upper gastrointestinal series is typically
the next step in management, although if volvulus is suspected, caution
with non water soluble contrast is mandatory as the usage of barium can
impede surgical revision and lead to increased post operative
complications.”
?
Sentence Splitting
A fluoroscopic study is
typically the next step in
management.
This fluoroscopic study
is known as an upper
gastrointestinal series.
Caution with non water soluble
contrast is mandatory.
The usage of barium can
impede surgical revision.
Volvulus is suspected.
The usage of barium can
lead to increased post
operative complications.
[Discourse tree: core and context units linked by the rhetorical relations ELABORATION, LIST, BACKGROUND, CONTRAST and CONDITION.]
In a Nutshell
• Transformation of complex sentences into a set of
hierarchically ordered and semantically interconnected
sentences that present a simplified syntax:
○ minimal semantic units
○ semantic hierarchy
• Manually curated rules over syntactic and lexical patterns
Transforming Complex Sentences into a Semantic Hierarchy, ACL 2019.
https://github.com/Lambda-3/DiscourseSimplification
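The output of such a transformation can be pictured as a small tree of minimal units linked by rhetorical relations. A minimal sketch of such a data structure, rebuilding part of the running example (hypothetical class names, not the DiscourseSimplification API):

```python
from dataclasses import dataclass, field

@dataclass
class Unit:
    """A minimal semantic unit produced by sentence splitting."""
    text: str
    children: list = field(default_factory=list)  # (relation, Unit) pairs

    def attach(self, relation, unit):
        self.children.append((relation, unit))
        return self

# Rebuilding part of the running example.
core = Unit("A fluoroscopic study is typically the next step in management.")
core.attach("ELABORATION", Unit(
    "This fluoroscopic study is known as an upper gastrointestinal series."))
caution = Unit("Caution with non water soluble contrast is mandatory.")
caution.attach("CONDITION", Unit("Volvulus is suspected."))
core.attach("CONTRAST", caution)

def flatten(unit, depth=0):
    """Linearise the hierarchy for inspection."""
    lines = ["  " * depth + unit.text]
    for rel, child in unit.children:
        lines.append("  " * (depth + 1) + "[" + rel + "]")
        lines += flatten(child, depth + 1)
    return lines

print("\n".join(flatten(core)))
```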
Knowledge Graph Extraction
https://github.com/Lambda-3/Graphene
A Sentence Simplification System for Improving Relation Extraction, COLING
(2017)
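Simplification is what makes downstream relation extraction tractable: over short, single-clause sentences even trivial patterns fire reliably. A toy pattern-based extractor (illustrative only; Graphene itself works over syntactic parses, not regexes):

```python
import re

# Naive patterns over already-simplified sentences.
PATTERNS = [
    re.compile(r"^(?P<s>.+?) is known as (?P<o>.+?)\.$"),
    re.compile(r"^(?P<s>.+?) can (?P<v>impede|lead to) (?P<o>.+?)\.$"),
]

def extract(sentence):
    """Return a (subject, relation, object) triple, or None."""
    for pat in PATTERNS:
        m = pat.match(sentence)
        if m:
            d = m.groupdict()
            return (d["s"], d.get("v", "is known as"), d["o"])
    return None

print(extract("The usage of barium can impede surgical revision."))
```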
Extracting Knowledge Graphs
from Text
Argumentation KGs
[Proof graph: nodes typed Assumption, Fact, ProofStep, ProofStart and ProofEnd, linked by entails, contradicts, has-fact, searches and KB-lookup edges, over facts such as "Q and D are positive integers", "Q and D are relatively prime" and "Q² is a square, so the exponents of all its prime factors must be even".]
Mathematical Knowledge Graphs
Crossing Modalities
Visual Genome
Message #2:
Build your knowledge graph
one semantic phenomenon
at a time.
Vector Representations
of Meaning (Embeddings)
[Diagram: word vectors for car, dog, cat, bark, run and leash, with the angle θ between vectors measuring relatedness.]
Commonsense and semantic approximation live in this space: a semantic model with low acquisition effort.
https://colah.github.io/posts/2015-01-Visualizing-Representations/
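The geometric intuition, that related words point in similar directions, reduces to cosine similarity between vectors. A toy sketch with made-up 3-d vectors (illustrative only; trained embeddings have hundreds of dimensions):

```python
import math

# Hypothetical 3-d "embeddings"; trained models use hundreds of dimensions.
vec = {
    "dog": (0.9, 0.8, 0.1),
    "cat": (0.8, 0.9, 0.1),
    "car": (0.1, 0.2, 0.9),
}

def cosine(a, b):
    """Cosine of the angle θ between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = lambda v: math.sqrt(sum(x * x for x in v))
    return dot / (norm(a) * norm(b))

# Related words point in similar directions.
assert cosine(vec["dog"], vec["cat"]) > cosine(vec["dog"], vec["car"])
```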
BERT (Devlin et al. 2018)
• Transformer: attention mechanism that learns contextual
relations between tokens/affixes in text.
• Word and Sentence-level representation.
https://towardsdatascience.com/bert-explained-state-of-the-art-language-model-for-nlp-f8b21a9b6270
Neuro-KGs
[Example: Barack Obama (nominated) Sonia Sotomayor (:is_a) "First Supreme Court Justice of Hispanic descent", with graph elements grounded in embedding spaces (W2V, GloVe, BERT, …).]
Distributional Relational Networks, AAAI Symposium (2013).
A Compositional-Distributional Semantic Model for Searching Complex Entity
Categories, ACL *SEM (2016)
Natural Language Queries over Heterogeneous Linked Data Graphs:
A Distributional-Compositional Semantics Approach, IUI (2014).
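The payoff of pairing a KG with embeddings shows up when a query relation has no exact match among the edge labels: the symbolic lookup falls back to embedding similarity. A minimal sketch with toy vectors and a hypothetical two-edge graph (not the StarGraph implementation):

```python
import math

# Toy KG and hypothetical relation embeddings (real systems would use
# W2V/GloVe/BERT vectors over a much larger graph).
kg = [
    ("Barack Obama", "nominated", "Sonia Sotomayor"),
    ("Barack Obama", "born_in", "Honolulu"),
]
emb = {
    "nominated": (0.9, 0.1),
    "born_in":   (0.1, 0.8),
    "appointed": (0.85, 0.2),
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def query(subject, relation_word):
    """Answer a query whose relation has no exact KG match by
    picking the edge whose label embedding is closest."""
    best = max((t for t in kg if t[0] == subject),
               key=lambda t: cosine(emb[t[1]], emb[relation_word]))
    return best[2]

# "Who did Obama appoint?": no edge is literally named "appointed",
# but "nominated" is the semantically closest relation.
print(query("Barack Obama", "appointed"))
```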
Summary
• KG Models (Fine-grained):
– Facilitate querying, integration and explainability.
• Embedding Models (Coarse-grained):
– Support semantic approximation, coping with vocabulary variation.
• The two have complementary semantic value.
Summary
• Building Knowledge Graphs:
– Text Simplification, Contextual Open IE.
– Discourse relations (argumentation, stories, pragmatics).
– Entity, relation and event extraction.
– Co-reference resolution, entity linking.
Message #3:
KGs + Embeddings are a
representation (data model)
convenient for AI
engineering.
Data Management for AI
Query Processing, Indexing, Caching …
• Structured queries (Database + IR): inverted indexes, sharding, disk-access optimisation, query planning, cardinality estimation, skyline queries, bitmap indexes, …
• Approximation queries: Multiple Randomized K-d Tree algorithm, Priority Search K-Means Tree algorithm, …
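On the structured-query side, the workhorse is the inverted index: a map from each term to the statements that mention it, with posting lists intersected at query time. A minimal sketch over a toy triple store (illustrative, not the DBee implementation):

```python
from collections import defaultdict

triples = [
    ("barium", "impedes", "surgical revision"),
    ("barium", "causes", "complications"),
    ("contrast", "requires", "caution"),
]

# Inverted index: token -> set of triple ids mentioning it.
index = defaultdict(set)
for i, t in enumerate(triples):
    for term in t:
        for tok in term.split():
            index[tok].add(i)

def lookup(*terms):
    """Intersect posting lists to answer a conjunctive query."""
    ids = set.intersection(*(index[t] for t in terms))
    return [triples[i] for i in sorted(ids)]

print(lookup("barium", "complications"))
```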
DBee
DBee: A Database for Creating and Managing Knowledge Graphs and
Embeddings, TextGraphs@EMNLP (2019).
Software
https://github.com/Lambda-3/Stargraph
Natural Language Queries over Heterogeneous Linked Data Graphs: A
Distributional-Compositional Semantics Approach, IUI (2014).
Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual Semantic Relatedness using Machine Translation, EKAW (2016).
https://github.com/Lambda-3/indra
Message #4:
Processing KGs + Embeddings at scale requires specific data management strategies.
Explainable Reasoning
(From Deep Learning to Deep Semantics)
Neural vs Symbolic AI Systems
Graph Networks (GNs)
• Deep learning methods often:
– Follow an end-to-end design philosophy.
– Emphasize minimal a priori representational and computational assumptions.
– Seek to avoid explicit structures.
• GNs as a solution: project embeddings and graphs into the same representation space.
• Abstract away from fine-grained differences and capture more general commonalities.
Multi-hop QA & Document
Graph Networks
Document Graph Networks (DGNs)
DGN = Gated Graph Neural Networks (GGNN) over Entity/Passage KGs.
Identifying Supporting Facts for Multi-hop Question Answering with
Document Graph Networks, TextGraphs@EMNLP
T: IBM cleared $18.2 billion in the first quarter.
H: IBM’s revenue in the first quarter was $18.2 billion.
Entailment?
YES
Why?
- To clear is to yield as a net profit
- A net profit is an excess of revenues
over outlays in a given period of time
Exploring Knowledge Graphs in an Interpretable Composite Approach for Text Entailment, AAAI
(2019).
Recognizing and Justifying Text Entailment through Distributional Navigation on Definition Graphs,
AAAI (2018).
Explainable Reasoning using
Distributional Navigation (DNA)
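The justification above can be read as a path through a graph of dictionary definitions connecting the divergent terms ("cleared" vs. "revenue"). A toy breadth-first sketch over hand-written definition links (illustrative only, not the published DNA system):

```python
from collections import deque

# Hypothetical definition graph: word -> terms appearing in its definition.
defs = {
    "clear": ["yield", "net profit"],
    "net profit": ["excess", "revenue", "outlays"],
}

def justify(premise_term, hypothesis_term):
    """BFS over definitions, returning the chain linking the two terms."""
    queue = deque([(premise_term, [premise_term])])
    seen = {premise_term}
    while queue:
        word, path = queue.popleft()
        if hypothesis_term in word:
            return path
        for nxt in defs.get(word, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None

# Why does "IBM cleared $18.2bn" entail "IBM's revenue was $18.2bn"?
print(justify("clear", "revenue"))
```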
Explainable Science QA
Q: A student climbs up a rocky mountain trail in Maine. She sees many small pieces of rock
on the path. Which action most likely made the small pieces of rock?
[0]: sand blowing into cracks
[1]: leaves pressing down tightly
[2]: ice breaking large rocks apart
[3]: shells and bones sticking together
A: ice breaking large rocks apart
Explanation:
• weathering means breaking down (rocks ; surface materials) from larger whole into smaller
pieces by weather
• ice wedging is a kind of mechanical weathering
• ice wedging is when ice causes rocks to crack by expanding in openings
• cycles of freezing and thawing water cause ice wedging
• cracking something may cause that something to break apart
• to cause means to make
Jansen, Peter A., et al. "Worldtree: A corpus of explanation graphs for elementary science questions supporting multi-hop inference."
arXiv preprint arXiv:1802.03052 (2018).
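Assembling such an explanation amounts to chaining facts that share content words, from a question term to the answer. A greedy lexical-overlap sketch over three of the facts above (illustrative only; WorldTree inference is far richer):

```python
facts = [
    "ice wedging is a kind of mechanical weathering",
    "ice wedging is when ice causes rocks to crack",
    "cracking something may cause that something to break apart",
]

STOP = {"is", "a", "of", "the", "when", "that", "may", "to", "in", "its"}

def overlap(a, b):
    """Content words shared by two facts."""
    return (set(a.split()) & set(b.split())) - STOP

def chain(question_term, answer_term, facts):
    """Greedily link facts by lexical overlap, from question to answer."""
    path = [next(f for f in facts if question_term in f)]
    pool = [f for f in facts if f != path[0]]
    while answer_term not in path[-1] and pool:
        nxt = max(pool, key=lambda f: len(overlap(f, path[-1])))
        path.append(nxt)
        pool.remove(nxt)
    return path

print(chain("ice", "break apart", facts))
```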
Message #5:
Learning + Inference on
Neuro-KG models facilitate
explainability and
generalisation.
Engineering Multi-Component
NLP Systems
[Architecture diagram: an NL query enters via querying components (Semantic Parsing, Query By Example); extraction and linking components (Open IE, NER, Argument, Definition Extraction, Opinion, Story, Text Classification, Entity Linking, Co-reference Resolution) feed an inference & integration layer (KG Completion, NL Inference) built over embeddings, returning answers and explanations.]
Systems-level Representation & Evaluation
Neuro-symbolic Representation
Measuring diagram quality through semiotic morphisms, Semiotica (2019).
The Diagrammatic AI Language (DIAL): Version 0.1 (2019).
Switching contexts: Transportability measures for NLP (under review).
Evaluating Explanations &
Interpretability
On the Semantic Interpretability of Artificial Intelligence Models (under review).
An Explainable Few-Shot Learning Semantic Parser for Natural Language Programming
(under review).
Summary
• AI Engineering still has a long way to go. Open questions:
• How to represent AI systems at an architectural level (so
that we facilitate systems design, reasoning, learning and
communication)?
• How do we measure and estimate performance and cost
in AI systems for new domains and languages?
• How to measure explainability and transportability?
Take-away Message
Final Considerations
• AI Engineers as representation and algorithm
designers.
• Working with NLP?
– Understanding language/semantics precedes ML.
• Critical roles:
– AI Architect
– Data/Representation/Resources/Experiment/Workflow Curator
– DBA4AI
– AI Evaluator and Tester
– Data Engineer
Take-away Message
• Many meaningful AI tasks require large-scale background
knowledge.
• Knowledge graphs + Embeddings provide a realistic and
scalable representation model.
• Neuro-Symbolic models (GNs, DNA) operate on top of these representations, enabling:
– Few-shot learning
– Explainability
– Transportability/Transferability
References
http://andrefreitas.org
andre.freitas@manchester.ac.uk
These slides:
https://www.slideshare.net/andrenfreitas/ai-beyond-deep-learning
