Grakn is a knowledge graph for building intelligent systems. It provides a flexible schema for modeling complex domains and automated logical inference. Grakn also features a distributed analytics language for large-scale analytics. The knowledge graph foundation enables developments in areas like finance, life sciences, and more by providing a unified representation of knowledge.
1. THE KNOWLEDGE GRAPH
Join our community at grakn.ai/community
The Knowledge Graph
for Intelligent Systems
By Haikal Pribadi
Founder and CEO of GRAKN.AI
@graknlabs
@haikalpribadi
2. Follow us @GraknLabs
A BRIEF HISTORY OF DATABASES
[Timeline figure, 1960 to 2030. Technologies: punch cards & tapes. Applications: record keeping.]
3. Follow us @GraknLabs
A BRIEF HISTORY OF DATABASES
[Timeline figure, 1960 to 2030. Technologies: punch cards & tapes, navigational databases. Applications: record keeping. Challenges: scale.]
4. Follow us @GraknLabs
A BRIEF HISTORY OF DATABASES
[Timeline figure, 1960 to 2030. Technologies: punch cards & tapes, navigational databases. Applications: record keeping, business intelligence (BI). Challenges: scale.]
“Wouldn’t it be nice if you could express the question at a higher level and let the system figure out how to do the navigation?”
Edgar F. Codd, Inventor of Relational Databases
5. Follow us @GraknLabs
RELATIONAL DB WAS INVENTED TO SOLVE COMPLEXITY
[Timeline figure, 1960 to 2030. Technologies: punch cards & tapes, navigational databases, relational/SQL databases. Applications: record keeping, business intelligence (BI). Challenges: scale, complexity.]
6. Follow us @GraknLabs
A BRIEF HISTORY OF DATABASES
[Timeline figure, 1960 to 2030. Technologies: punch cards & tapes, navigational databases, relational/SQL databases. Applications: record keeping, business intelligence (BI), web applications. Challenges: scale, complexity.]
7. Follow us @GraknLabs
A BRIEF HISTORY OF DATABASES
[Timeline figure, 1960 to 2030. Technologies: punch cards & tapes, navigational databases, relational/SQL databases, NoSQL & NewSQL databases. Applications: record keeping, business intelligence (BI), web applications. Challenges: scale, complexity, scale.]
8. Follow us @GraknLabs
A BRIEF HISTORY OF DATABASES
[Timeline figure, 1960 to 2030. Technologies: punch cards & tapes, navigational databases, relational/SQL databases, NoSQL & NewSQL databases. Applications: record keeping, business intelligence (BI), web applications, artificial intelligence (AI). Challenges: scale, complexity, scale.]
9. Follow us @GraknLabs
INTELLIGENT SYSTEMS PROCESS DATA THAT IS TOO COMPLEX FOR CURRENT DATABASES
[Timeline figure, 1960 to 2030. Technologies: punch cards & tapes, navigational databases, relational/SQL databases, NoSQL & NewSQL databases, followed by a question mark. Applications: record keeping, business intelligence (BI), web applications, intelligent systems. Challenges: scale, complexity, scale, complexity.]
10. Follow us @GraknLabs
WHAT RELATIONAL DID FOR BI IS WHAT GRAKN WILL DO FOR AI
[Timeline figure, 1960 to 2030. Technologies: punch cards & tapes, navigational databases, relational/SQL databases, NoSQL & NewSQL databases. Applications: record keeping, business intelligence (BI), web applications, artificial intelligence (AI). Challenges: scale, complexity, scale, complexity.]
11. Follow us @GraknLabs
What is the problem with complex data?
Too complex to model: current modelling techniques are based only on binary relationships. Result: could not model complex domains.
Too complex to query: current languages only allow you to query for explicitly stored data. Result: could not simplify verbose queries.
Too expensive analytics: distributed algorithms (BSP) are expensive and not reusable. Result: could not reuse analytics algorithms.
DB QLs are too low-level: no strong abstraction over low-level constructs and complex relationships. Result: difficult to work with complex data.
12. Follow us @GraknLabs
GRAKN.AI: the knowledge base foundation for intelligent systems
i.e. GRAKN.AI is a knowledge graph
Knowledge Storage System: novel knowledge representation system based on hypergraph theory
Knowledge Inference: OLTP reasoning engine
Knowledge Analytics: OLAP distributed analytics
13. Follow us @GraknLabs
What is a knowledge graph?
Knowledge Schema: flexible Entity-Relationship concept-level schema to build knowledge models. Result: model complex domains.
Logical Inference: automated deductive reasoning of data points during runtime (OLTP). Result: derive implicit facts & simplification.
Distributed Analytics: automated distributed algorithms (BSP) as a language (OLAP). Result: automated large-scale analytics.
Higher-Level Language: strong abstraction over low-level constructs and complex relationships. Result: easier to work with complex data.
14. Follow us @GraknLabs
THE KNOWLEDGE SCHEMA
A knowledge base needs to be able to model the real world and all the
type hierarchies, hyper-relationships and rules contained within it.
15. Follow us @GraknLabs
Schema Example: Basic Model
[Schema diagram: Person plays Employee and Company plays Employer; Employment relates Employee and Employer; Person and Company each have a Name.]
16. Follow us @GraknLabs
Schema Example: Type-Hierarchy
[Schema diagram: Customer is a sub of Person and Startup is a sub of Company. Employment relates Employee and Employer, with plays edges from the entity types to those roles; Person and Company each have a Name.]
17. Follow us @GraknLabs
Schema Example: Type-Hierarchy
[Schema diagram: Customer is a sub of Person and Startup is a sub of Company. Employment relates Employee and Employer; Marriage relates Husband and Wife. Person plays Employee, Husband and Wife; Company plays Employer. Person and Company each have a Name.]
18. Follow us @GraknLabs
Valid Data Insertion
[Instance diagram: Alice, Bob, IBM and Grakn, with the type labels customer, person and startup. A marriage (mar) links a wife and a husband; two employments (emp) each link an employee and an employer.]
✓ Write commit success
19. Follow us @GraknLabs
Invalid Data Insertions: [Intelligent] Schema Constraints Are Back!
[Instance diagram: a marriage (mar) between Charlie (person, husband) and Apple (company, wife).]
Write commit fails: invalid relationship
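The commit check on the two insertion slides can be sketched in a few lines of Python. This is only an illustration of the idea, not Grakn's implementation: the dictionaries, the function names `roles_of` and `validate`, and the assignment of the types customer, person and startup to Alice, Bob and Grakn are all assumptions made for the example.

```python
# Hypothetical in-memory schema mirroring the slides: subtypes, the roles
# each entity type may play, and the roles each relationship type relates.
SUPERTYPE = {"customer": "person", "startup": "company"}
PLAYS = {
    "person": {"employee", "husband", "wife"},
    "company": {"employer"},
}
RELATES = {
    "employment": {"employee", "employer"},
    "marriage": {"husband", "wife"},
}


def roles_of(entity_type):
    """Collect the roles a type may play, including inherited ones."""
    roles = set()
    current = entity_type
    while current is not None:
        roles |= PLAYS.get(current, set())
        current = SUPERTYPE.get(current)
    return roles


def validate(relationship_type, role_players, types):
    """Return True if every role player is allowed by the schema.

    role_players maps a role name to an instance name;
    types maps an instance name to its entity type.
    """
    allowed_roles = RELATES[relationship_type]
    for role, player in role_players.items():
        if role not in allowed_roles:
            return False
        if role not in roles_of(types[player]):
            return False
    return True


# Assumed instance types; the slide does not say which label belongs to whom.
types = {"Alice": "customer", "Bob": "person", "Grakn": "startup",
         "Charlie": "person", "Apple": "company"}

ok = validate("marriage", {"wife": "Alice", "husband": "Bob"}, types)
ok_emp = validate("employment", {"employee": "Alice", "employer": "Grakn"}, types)
bad = validate("marriage", {"husband": "Charlie", "wife": "Apple"}, types)
print(ok, ok_emp, bad)  # True True False
```

The subtype lookup is what lets a customer marry and a startup employ without either type declaring those roles itself, while a company in the wife role is rejected.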
20. Follow us @GraknLabs
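Hyper-Relationship Example: Nested-Relationship
[Instance diagram: a marriage (mar) between Alice (person, wife) and Bob (person, husband) itself plays a role in a location relationship (loc, roles: located and locating) with Austin (City); a date attribute, 07/01/2017, is attached via has.]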
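A nested relationship simply means that a role player can itself be a relationship. The Python sketch below models that with two small classes. The class names, the `players` helper, and the choice to attach the date to the marriage rather than to the location are assumptions for illustration, not Grakn's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    type: str


@dataclass(frozen=True)
class Relationship:
    type: str
    # Role players are (role, player) pairs. A player may be an Entity or
    # another Relationship, which is what makes nesting possible.
    role_players: tuple
    attributes: tuple = ()

    def players(self, role):
        """Return every player of the given role in this relationship."""
        return [p for r, p in self.role_players if r == role]


alice = Entity("Alice", "person")
bob = Entity("Bob", "person")
austin = Entity("Austin", "city")

marriage = Relationship(
    "marriage",
    (("wife", alice), ("husband", bob)),
    attributes=(("date", "07/01/2017"),),  # assumed to belong to the marriage
)
# The marriage itself is the "located" role player of the location.
location = Relationship(
    "location",
    (("located", marriage), ("locating", austin)),
)

located = location.players("located")[0]
print(located.type, [p.name for _, p in located.role_players])
```

A model restricted to binary edges between entities would need an extra node to stand in for the marriage; here the relationship is addressed directly.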
22. Follow us @GraknLabs
Rule Example: Transitive Relationship
[Instance diagram: Kings Cross (ward), London (city) and UK (country), connected by three loc relationships with the roles located and locating.]
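One plausible reading of the diagram is that two location facts are stored (Kings Cross in London, London in the UK) and a rule derives the third. A minimal Python sketch of such a transitive rule follows; the function name `infer_transitive` and the pair representation are assumptions, and a real reasoner would derive the fact at query time rather than materialise it like this.

```python
def infer_transitive(facts):
    """Return the facts plus every pair implied by transitivity.

    Each fact is a (located, locating) pair meaning "located is in locating".
    """
    inferred = set(facts)
    changed = True
    while changed:  # repeat until no new pair can be derived
        changed = False
        for a, b in list(inferred):
            for c, d in list(inferred):
                if b == c and (a, d) not in inferred:
                    inferred.add((a, d))
                    changed = True
    return inferred


explicit = {("Kings Cross", "London"), ("London", "UK")}
derived = infer_transitive(explicit) - explicit
print(derived)  # {('Kings Cross', 'UK')}
```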
23. Follow us @GraknLabs
Rule Example: Simple Business Rule
[Timeline diagram: Schedule A and Schedule B, with the points A Start, B Start, A End, B End in that order.]
24. Follow us @GraknLabs
THE INFERENCE OLTP LANGUAGE
A knowledge-oriented query language should not only be able to
retrieve explicitly stored data, but also implicitly derived information.
25. Follow us @GraknLabs
Complex Query Example
[Instance diagram. People: Alice (full-time employee), Bob (part-time employee), Charlie (temporary employee). Vehicles: AB123 (bus), BC234 (van), CD345 (truck). Places: Kings Cross (ward), London (city), UK (country). Relationships: drive (roles: driver, driven), travel (roles: traveller, destination), loc (roles: located, locating).]
Who are all the drivers that will be arriving in the UK?
The query would be very long and complex in SQL, NoSQL or even Graphs
26. Follow us @GraknLabs
Complex Query Example: Type and Relationship Inference
[Same instance diagram as the previous slide.]
Who are all the drivers that will be arriving in the UK?
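To make the two kinds of inference concrete, here is a small Python sketch that answers the question over the diagram's data. It is an illustration only. The supertype table, which driver drives which vehicle, and which vehicle travels to which place are all assumed, because the flattened diagram does not say how the instances are connected.

```python
# Assumed type hierarchy: the diagram only shows the leaf types.
SUPERTYPE = {
    "full-time-employee": "employee", "part-time-employee": "employee",
    "temporary-employee": "employee", "employee": "person",
    "bus": "vehicle", "van": "vehicle", "truck": "vehicle",
}
TYPE_OF = {
    "Alice": "full-time-employee", "Bob": "part-time-employee",
    "Charlie": "temporary-employee",
    "AB123": "bus", "BC234": "van", "CD345": "truck",
}
# Assumed explicit facts.
DRIVES = {("Alice", "AB123"), ("Bob", "BC234"), ("Charlie", "CD345")}
TRAVELS_TO = {("AB123", "Kings Cross"), ("BC234", "London"), ("CD345", "UK")}
LOCATED_IN = {"Kings Cross": "London", "London": "UK"}


def isa(instance, wanted):
    """Type inference: match the instance's type or any of its supertypes."""
    t = TYPE_OF.get(instance)
    while t is not None:
        if t == wanted:
            return True
        t = SUPERTYPE.get(t)
    return False


def within(place, region):
    """Relationship inference: follow stored location facts transitively.

    Assumes each place is located in at most one parent place.
    """
    while place is not None:
        if place == region:
            return True
        place = LOCATED_IN.get(place)
    return False


def drivers_arriving_in(region):
    return sorted(
        driver
        for driver, vehicle in DRIVES
        if isa(driver, "person") and isa(vehicle, "vehicle")
        for traveller, destination in TRAVELS_TO
        if traveller == vehicle and within(destination, region)
    )


print(drivers_arriving_in("UK"))  # ['Alice', 'Bob', 'Charlie']
```

Without the two helper functions, the query would have to enumerate every employee subtype, every vehicle subtype, and every chain of location facts explicitly.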
27. Follow us @GraknLabs
THE ANALYTICS OLAP LANGUAGE
Large-scale analytics is like teenage sex: everyone talks about it,
nobody really knows how to do it, everyone thinks everyone else is
doing it, so everyone claims they are doing it too.
At the end of the day, very few people know how to code it.
28. Follow us @GraknLabs
Example of a Distributed Analytics Algorithm
Connected Component: a clustering algorithm (pseudocode)
For each vertex V,
    Superstep 1:
        V sends its own id via both outgoing and incoming edges
        V sets its own id as cluster label
    Do superstep n:
        For every received message m of V, compare it to its current cluster label L:
            If m > L, set the label to m;
        If the cluster label has not changed in this superstep, vote to halt;
        Else, send the new cluster label via all edges;
Global operation:
    While not every vertex votes to halt, and n < N, do another superstep n + 1.
An efficient implementation of this algorithm is about 200 lines of code in Java
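The pseudocode can be simulated on a single machine in Python, as a sketch of the superstep logic rather than a distributed implementation. The function name `connected_components` and the parameter `max_supersteps` (standing in for N) are assumptions; a vertex that receives no messages, or whose label does not change, is treated as having voted to halt.

```python
from collections import defaultdict


def connected_components(vertices, edges, max_supersteps=100):
    """Label every vertex with the largest vertex id in its component."""
    # Messages travel along both edge directions, so store edges undirected.
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Superstep 1: each vertex labels itself and sends its id to neighbours.
    label = {v: v for v in vertices}
    inbox = defaultdict(list)
    for v in vertices:
        for n in neighbours[v]:
            inbox[n].append(v)

    superstep = 1
    while inbox and superstep < max_supersteps:
        superstep += 1
        outbox = defaultdict(list)
        for v, messages in inbox.items():
            best = max(messages)
            if best > label[v]:
                label[v] = best
                for n in neighbours[v]:
                    outbox[n].append(best)
            # Otherwise the label is unchanged and v votes to halt.
        inbox = outbox  # an empty outbox means every vertex voted to halt
    return label


labels = connected_components([1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (4, 5)])
print(labels)  # {1: 3, 2: 3, 3: 3, 4: 5, 5: 5, 6: 6}
```

Vertices that end with the same label belong to the same component; the isolated vertex 6 keeps its own id.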
30. Follow us @GraknLabs
Graql Distributed Analytics Queries
And we’ll continue to add more
algorithms into the language,
such as PageRank, K-Core, Triangle
Count, Density, Cliques, Centrality,
and so on
32. Follow us @GraknLabs
What makes Grakn a Knowledge Graph?
GRAKN
Grakn is the distributed knowledge base to store complex data. It contains a knowledge representation system built on top of distributed computing technology stacks.
Knowledge Representation System: novel approach based on hypergraph theory
GRAQL
Graql is a query language that uses machine reasoning to interpret complex relationships & retrieve implicitly derived knowledge from Grakn. It has a reasoning and analytics engine.
Reasoning Engine: real-time inference for OLTP
Analytics Engine: distributed analytics for OLAP
Automated Reasoning OLTP query language: interprets complex relationships and infers implicit information
Guarantees logical integrity, like SQL: real-time validation of data with respect to a more expressive schema constraint
Distributed Analytics OLAP query language: interprets complex relationships and infers implicit information
Expressive Knowledge Representation System: contains types, subtypes, hyper-relations, rules and instances
High Scale of Relationships, like Graph DBs: relationships are first-class citizens and easy to query without joins
Scales Horizontally, like NoSQL: scaling by sharding and replication, with linear query throughput
33. Follow us @GraknLabs
Wait, why do we need a knowledge base/graph?
“For a computer to pass a Turing Test, it needs to possess: Natural Language Processing, Knowledge Representation, Automated Reasoning and Machine Learning”
Peter Norvig (Research Director, Google) and Stuart J. Russell (CS Professor, UC Berkeley), “Artificial Intelligence: A Modern Approach”, 1994
34. Follow us @GraknLabs
The Architecture of Cognition
Natural Language Processing: comprehension and production of language: communication
Automated Reasoning: reasoning, problem solving, logical deduction, and decision making
Knowledge Representation: expression, conceptualisation, memory and understanding
Machine Learning: judgment and evaluation: to adapt to new circumstances and to detect and extrapolate new patterns
Knowledge Acquisition: information retrieval, natural language understanding: user data, enterprise data, financial data, web data, etc.
COGNITION is "the mental action or process of acquiring knowledge and understanding through thought, experience, and the senses."
35. Follow us @GraknLabs
The Architecture of Cognition
Natural Language Processing: comprehension and production of language: communication
Knowledge Base/Graph: storage of knowledge (i.e. complex information), retrieval of explicitly stored data, and derivation of new conclusions
Machine Learning: judgment and evaluation: to adapt to new circumstances and to detect and extrapolate new patterns
Knowledge Acquisition: information retrieval, natural language understanding: user data, enterprise data, financial data, web data, etc.
COGNITION is "the mental action or process of acquiring knowledge and understanding through thought, experience, and the senses."
36. Follow us @GraknLabs
THE ARCHITECTURE OF A COGNITIVE SYSTEM
Natural Language Processing
Knowledge Base
Machine Learning
Knowledge Acquisition
37. Follow us @GraknLabs
VALUE TO AI: BE THE UNIFIED REPRESENTATION OF KNOWLEDGE
Inference of low-level patterns and automation of analytics algorithms
Machine translation for parsed query interpretation
Expressive and extensible knowledge model
INPUT SYSTEMS: e.g. Information Retrieval, Entity Extraction, Natural Language Understanding
LEARNING SYSTEMS: e.g. Neural Networks, Bayesian Networks, Kernel Machines, Genetic Programming
OUTPUT SYSTEMS: e.g. Natural Language Query, Natural Language Generation
39. Follow us @GraknLabs
GRAKN IS ENABLING DEVELOPMENTS OF AI IN FINANCE & LIFE SCIENCE
FINANCIAL MARKET KNOWLEDGE BASE
Building a financial market knowledge base by aggregating information on real-world events to predict the price movements of different asset classes
CROP SCIENCE KNOWLEDGE BASE
Building a crop science knowledge base from the data of half a million field crop trials to understand the performance of different crop varietals and strains
HUMAN GENOMICS KNOWLEDGE BASE
Building a life science knowledge base by aggregating public & proprietary bio datasets to drive scientific discovery in the field of human genomics