When we set out to build a knowledge graph at Zalando, most people did not know how to build one, or considered machine learning the better solution. However, endorsement from upper management led to the current project, in which we use ontologies to improve the customer search and browsing experience.
Several things are distinctive about the way we built our ontology for enterprise purposes: it is peer-reviewed and use case-driven, and we apply special techniques to keep the graph, our APIs, and our data in sync.
Communicating the graph to different professionals also has its challenges. Backend engineers and machine learning experts have a hard time understanding knowledge graph quirks. Product people accept it only if it creates a clear improvement for customers. How do you reconcile them all?
2. BUILDING THE FASHION KNOWLEDGE GRAPH
Zalando at a Glance
Enterprise Knowledge Graph
Definition of the Knowledge Graph
In the Beginning...
What we implemented and the value it added
3. ZALANDO AT A GLANCE
~ 4.5 billion EUR revenue 2017
> 200 million visits per month
> 15,000 employees in Europe
> 90 million orders
> 24 million active customers
> 300,000 product choices
~ 2,000 brands
17 countries
as at Aug 2018
10. COMMUNICATING THE KNOWLEDGE GRAPH
PRODUCT MANAGERS
Does it improve our customer experience?
Does it make money?
BACKEND ENGINEERS
Open World Assumption?
Why Graph Databases?
MACHINE LEARNING EXPERTS
They see the graph only as a data source like any other and complain that it contains too little data.
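The "Open World Assumption?" question from backend engineers can be illustrated with a small sketch. It contrasts a closed-world lookup, where anything not stored is false, with an open-world lookup, where a missing statement is simply unknown. The triples and the function names `closed_world` and `open_world` are hypothetical and chosen for illustration only; they are not taken from the Zalando graph.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy statements, not taken from the Zalando graph.
FACTS: Set[Triple] = {
    ("sneaker", "is_a", "shoe"),
    ("sneaker", "has_style", "casual"),
}

# Statements that are explicitly known to be false.
NEGATED: Set[Triple] = {
    ("sneaker", "is_a", "dress"),
}


def closed_world(triple: Triple) -> bool:
    """Closed world: anything that is not stored is treated as false."""
    return triple in FACTS


def open_world(triple: Triple) -> Optional[bool]:
    """Open world: a missing statement is unknown (None), not false."""
    if triple in FACTS:
        return True
    if triple in NEGATED:
        return False
    return None
```

For a statement such as `("sneaker", "has_style", "sporty")`, the closed-world lookup answers `False`, while the open-world lookup answers `None`: the graph simply does not say.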
12. “Search can be improved with many Machine Learning algorithms.
Most successful search engines also use Knowledge Graphs to
improve the search. We should explore this possibility.”
13. “Is a static Category Tree the best way
to represent fashion contents?”
14. IN THE BEGINNING...
Little Semantic Web Knowledge Inside the Company
“Ontologies were used in one project I worked on in another company. They did not really work.”
“Machine Learning works better.”
“So it is manual work? Will it scale?”
Upper Management Endorsement
A team of backend developers and some research engineers was put together
Knowledge sharing on Ontologies, RDF, SPARQL
15. GETTING INTO THE TOPIC OF SEMANTIC WEB
Do you see the benefit and added value of knowledge graphs?
Team skills
Research
Backend Engineering
Backend & Frontend Engineering
Product
16. GETTING INTO THE TOPIC OF SEMANTIC WEB
Modelling
RDF
GraphDB
OntoClean
“Proper modelling. How should it be done?”
SPARQL
RDF
Syntax, like Turtle
GraphDB
Modelling
OntoClean
HARD TO LEARN
“Enough high-quality data”
“Knowledge modelling. Data has to be correct at all times, but at the same time simple and easy to follow”
“Performance and use of graph databases”
“balance between a clean graph and use cases”
EASY TO LEARN
17. COOLEST TOPICS OR FEATURES IN SEMANTIC WEB
“Graph databases in general”
“SPARQL query language”
“interlinkedness: how one part of the system can make use of another, almost unrelated, part”
“The power of SPARQL for querying the graph.”
“SPARQL and what it can do with normalised data”
“The possibility to connect different kinds of information from multiple sources into one entity”
“There are many features and use cases still undiscovered, which I believe, a graph data structure helps to fulfil.”
“Ability to create human understanding and 'intelligence' out of relations.”
20. WHERE WE SEE THE MOST VALUE OF THE GRAPH
Measured significant improvement of search powered by the graph
Fashion concepts do not only fetch products, they fetch editorials and other kinds of content
22. GRAPH CONTENT IS PEER-REVIEWED
application ontology
fashion concepts
maintained in a GitHub repository
pull requests
4-eye principle
MODELLING PRINCIPLES
OntoClean adapted
Consistency in content
connections are analysed for subsuming fashion concepts
Use Case Driven Modelling
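One way to support the "Consistency in content" principle during pull-request review is an automated check on the subsumption hierarchy. The sketch below is an assumption about what such a check could look like, not a description of the actual Zalando tooling: it takes a hypothetical mapping from each fashion concept to the concepts that subsume it and reports one cycle if the hierarchy contains any. The function name `find_cycle` and the data shape are illustrative.

```python
from typing import Dict, List, Optional


def find_cycle(parents: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in the subsumption hierarchy, or None if it is acyclic.

    `parents` maps a concept to the concepts that subsume it
    (a hypothetical representation chosen for this sketch).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = GREY
        path.append(node)
        for parent in parents.get(node, []):
            parent_state = state.get(parent, WHITE)
            if parent_state == GREY:
                # The parent is already on the current path: cycle found.
                return path[path.index(parent):] + [parent]
            if parent_state == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for node in list(parents):
        if state.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

A check like this could run on every pull request, so that the human reviewers applying the 4-eye principle can concentrate on modelling questions rather than structural errors.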
23. NO ZALANDO CONTENTS IN THE GRAPH
Zalando contents
> 300K products cannot be stored in the graph
MICROSERVICES TO THE RESCUE
We store rules with which products can be retrieved from other systems via API calls.
Another service uses those rules to index our fashion concept identifiers onto Zalando’s products, so that other services can consume them.
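The rule-based approach described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions: the rule format (`RetrievalRule` with attribute filters), the `fetch_products` stand-in for an API call, and the in-memory catalog are all hypothetical. The slides do not specify how the rules or the services are actually implemented.

```python
from dataclasses import dataclass
from typing import Dict, List

Product = Dict[str, str]


@dataclass(frozen=True)
class RetrievalRule:
    """Hypothetical rule stored with a fashion concept in the graph."""
    concept_id: str
    filters: Dict[str, str]  # product attribute -> required value


def fetch_products(filters: Dict[str, str], catalog: List[Product]) -> List[Product]:
    """Stand-in for an API call to another system that owns the products."""
    return [
        product for product in catalog
        if all(product.get(key) == value for key, value in filters.items())
    ]


def index_concepts(rules: List[RetrievalRule], catalog: List[Product]) -> Dict[str, List[str]]:
    """Map each product id to the sorted concept ids whose rules match it."""
    index: Dict[str, List[str]] = {}
    for rule in rules:
        for product in fetch_products(rule.filters, catalog):
            index.setdefault(product["id"], []).append(rule.concept_id)
    return {product_id: sorted(ids) for product_id, ids in index.items()}
```

In this arrangement the graph holds only concepts and rules, while the products stay in their own systems. The resulting index is what downstream services would consume.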
24. GRAPH DATABASE – WHAT TO USE?
DATOMIC
Implement our own triple store
BLAZEGRAPH
Open-source graph database
AMAZON NEPTUNE
AWS, compliant, supported
25. HOW WE DECREASED LATENCY
COMPLEX MODELLING
What counts as a fashion concept is implied via modelling
SPARQL
Queries are long and complex = latency
INFERENCE
We implement our own inference rules
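The slides do not say which inference rules were implemented. As a hedged illustration of the general idea, the sketch below materialises one common rule, transitivity of a subsumption predicate, ahead of query time, so that a query can do a direct lookup instead of walking the hierarchy. The predicate name `broader` and the function name `materialise_subsumption` are assumptions made for this example.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def materialise_subsumption(triples: Set[Triple], predicate: str = "broader") -> Set[Triple]:
    """Add all triples implied by transitivity of `predicate`.

    Simple forward chaining: repeat until no new triple is derived.
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot of the current edges for this predicate.
        edges = [(s, o) for s, p, o in inferred if p == predicate]
        for subject, middle in edges:
            for other_middle, obj in edges:
                new_triple = (subject, predicate, obj)
                if middle == other_middle and new_triple not in inferred:
                    inferred.add(new_triple)
                    changed = True
    return inferred
```

Precomputing implied statements trades storage and write-time work for shorter, simpler queries, which is one plausible way that custom inference can reduce query latency.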
31. DISCLAIMER
This presentation and its contents are strictly confidential. It may not, in
whole or in part, be reproduced, redistributed, published or passed on to
any other person by the recipient.
The information in this presentation has not been independently verified. No
representation or warranty, express or implied, is made as to the accuracy
or completeness of the presentation and the information contained herein
and no reliance should be placed on such information. No responsibility is
accepted for any liability for any loss howsoever arising, directly or
indirectly, from this presentation or its contents.