“Linked Metadata by Design” represents the integration of the outcomes of human collaboration, starting from the design phase of data product development. This knowledge is captured in the Data Knowledge Graph. It makes data products not only robust and compliant, but also well understood and effectively used.
Ingka Digital: Linked Metadata by Design
1. Linked Metadata by Design
Presented by: Shirley Bacso
Data Architect at Ingka Digital, Netherlands
2. A few words about me
• Data Architect at Ingka Digital, Netherlands
• Specialize in metadata
• Love problem-solving
• Love learning
• Graph enthusiast since June 2021
3. My journey
Metadata – the knowledge of our data
[Timeline figure]
Dates shown: June 2021, Oct 2021, May 2022, Feb 2023, Oct 2023, Dec 2023, Feb 2024
Milestones shown: Implementing Business Glossary; Ingka Graph Event with Neo4j; Knowledge Graphs 2023, Prof. Dr. Harald Sack; DKG MVP; NeoDash
Recurring activities along the timeline:
• Learn graph technology
• Examine case studies
• Research metadata management best practices
• Study organization realities
• Observe people
• Hunt for metadata
4. What is linked metadata by design?
• “Linked metadata” represents the integration of outcomes from human collaboration. It collectively forms the knowledge about data, which is captured in the data knowledge graph.
• “By design” emphasizes that the capture and integration of metadata should start from the conceptual level and continue through the logical level, as human collaboration begins at the design phase of data product development.
5. “Metadata as a product”
Conceptual
• Right competence
• Well-defined processes
• Supporting tools
• Smart curation
Pain points of “Metadata as a by-product” and how to relieve them
DAMA DMBOK says:
1. Do these metadata exist?
2. Which ones are correct to use?
3. Do I have the time to link everything to my data?
7. The Ontology Development Process
Knowledge Graphs 2023, Prof. Dr. Harald Sack, FIZ Karlsruhe – Leibniz Institute for Information Infrastructure & Karlsruhe Institute of Technology
In practice, ontology development is an iterative process: the cycle repeats continuously, and each pass improves the ontology.
8. Which knowledge areas should be covered by the knowledge graph?
Collectively capture the broader knowledge about data
10. What types of questions should be answered by the knowledge graph?
• Do all the data columns/fields have business names?
• Are the data compliant with (retention) rules?
• What are the gaps between the data quality dimension goals and the reality?
• Are all the databases connected to the software systems?
• What systems does a certain digital domain have?
• What types of personal data does a software system hold?
• Find the most relevant dataset (recommendation engine) and request access.
• Which data products are overlapping? (Think about the Principle of Data as a Product)
• How was the data transformed throughout its lifetime? (Provenance and operational lineage)
• …
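To make two of these questions concrete, here is a minimal sketch of how they might be answered once metadata is linked. It is not the Ingka data knowledge graph: the triples, the node names, and the relationship types (IS_A, HAS_BUSINESS_NAME, CLASSIFIED_AS, STORES) are all hypothetical, and a plain Python list stands in for a graph database.

```python
# Toy metadata graph as (subject, predicate, object) triples.
# Every node name and relationship type below is hypothetical.
TRIPLES = [
    ("crm_db.customer.email", "IS_A", "Column"),
    ("crm_db.customer.cust_nm", "IS_A", "Column"),
    ("crm_db.customer.created_ts", "IS_A", "Column"),
    ("crm_db.customer.email", "HAS_BUSINESS_NAME", "Customer Email Address"),
    ("crm_db.customer.cust_nm", "HAS_BUSINESS_NAME", "Customer Name"),
    ("crm_db.customer.email", "CLASSIFIED_AS", "Personal Data: Contact"),
    ("crm_db.customer.cust_nm", "CLASSIFIED_AS", "Personal Data: Identity"),
    ("CRM System", "STORES", "crm_db.customer.email"),
    ("CRM System", "STORES", "crm_db.customer.cust_nm"),
    ("CRM System", "STORES", "crm_db.customer.created_ts"),
]


def objects(subject, predicate):
    """All objects reachable from `subject` via `predicate`."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate, obj):
    """All subjects that point to `obj` via `predicate`."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def columns_without_business_name():
    """Question: do all the data columns/fields have business names?"""
    columns = subjects("IS_A", "Column")
    return sorted(c for c in columns if not objects(c, "HAS_BUSINESS_NAME"))


def personal_data_types(system):
    """Question: what types of personal data does a software system hold?"""
    types = set()
    for column in objects(system, "STORES"):
        types |= objects(column, "CLASSIFIED_AS")
    return sorted(types)


if __name__ == "__main__":
    print(columns_without_business_name())
    print(personal_data_types("CRM System"))
```

Both answers come from following links between technical, business, and compliance metadata, which is only possible when those links were captured in the first place.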
12. Idea for an augmented solution
- Word embeddings for semantic similarity
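As a rough illustration of the idea, the sketch below ranks candidate glossary terms by cosine similarity to a word taken from a column name. The three-dimensional vectors are made up for the example; a real solution would take its vectors from a trained embedding model, and the function names here are hypothetical.

```python
import math

# Made-up 3-dimensional vectors standing in for real word embeddings.
TOY_EMBEDDINGS = {
    "customer": [0.9, 0.1, 0.0],
    "client": [0.8, 0.2, 0.1],
    "invoice": [0.1, 0.9, 0.2],
    "store": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar_term(word, candidates):
    """Pick the candidate whose toy embedding is closest to `word`."""
    return max(
        candidates,
        key=lambda term: cosine_similarity(
            TOY_EMBEDDINGS[word], TOY_EMBEDDINGS[term]
        ),
    )


if __name__ == "__main__":
    # Suggest a glossary term for a column that uses the word "client".
    print(most_similar_term("client", ["customer", "invoice", "store"]))
```

A suggestion like this would still need human confirmation before it becomes a link in the graph.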
13. Idea for an augmented solution
- Knowledge graph embeddings for link prediction
• Translational Embeddings
o Unsupervised methods, e.g. TransE
o Supervised methods for prediction, based on embedding vectors
• Transductive Link Prediction
• Inductive Link Prediction:
o Fully-inductive Link Prediction
o Semi-inductive Link Prediction
Knowledge Graph, Semantic Web Technology
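A minimal sketch of the translational idea behind TransE: a relation is modeled as a vector that moves the head entity close to the tail entity, so a candidate link is more plausible when the distance between head + relation and tail is small. The entities, the BELONGS_TO relation, and the two-dimensional vectors below are hand-set assumptions for illustration; a real model would learn them from the graph.

```python
import math

# Hand-set 2-D vectors; a trained TransE model would learn these instead.
ENTITY = {
    "orders_table": [0.0, 0.0],
    "payroll_table": [-1.0, 1.0],
    "Sales Domain": [1.0, 0.0],
    "HR Domain": [0.0, 1.0],
}
RELATION = {"BELONGS_TO": [1.0, 0.0]}


def transe_distance(head, relation, tail):
    """Euclidean distance between (head + relation) and tail; lower is better."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def predict_tail(head, relation, candidates):
    """Link prediction: choose the candidate tail with the smallest distance."""
    return min(candidates, key=lambda tail: transe_distance(head, relation, tail))


if __name__ == "__main__":
    domains = ["Sales Domain", "HR Domain"]
    print(predict_tail("orders_table", "BELONGS_TO", domains))
    print(predict_tail("payroll_table", "BELONGS_TO", domains))
```

Because every entity here already has a vector, this corresponds to the transductive setting; the inductive variants listed above deal with entities that were not seen when the vectors were learned.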
14. Bring data to people where they work
The importance of the user interface for a seamless experience
• Pros and cons of interaction models: “Point-to-point” vs “Hub-and-spoke”
• How to solve the pain of “hop-between-the-solutions”?
16. Challenges on the Graph Journey
The benefits of graph technology are not yet understood at all levels. We need more advocates to get buy-in and funding for resources.
• Ontology and modeling
• Data governance
• Data engineering
• Graph data science
• UX design and UI building
• …