Have you ever felt you were doing something dirty when creating an intermediate table to model a many-to-many relationship? You are probably trying to use the wrong database. Graph databases are a great fit for modeling highly interconnected data and for running very complex queries efficiently and concisely.
In this talk I will give you an introduction to the world of graph databases and present Amazon Neptune, a graph database compatible with the open graph languages Gremlin and SPARQL. If you have never seen a graph database, you are going to be blown away.
In a hyperconnected world, graph databases are your secret weapon
1. MAD · NOV 22-23 · 2019
In a hyperconnected world, graph databases are your secret weapon
Javier Ramirez
Technical Evangelist. Amazon Web Services
2. Six degrees of Bacon
SAGIndie from Hollywood, USA - Flickr
CC BY 2.0
10. Airlines use case
Legacy systems running on mainframes, backed by relational databases with complex workflow engines and state machines.
Everything could be treated as entities and relationships: planes, parts, maintenance locations, workstations able to perform the work, availability of parts at given workstations, personnel, and personnel skill sets. Assessing the impact of sudden logistics changes and reassignments would be greatly simplified.
30. Edges – the routes to success
Performance depends on how much of the graph a query must “touch”
• Choose domain-meaningful edge labels
• Discover only what is absolutely necessary
• “Grey out” unnecessary portions of the graph
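As a rough illustration of the "touch" idea, the sketch below compares a lookup that scans every edge with one that uses an index keyed by vertex and edge label. The toy data, the label names, and the helper functions are assumptions made for this example; it does not describe how Neptune stores or indexes data.

```python
from collections import defaultdict

# Hypothetical toy graph: (source vertex, edge label, target vertex).
EDGES = [
    ("alice", "WORKED_FOR", "any_co"),
    ("alice", "LIVES_AT", "high_st"),
    ("alice", "FOLLOWS", "bob"),
    ("bob", "WORKED_FOR", "any_co"),
    ("bob", "LIVES_AT", "main_st"),
]


def build_index(edges):
    """Group targets by (source, label) so other labels are never visited."""
    index = defaultdict(list)
    for source, label, target in edges:
        index[(source, label)].append(target)
    return index


def out_scan(edges, source, label):
    """Naive lookup: every edge in the graph is touched."""
    touched = 0
    targets = []
    for s, l, t in edges:
        touched += 1
        if s == source and l == label:
            targets.append(t)
    return targets, touched


def out_indexed(index, source, label):
    """Label-aware lookup: only the matching edges are touched."""
    targets = list(index.get((source, label), []))
    return targets, len(targets)


INDEX = build_index(EDGES)
scan_result = out_scan(EDGES, "alice", "WORKED_FOR")
indexed_result = out_indexed(INDEX, "alice", "WORKED_FOR")
```

Both lookups return the same target, but the scan touches all five edges while the label-aware lookup touches one. A domain-meaningful label is what lets the rest of the graph be "greyed out".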
31. How fine-grained should my edge labels be?
Is it an open or extensible set of label values?
Do you need to query across values in the set? (e.g. all addresses)
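The two questions above can be made concrete with a small sketch. It models the same addresses twice: once with fine-grained labels (HOME_ADDRESS, WORK_ADDRESS) and once with a generic ADDRESS label plus a type property. The label names, the tuple layout, and the helper functions are assumptions chosen for illustration.

```python
# Option A (assumed labels): one edge label per address type.
FINE_GRAINED = [
    ("Alice", "HOME_ADDRESS", "High St"),
    ("Bob", "WORK_ADDRESS", "Main St"),
    ("Alice", "WORK_ADDRESS", "Any St"),
]

# Option B (assumed label): one generic label, the type is an edge property.
GENERIC = [
    ("Alice", "ADDRESS", {"type": "home"}, "High St"),
    ("Bob", "ADDRESS", {"type": "work"}, "Main St"),
    ("Alice", "ADDRESS", {"type": "work"}, "Any St"),
]


def addresses_fine(person, labels):
    """Querying across the set means enumerating every label value."""
    return [t for s, l, t in FINE_GRAINED if s == person and l in labels]


def addresses_generic(person, address_type=None):
    """One label covers the whole set; the type is an optional filter."""
    return [
        t
        for s, l, props, t in GENERIC
        if s == person
        and l == "ADDRESS"
        and (address_type is None or props["type"] == address_type)
    ]
```

With option A, "all addresses" only works while the caller knows every label in the set, which becomes fragile if the set is open or extensible. With option B, "all addresses" is a single label lookup, at the cost of a property filter whenever only one type is wanted.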
36. Relational DBs should really be called “row databases”
Have you ever felt you were overcomplicating your DB schema by adding intermediate tables to model a complex relationship?
Have you ever used some obscure hack to query hierarchical or nested data?
Have you experienced badly degraded performance when joining many tables (or self-joining a table)?
63. Example scenario
Employment history application
• People, companies, roles
Use cases
1. Find the companies where X has worked, and their roles at those
companies
2. Find the people who have worked for a company at a specific
location during a particular time period
3. Find the people in more senior roles at the companies where X
worked
64. Identify entities, relationships, and attributes
Find the companies where X has worked, and their roles at those companies
Which companies has X worked for, and in what roles?
Companies    Company      Entity
X            Person       Entity
Worked for   Worked for   Relationship
Roles        Role         Attribute?
65. Identify candidate vertices, labels, and properties
Company      Entity         Vertex      Company
Person       Entity         Vertex      Person
Worked for   Relationship   Edge        WORKED_FOR
Role         Attribute?     Property?   role
(Diagram labels: Person, Company, name)
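A minimal in-memory sketch of this first candidate model, with role kept as a property on the WORKED_FOR edge. The Graph and Edge classes, the vertex ids, and the role values are hypothetical and exist only for this example; they are not a Neptune or Gremlin API.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    label: str
    out_v: str  # id of the vertex the edge leaves
    in_v: str   # id of the vertex the edge enters
    properties: dict = field(default_factory=dict)


class Graph:
    def __init__(self):
        self.vertices = {}  # vertex id -> {"label": ..., plus properties}
        self.edges = []

    def add_vertex(self, vertex_id, label, **properties):
        self.vertices[vertex_id] = {"label": label, **properties}

    def add_edge(self, label, out_v, in_v, **properties):
        self.edges.append(Edge(label, out_v, in_v, properties))

    def out_edges(self, vertex_id, label):
        return [e for e in self.edges if e.out_v == vertex_id and e.label == label]


def companies_and_roles(graph, person_id):
    """Use case 1: companies where a person worked, with the role at each."""
    return [
        (graph.vertices[e.in_v]["name"], e.properties["role"])
        for e in graph.out_edges(person_id, "WORKED_FOR")
    ]


graph = Graph()
graph.add_vertex("p1", "Person", name="X")
graph.add_vertex("c1", "Company", name="Any Co")
graph.add_vertex("c2", "Company", name="Example.com")
# The role values below are made up for illustration.
graph.add_edge("WORKED_FOR", "p1", "c1", role="Developer")
graph.add_edge("WORKED_FOR", "p1", "c2", role="Team lead")

result = companies_and_roles(graph, "p1")
```

In Gremlin, a roughly equivalent traversal over such a model could look like g.V().has('Person','name','X').outE('WORKED_FOR').as('e').inV().as('c').select('c','e').by('name').by('role'), assuming the same labels and property names.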
66. Entity or attribute? Vertex or property?
Is role an entity or an attribute?
• Does it have identity (or is it a value type)? ✗
• Is it a complex type (with multiple fields)? ✗
• Are there any structural relations between values? ?
Keep it simple
• Prefer properties to vertices/edges until the need arises
68. What questions would we have to ask of our data?
Find the people in more senior roles at the companies where
X worked
Who were in senior roles at the companies where X worked?
69. Entity or attribute? Vertex or property?
Is role an entity or an attribute?
• Does it have identity (or is it a value type)? ✗
• Is it a complex type (with multiple fields)? ✗
• Are there any structural relations between values? ✓
Model structural relations with edges
• Promote role to being a vertex
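One way this promotion could look, sketched in plain Python. The edge labels HELD, AT, and REPORTS_TO, the role ids, and the person "eve" are assumptions for illustration only; the slides do not say which labels the final model uses.

```python
# Hypothetical model: (person)-HELD->(role)-AT->(company),
# and (role)-REPORTS_TO->(more senior role).
EDGES = [
    ("x", "HELD", "r1"),
    ("bob", "HELD", "r2"),
    ("dan", "HELD", "r3"),
    ("eve", "HELD", "r4"),
    ("r1", "AT", "any_co"),
    ("r2", "AT", "any_co"),
    ("r3", "AT", "any_co"),
    ("r4", "AT", "example_co"),
    ("r1", "REPORTS_TO", "r2"),
    ("r2", "REPORTS_TO", "r3"),
]


def out(vertex, label):
    return [t for s, l, t in EDGES if s == vertex and l == label]


def in_(vertex, label):
    return [s for s, l, t in EDGES if t == vertex and l == label]


def senior_roles(role):
    """Roles reachable by following REPORTS_TO edges upward, transitively."""
    seen = []
    stack = out(role, "REPORTS_TO")
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(out(current, "REPORTS_TO"))
    return seen


def more_senior_people(person):
    """Use case 3: people in more senior roles at companies where person worked."""
    people = []
    for role in out(person, "HELD"):
        for senior in senior_roles(role):
            if out(senior, "AT") != out(role, "AT"):
                continue  # keep only roles at the same company
            for holder in in_(senior, "HELD"):
                if holder != person and holder not in people:
                    people.append(holder)
    return people
```

Because seniority is now an edge between role vertices, the question becomes a traversal. With role as a plain property there would be nothing to follow.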
82. Wrapping up
Graphs can be applied to a huge number of use cases
A graph database fits highly connected data more naturally than other databases, and typically queries it much faster
Scaling out a graph database is not easy. Amazon Neptune makes your life easier
85. Mapping relational to graph (RDF)
W3C Direct Mapping
• “Out-of-the-box” schema for mapping relational data to RDF
• https://www.w3.org/TR/rdb-direct-mapping/
R2RML
• Standard that allows you to specify mappings from relations to graph
• Rules defined over logical tables
• https://www.w3.org/TR/r2rml/
Tooling
• D2RQ – access relational database as virtual, read-only RDF graph
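To give a feel for what a direct mapping produces, here is a simplified sketch in the spirit of the W3C Direct Mapping: one subject IRI per row, one type triple, one literal triple per non-null column, and one reference triple per foreign key. The table names people and addresses are assumptions, IRIs are left relative, and details such as percent-encoding, datatypes beyond integers and strings, and tables without a primary key are ignored.

```python
def direct_map_row(table, primary_key, row, foreign_keys=None):
    """Return (subject, predicate, object) triples for one relational row.

    foreign_keys maps a column name to (referenced table, referenced key).
    """
    foreign_keys = foreign_keys or {}
    subject = f"<{table}/{primary_key}={row[primary_key]}>"
    triples = [(subject, "rdf:type", f"<{table}>")]

    for column, value in row.items():
        if value is None:
            continue  # NULL columns produce no triple
        literal = str(value) if isinstance(value, int) else f'"{value}"'
        triples.append((subject, f"<{table}#{column}>", literal))

    for column, (ref_table, ref_key) in foreign_keys.items():
        if row.get(column) is not None:
            target = f"<{ref_table}/{ref_key}={row[column]}>"
            triples.append((subject, f"<{table}#ref-{column}>", target))

    return triples


# Row taken from the foreign-key example in this deck; table names are assumed.
address_row = {"id": 512, "p_id": 12, "type": "home", "addr_1": "High St"}
triples = direct_map_row(
    "addresses", "id", address_row, foreign_keys={"p_id": ("people", "id")}
)
```

The reference triple is what turns a foreign key into a link between two row resources. R2RML would let you customize these IRIs and predicates instead of accepting the defaults.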
87. Foreign keys
id    …   f_name
12    …   Alice
37    …   Bob

id    p_id   type   addr_1
512   12     home   High St
655   37     work   Main St
700   12     work   Any St
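A small sketch of how these two tables could become a property graph: each row becomes a vertex and each foreign key value becomes an edge. The vertex id scheme, the Person and Address labels, and the HAS_ADDRESS edge label are assumptions for illustration.

```python
people = [
    {"id": 12, "f_name": "Alice"},
    {"id": 37, "f_name": "Bob"},
]
addresses = [
    {"id": 512, "p_id": 12, "type": "home", "addr_1": "High St"},
    {"id": 655, "p_id": 37, "type": "work", "addr_1": "Main St"},
    {"id": 700, "p_id": 12, "type": "work", "addr_1": "Any St"},
]


def rows_to_graph(people, addresses):
    """Turn rows into vertices and the p_id foreign key into edges."""
    vertices = {}
    edges = []
    for person in people:
        vertices[f"person-{person['id']}"] = {
            "label": "Person",
            "f_name": person["f_name"],
        }
    for address in addresses:
        address_id = f"address-{address['id']}"
        vertices[address_id] = {"label": "Address", "addr_1": address["addr_1"]}
        # The foreign key column disappears; the edge carries the link.
        edges.append(
            (f"person-{address['p_id']}", "HAS_ADDRESS", address_id,
             {"type": address["type"]})
        )
    return vertices, edges


vertices, edges = rows_to_graph(people, addresses)
```

Here the address type is kept as an edge property. It could equally be a property of the Address vertex, depending on the queries you need.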
89. Join tables
id    …   f_name
12    …   Alice
37    …   Bob

id    …   name
512   …   Any Co
655   …   Example Co
700   …   Example.com

p_id   c_id   from   to
12     512    2012   2015
37     512    2011   2016
37     655    2016   2017
12     700    2015   2017
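In a property graph, each row of a join table like this typically becomes one edge, and the extra columns become edge properties. The sketch below uses the rows from this slide; the WORKED_FOR label, the dictionary layout, and the overlap rule for time periods are assumptions for illustration.

```python
people = {12: "Alice", 37: "Bob"}
companies = {512: "Any Co", 655: "Example Co", 700: "Example.com"}
# (p_id, c_id, from, to) rows of the join table.
employment = [
    (12, 512, 2012, 2015),
    (37, 512, 2011, 2016),
    (37, 655, 2016, 2017),
    (12, 700, 2015, 2017),
]


def employment_edges(rows):
    """One WORKED_FOR edge per join-table row, with from/to as properties."""
    return [
        {"label": "WORKED_FOR", "out_v": p_id, "in_v": c_id, "from": start, "to": end}
        for p_id, c_id, start, end in rows
    ]


def worked_at_during(edges, company_id, start, end):
    """People whose time at the company overlaps the period [start, end]."""
    return sorted(
        people[e["out_v"]]
        for e in edges
        if e["in_v"] == company_id and e["from"] <= end and e["to"] >= start
    )


edges = employment_edges(employment)
```

The join table itself disappears: the relationship it encoded is now a first-class edge that can be traversed and filtered directly.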
92. Key-Value
id   f_name   state   start   city:dept    tel
     Alice    TX              Austin:Dev
37   Bob      TX              Dallas:Dev   555-0100
99   Dan      TX      2016    Austin:Ops   555-0199
93. Data load options
Bulk loader API
• Load data from S3 into Neptune
• Low overhead, optimized for large datasets
• Good for append-only loads
Online endpoints
• Gremlin or SPARQL
(Diagram labels: Bulk load from S3, Database Mgmt.)
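A hedged sketch of what preparing a bulk load could look like for property-graph data. It writes vertex and edge files in the Gremlin CSV layout the bulk loader reads (system columns such as ~id, ~label, ~from, ~to, plus typed property columns), then builds the JSON request for the loader endpoint without sending it. The endpoint host, bucket, IAM role ARN, and region are placeholders, not real resources.

```python
import csv
import io
import json
import urllib.request


def to_csv(header, rows):
    """Render rows as CSV text with the given header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Vertex file: system columns start with "~", properties carry a type suffix.
vertex_csv = to_csv(
    ["~id", "~label", "name:String"],
    [["p12", "Person", "Alice"], ["c512", "Company", "Any Co"]],
)

# Edge file: ~from and ~to reference vertex ids.
edge_csv = to_csv(
    ["~id", "~from", "~to", "~label", "from:Int", "to:Int"],
    [["e1", "p12", "c512", "WORKED_FOR", 2012, 2015]],
)

# Placeholder values: replace with your own cluster endpoint, bucket, and role.
LOADER_URL = "https://your-neptune-endpoint:8182/loader"
payload = {
    "source": "s3://your-bucket/graph-data/",
    "format": "csv",
    "iamRoleArn": "arn:aws:iam::123456789012:role/YourNeptuneLoadRole",
    "region": "eu-west-1",
    "failOnError": "FALSE",
}

request = urllib.request.Request(
    LOADER_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# Submitting the job would be urllib.request.urlopen(request); it is left out
# here because the endpoint above is only a placeholder.
```

The CSV files would first be uploaded to the S3 location named in source. For small or incremental writes, the online Gremlin or SPARQL endpoints are the alternative listed above.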
94. Appendix II
Ignition One. Customer use case presented at AWS Atlanta Summit
https://www.slideshare.net/AmazonWebServices/using-amazon-neptune-to-power-identity-resolution-at-scale-adb303-atlanta-aws-summit