Anything-to-Graph
Knowledge Graph Conference
May 2021
Joshua Shinavier, PhD (@joshsh)
Data @ Uber
Overview
○ Building graphs
○ Models
○ Mappings
○ Use cases
○ TinkerPop 4
Building graphs
There are graphs, and graphs
○ Domain-specific graphs are simplest
○ ETL a few data sources into a graph
○ Mappings can be written and maintained by hand
○ Off-the-shelf tools work well
○ Difficulty increases with
○ Complexity of source schemas
○ Diversity of ownership, quality, and governance in data sources
○ Lack of standardization on languages and vocabulary
○ Some challenges are organizational, others technical
○ Organizational: see Lessons From Reality (KGC 2019)
○ Technical: let’s take a closer look
The challenges of heterogeneity
○ Diverse data sources → need supporting metadata
○ E.g. see Uber’s Databook
○ Diverse data governance → need better data isolation
○ Typically, RDF triple stores do better than PG databases here
○ Diverse domain data models → need standardized schemas
○ E.g. see Uber’s Data Standardization
○ Diverse schema and data languages → need well-defined mappings
○ Enter the Dragon
Prerequisites
○ Data must conform to a schema
○ It is OK to have a mix of schemas, and schema languages
○ Unique identifiers must be clearly distinguished, and typed
○ What is this UUID field? Does it identify a User, a Document, etc.?
○ Some degree of standardization is needed
○ Are identifiers consistent across data sources?
○ Can this timestamp value be compared with that timestamp value? Etc.
○ Need well-defined mappings
○ From each data language into a graph format
○ From each schema language into a graph schema language
○ ...without losing too much information
○ ...and while maintaining consistency between schema and data
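A minimal Python sketch of the "typed identifier" prerequisite, assuming a hypothetical `TypedId` wrapper (not part of Dragon or any library): the same raw UUID is treated as two different identifiers when it is tagged with different entity types.

```python
from dataclasses import dataclass
from uuid import UUID


@dataclass(frozen=True)
class TypedId:
    """A UUID paired with the entity type it identifies (hypothetical helper)."""
    entity_type: str  # e.g. "User" or "Document"
    value: UUID


def same_entity(a: TypedId, b: TypedId) -> bool:
    # Two identifiers match only if both the entity type and the UUID agree.
    return a == b


raw = UUID("12345678-1234-5678-1234-567812345678")
user_id = TypedId("User", raw)
doc_id = TypedId("Document", raw)
```

Here `same_entity(user_id, doc_id)` is false even though the UUIDs are equal, which is the distinction an untyped UUID field cannot express.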
Models
Is there a “universal data model” for graphs?
○ Desirable characteristics
○ Centrality: ease of alignment with other graph and non-graph models
○ Flexibility: captures a wide variety of graph structures
○ Formality: serves as a tool for reasoning about data models
○ Practicality: serves as a basis for inference, query optimization, etc.
○ Intuitiveness: a good graph schema captures our mental model
○ Where can the right abstractions be found?
○ Logics? Set theory? Algebras? Category theory?
○ Does the data model already exist?
○ What about RDF-based languages (RDFS, OWL, SHACL, ShEx, etc.)?
○ Is that what GQL is going to be?
Graph features are very diverse
Spreadsheet by Gábor Szárnyas
Complexity is not the answer
○ No one language will ever satisfy all graph use cases
○ Implementing the model must be simple, or developers won’t implement it
○ Who remembers CODASYL? Why aren’t we using it?
○ Go for minimalism + extensibility
○ Dimensions being explored for GQL
○ Mathematical foundations
○ Nominal vs. structural typing
○ Inheritance vs. composition
○ Graph element types vs. property types
○ Descriptive vs. prescriptive schemas
A little algebra goes a long way
○ Algebraic Property Graphs [1]
○ Pragmatic and opinionated graph data model
○ Schemas are vertex/edge/etc. labels bound to algebraic data types
○ Formally defined using category theory
○ Effectively a subset of SHACL
○ Ideally, GQL will also be a superset of Algebraic Property Graphs
[1] https://arxiv.org/abs/1909.04881
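As a loose illustration of "labels bound to algebraic data types", the sketch below models primitive, product, and sum types in Python and checks values against them. The class names (`Prim`, `Product`, `Sum`), the `conforms` function, and the example labels are all assumptions made for this sketch; it is not the formal definition from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Prim:
    name: str  # "string" or "integer" in this sketch


@dataclass
class Product:
    fields: Dict[str, "Type"]  # every field must be present


@dataclass
class Sum:
    alternatives: Dict[str, "Type"]  # exactly one tagged alternative is present


Type = Union[Prim, Product, Sum]


def conforms(value, t) -> bool:
    if isinstance(t, Prim):
        return isinstance(value, {"string": str, "integer": int}[t.name])
    if isinstance(t, Product):
        return (isinstance(value, dict)
                and value.keys() == t.fields.keys()
                and all(conforms(value[k], ft) for k, ft in t.fields.items()))
    if isinstance(t, Sum):
        # A sum value is represented as a (tag, payload) pair.
        if not (isinstance(value, tuple) and len(value) == 2):
            return False
        tag, payload = value
        return tag in t.alternatives and conforms(payload, t.alternatives[tag])
    return False


# Hypothetical schema: each label is bound to one type.
schema = {
    "User": Product({"name": Prim("string")}),
    "Contact": Sum({"email": Prim("string"), "phone": Prim("integer")}),
}
```

With this schema, `{"name": "Ada"}` conforms to `User`, and `("email", "a@b.c")` conforms to `Contact`, while a value tagged `"fax"` does not.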
Mappings
Desirable properties
○ Composability
○ If we have f : A → B and g : B → C, we should also have f;g : A → C
○ Bidirectionality
○ If we have f : A → B, we should also have f⁻¹ : B → A, with f;f⁻¹ ≅ f⁻¹;f ≅ id
○ Data : schema consistency
○ Mappings should be defined for data and schema languages in parallel
○ If data d conforms to schema s, then we need f(d) to conform to f(s)
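Composability and bidirectionality can be sketched in Python as a pair of functions with a `then` combinator. The `Mapping` class and the coordinate example are assumptions for illustration only; real mappings between schema languages are rarely exact inverses, which is why the slide uses ≅ rather than =.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Mapping:
    forward: Callable[[Any], Any]
    backward: Callable[[Any], Any]

    def then(self, other: "Mapping") -> "Mapping":
        # f;g applies f first, then g; the inverse runs in the opposite order.
        return Mapping(
            forward=lambda x: other.forward(self.forward(x)),
            backward=lambda y: self.backward(other.backward(y)),
        )

    def inverse(self) -> "Mapping":
        return Mapping(self.backward, self.forward)


# f : A -> B turns a record into a pair; g : B -> C turns the pair into a string.
f = Mapping(lambda d: (d["lat"], d["lng"]),
            lambda t: {"lat": t[0], "lng": t[1]})
g = Mapping(lambda t: f"{t[0]},{t[1]}",
            lambda s: tuple(float(p) for p in s.split(",")))

fg = f.then(g)  # f;g : A -> C
point = {"lat": 37.77, "lng": -122.42}
```

In this example `fg.backward(fg.forward(point))` returns the original record, so the round trip is the identity on this input.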
Topology of mappings
○ Arbitrary / ad-hoc topology
○ Pros: seems easiest; just use the pairwise mappings you have
○ Cons: more indirection, and mappings may not compose well
○ Star topology
○ Pros: mappings compose well, and all paths are short
○ Cons: need to define/identify a central data model (the “dragon”)
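A back-of-the-envelope count shows why the star topology scales better. The sketch assumes n languages, directed mappings, and, for the ad-hoc case, the worst case in which every pair of languages needs its own direct mapping to avoid indirection; the function names are hypothetical.

```python
def pairwise_mappings(n: int) -> int:
    # One directed mapping for every ordered pair of distinct languages.
    return n * (n - 1)


def star_mappings(n: int) -> int:
    # One mapping into the central model and one out of it, per language.
    return 2 * n
```

For 10 languages this gives 90 direct mappings versus 20 through a central model, and any path through the star is at most two hops long.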
Transforming schemas and data
○ Schema-level mappings
○ Transformations on data types
○ Data-level mappings
○ Transformations on instances of data types
○ Schema and data-level mappings must be consistent
○ a :: A ⇒ F(a) :: F(A)
○ Use common intermediate representations
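The consistency condition a :: A ⇒ F(a) :: F(A) can be sketched as a pair of functions, one over schemas and one over records, plus a conformance check. Everything here (the flat field-to-type schema format, `TYPE_MAP`, the snake_case to camelCase renaming) is a hypothetical stand-in, not Dragon's actual representation.

```python
from typing import Any, Dict

Schema = Dict[str, str]  # field name -> type name
Record = Dict[str, Any]

PY_TYPES = {"string": str, "int64": int, "integer": int}
TYPE_MAP = {"string": "string", "int64": "integer"}  # hypothetical source -> target types


def camel(name: str) -> str:
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def map_schema(s: Schema) -> Schema:
    # Schema-level mapping: rename fields and translate type names.
    return {camel(k): TYPE_MAP[t] for k, t in s.items()}


def map_data(d: Record) -> Record:
    # Data-level mapping: rename fields the same way the schema mapping does.
    return {camel(k): v for k, v in d.items()}


def conforms(d: Record, s: Schema) -> bool:
    return d.keys() == s.keys() and all(
        isinstance(d[k], PY_TYPES[t]) for k, t in s.items())


source_schema = {"trip_id": "string", "duration_seconds": "int64"}
record = {"trip_id": "t-1", "duration_seconds": 540}
```

Because both functions share the same renaming rule, a record that conforms to `source_schema` still conforms after both are mapped; if the two mappings were written independently, that property would have to be tested rather than assumed.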
Dragon
○ Framework for data and schema migration
○ Developed for Uber’s Data Standardization
○ Dragon data model
○ Extension of Algebraic Property Graphs
○ Covers the majority of schemas at Uber
○ Sources and targets
○ RPC / interface languages: Protocol Buffers, Apache Thrift, Apache Avro
○ RDF-based languages: SHACL, OWL
○ “Schemaless” formats: YAML, JSON
○ Programming languages: Haskell, Java, Scala
○ Implementations in Haskell and Java
○ Dragon generates over half of its own source code (from YAML)
$ dragon transform -i Protobuf -o SHACL $SCHEMAS -d maps /tmp/shacl
______. ______/ .______
Dragon 0.3.4 ----___ O .. O ___----
---------------------vvv----vvvv----vvv---------------------
Loading configuration from dragon.yaml
Reading from Protobuf starting at maps in base directory /Users/joshsh/projects/uber/example/idl
Found 3 *.proto sources in /Users/joshsh/projects/uber/example/idl/maps
Visiting 3 files (total of 0 so far):
/Users/joshsh/projects/uber/example/idl/maps/coordinates.proto
/Users/joshsh/projects/uber/example/idl/maps/geometry.proto
/Users/joshsh/projects/uber/example/idl/maps/position.proto
Visiting 3 files (total of 3 so far):
/Users/joshsh/projects/uber/example/idl/physics/units.proto
/Users/joshsh/projects/uber/example/idl/time/duration.proto
/Users/joshsh/projects/uber/example/idl/time/epoch.proto
Instantiated a graph of 12 schemas
Note 1 alert:
info: unsigned integers not supported for RDF targets. Using signed integers
Writing 12 SHACL artifacts to /tmp/shacl
Use cases
JSON-to-Graph
○ Graph data formats are used nowhere in the enterprise
○ RDF serialization formats, GraphML, etc. are a hard sell
○ JSON and YAML are used everywhere
○ Give the JSON or YAML a schema, then treat it like a serialized graph
○ E.g. Databook [1] snapshots to RDF
[1] https://eng.uber.com/metadata-insights-databook
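One way to read "give the JSON a schema, then treat it like a serialized graph" is sketched below in Python. The schema format, the `Dataset` type, and the field names are invented for illustration, and the output is a list of generic subject/predicate/object tuples rather than real RDF with IRIs.

```python
import json
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical schema: which field identifies the node, which fields are
# plain properties, and which fields point to other nodes.
SCHEMA = {
    "Dataset": {
        "id": "name",
        "properties": ["description"],
        "edges": {"owner": "User"},
    },
}


def to_triples(type_name: str, doc: dict) -> List[Triple]:
    spec = SCHEMA[type_name]
    subject = f"{type_name}/{doc[spec['id']]}"
    triples = [(subject, "type", type_name)]
    for field in spec["properties"]:
        if field in doc:
            # Property values are kept as JSON literals.
            triples.append((subject, field, json.dumps(doc[field])))
    for field, target_type in spec["edges"].items():
        if field in doc:
            # Edge values become references to typed nodes.
            triples.append((subject, field, f"{target_type}/{doc[field]}"))
    return triples


doc = json.loads('{"name": "trips", "owner": "alice", "description": "Completed trips"}')
triples = to_triples("Dataset", doc)
```

Without the schema, `"owner": "alice"` is just a string; with it, the same field becomes an edge to a `User` node.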
gRPC-to-Graph
○ Don’t let developers write individual edges / triples to the graph
○ Schema constraints are harder to enforce, errors are harder to understand
○ Transform graph schemas into developer-facing schemas
○ Use familiar schema languages and protocols like Protobuf + gRPC
○ Transform payload messages into graph data
○ Schema and data transformations guarantee constraint satisfaction
Enriching domain schemas
○ Need a little more than “just Protobuf” / “just Thrift” to build a big graph
○ E.g. entity / identifier types need to be widely re-used
○ Start with a graph schema (e.g. in YAML)
○ Propagate logical data types into domain schema languages
○ At Uber: Protobuf, Thrift, Avro
○ Re-use the logical types in many domain schemas
○ Incentivize re-use, and use schema transformations for metrics
○ Now, graph-friendly types are everywhere
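Propagating logical types into a domain schema language can be as simple as generating wrapper definitions. The sketch below emits proto3 text from a small table of logical types; the type names, the package name, and the one-wrapper-message-per-type convention are assumptions for this example, not a description of how Dragon or Uber's tooling does it.

```python
from typing import Dict

# Hypothetical logical types from a graph schema: name -> primitive representation.
LOGICAL_TYPES: Dict[str, str] = {
    "UserId": "string",
    "TripId": "string",
    "EpochMillis": "int64",
}


def to_proto(types: Dict[str, str], package: str) -> str:
    lines = ['syntax = "proto3";', f"package {package};", ""]
    for name, primitive in sorted(types.items()):
        # One wrapper message per logical type, so that domain schemas
        # can reference the type by name instead of a bare primitive.
        lines += [f"message {name} {{", f"  {primitive} value = 1;", "}", ""]
    return "\n".join(lines)


proto = to_proto(LOGICAL_TYPES, "example.types")
```

A domain schema that declares a field of type `UserId` rather than `string` carries enough information to be mapped back into the graph later.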
Toward TinkerPop 4
Dozens of graph systems
○ Alibaba Graph Database
○ Amazon Neptune
○ ArangoDB
○ Bitsy
○ Blazegraph
○ CosmosDB
○ ChronoGraph
○ DSEGraph
○ GRAKN.AI
○ Hadoop (Spark)
○ HGraphDB
○ Huawei Graph Engine Service
○ IBM Graph
○ JanusGraph
○ Neo4j
○ neo4j-gremlin-bolt
○ OrientDB
○ Apache S2Graph
○ Sqlg
○ Stardog
○ TinkerGraph
○ Titan
○ Titan + Tupl
○ Unipop
Dozens of programming languages
○ Clojure: ogre
○ Cypher: cypher-for-gremlin
○ Elixir: gremlex
○ Go: grammes, gremgo
○ Haskell: greskell, gremlin-haskell
○ Java: Ferma, gremlin-objects, Peapod, spring-data-gremlin, gremlin-driver
○ JavaScript: gremlin-javascript, gremlin-orm, gremlin-template-string
○ Kotlin: kotlin-gremlin-ogm
○ .NET: Gremlin.Net, Gremlinq
○ PHP: gremlin-php
○ Python: Goblin, gremlin-python, gremlin-py, ipython-gremlin, gremlinclient, JUGRI, gremlinrestclient, python-gremlin-rest
○ Ruby: gremlin_client
○ Rust: gremlin-rs
○ Scala: gremlin-scala, reactive-gremlin, scalajs-gremlin-client
○ SPARQL: sparql-gremlin
○ SQL: sql-gremlin
○ TypeScript: ts-tinkerpop
Interoperability via mappings
○ We need parity across languages and systems
○ Make it easier to create new Gremlin language variants in a consistent way
○ “Escape from the JVM”
○ Code generation can help
○ Generate graph APIs consistently into many programming languages
○ Generate grammars for parsing/validation
A unifying schema language
○ A few vendor-specific schema languages exist
○ Weak and disconnected from each other
○ Also note TinkerPop’s Graph.Features
○ Need a real, vendor-neutral schema language
○ Facilitate composability of data and queries
○ Enable inference for type safety and optimizations
○ Propagate logical schemas into multiple graph back-ends
○ Build vendor-agnostic tools for data migration
Serialization formats for graphs
○ Currently serialization options are limited
○ GraphML (XML), GraphSON (JSON), GraphBinary, Gryo (Kryo)
○ Generic JSON and YAML are straightforward
○ What about common RPC formats?
○ Protobuf, Thrift, Avro, etc.
○ What about RDF serialization formats?
○ N-Triples, Turtle, JSON-LD, etc.
○ Others?
○ Parquet? CBOR? FlatBuffers? MessagePack? Etc.
Stay tuned
○ “How to Build a Dragon”
○ https://www.meetup.com/Category-Theory
○ Release Dragon!
○ http://bit.ly/release_dragon
○ Gremlin-users
○ https://groups.google.com/g/gremlin-users
@KGConference
linkedin.com/company/the-knowldge-graph-conference/
youtube.com/playlist?list=PLAiy7NYe9U2Gjg-600CTV1HGypiF95d_D
@joshsh
Proprietary © 2021 Uber Technologies, Inc. All rights reserved. No part of this document may be reproduced or utilized in any
form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval
systems, without permission in writing from Uber. This document is intended only for the use of the individual or entity to
whom it is addressed and contains information that is privileged, confidential or otherwise exempt from disclosure under
applicable law. All recipients of this document are notified that the information contained herein includes proprietary and
confidential information of Uber, and recipient may not make use of, disseminate, or in any way disclose this document or any
of the enclosed information to any person other than employees of addressee to the extent necessary for consultations with
authorized personnel of Uber.

Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»НАДІЯ ФЕДЮШКО БАЦ  «Професійне зростання QA спеціаліста»
НАДІЯ ФЕДЮШКО БАЦ «Професійне зростання QA спеціаліста»
 
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Ransomware Mallox [EN].pdf
Ransomware         Mallox       [EN].pdfRansomware         Mallox       [EN].pdf
Ransomware Mallox [EN].pdf
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 

Anything-to-Graph

  • 1. Anything-to-Graph Knowledge Graph Conference May 2021 Joshua Shinavier, PhD ( : joshsh) Data @
  • 2. Overview ○ Building graphs ○ Models ○ Mappings ○ Use cases ○ TinkerPop 4
  • 4. There are graphs, and graphs ○ Domain-specific graphs are simplest ○ ETL a few data sources into a graph ○ Mappings can be written and maintained by hand ○ Off-the-shelf tools work well ○ Difficulty increases with ○ Complexity of source schemas ○ Diversity of ownership, quality, and governance in data sources ○ Lack of standardization on languages and vocabulary ○ Some challenges are organizational, others technical ○ Organizational: see Lessons From Reality (KGC 2019) ○ Technical: let’s take a closer look
  • 5. The challenges of heterogeneity ○ Diverse data sources → need supporting metadata ○ E.g. see Uber’s Databook ○ Diverse data governance → need better data isolation ○ Typically, RDF triple stores do better than PG databases here ○ Diverse domain data models → need standardized schemas ○ E.g. see Uber’s Data Standardization ○ Diverse schema and data languages → need well-defined mappings ○ Enter the Dragon
  • 6. Prerequisites ○ Data must conform to a schema ○ It is OK to have a mix of schemas, and schema languages ○ Unique identifiers must be clearly distinguished, and typed ○ What is this UUID field? Does it identify a User, a Document, etc.? ○ Some degree of standardization is needed ○ Are identifiers consistent across data sources? ○ Can this timestamp value be compared with that timestamp value? Etc. ○ Need well-defined mappings ○ From each data language into a graph format ○ From each schema language into a graph schema language ○ ...without losing too much information ○ ...and while maintaining consistency between schema and data
  • 8. Is there a “universal data model” for graphs? ○ Desirable characteristics ○ Centrality: ease of alignment with other graph and non-graph models ○ Flexibility: captures a wide variety of graph structures ○ Formality: serves as a tool for reasoning about data models ○ Practicality: serves as a basis for inference, query optimization, etc. ○ Intuitiveness: a good graph schema captures our mental model ○ Where can the right abstractions be found? ○ Logics? Set theory? Algebras? Category theory? ○ Does the data model already exist? ○ What about RDF-based languages (RDFS, OWL, SHACL, ShEx, etc.)? ○ Is that what GQL is going to be?
  • 9. Graph features are very diverse [Figure: spreadsheet of graph features by Gábor Szárnyas]
  • 10. Complexity is not the answer ○ No one language will ever satisfy all graph use cases ○ Implementing the model must be simple, or developers won’t ○ Who remembers CODASYL? Why aren’t we using it? ○ Go for minimalism + extensibility ○ Dimensions being explored for GQL ○ Mathematical foundations ○ Nominal vs. structural typing ○ Inheritance vs. composition ○ Graph element types vs. property types ○ Descriptive vs. prescriptive schemas
  • 11. A little algebra goes a long way ○ Algebraic Property Graphs [1] ○ Pragmatic and opinionated graph data model ○ Schemas are vertex/edge/etc. labels bound to algebraic data types ○ Formally defined using category theory ○ Effectively a subset of SHACL ○ Ideally, GQL will also be a superset [1] https://arxiv.org/abs/1909.04881
  • 13. Desirable properties ○ Composability ○ If we have f : A → B and g : B → C, we should also have f;g : A → C ○ Bidirectionality ○ If we have f : A → B, we should also have f⁻¹ : B → A, with f;f⁻¹ ≅ f⁻¹;f ≅ id ○ Data : schema consistency ○ Mappings should be defined for data and schema languages in parallel ○ If data d conforms to schema s, then we need f(d) to conform to f(s)
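The composability and bidirectionality properties above can be illustrated with a minimal Python sketch. The `Mapping` class and the two toy mappings are hypothetical names made up for this example; they are not part of Dragon or of any library mentioned in the talk.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


@dataclass(frozen=True)
class Mapping(Generic[A, B]):
    """A mapping f : A -> B packaged with its inverse (hypothetical helper)."""
    forward: Callable[[A], B]
    backward: Callable[[B], A]

    def then(self, other: "Mapping[B, C]") -> "Mapping[A, C]":
        """Diagrammatic composition f;g : A -> C."""
        return Mapping(
            forward=lambda a: other.forward(self.forward(a)),
            backward=lambda c: self.backward(other.backward(c)),
        )

    def inverse(self) -> "Mapping[B, A]":
        """The inverse mapping, obtained by swapping the two directions."""
        return Mapping(forward=self.backward, backward=self.forward)


# Toy mappings (assumptions for illustration): tuple <-> dict <-> sorted pairs.
pair_to_dict = Mapping(
    forward=lambda p: {"lat": p[0], "lon": p[1]},
    backward=lambda d: (d["lat"], d["lon"]),
)
dict_to_items = Mapping(
    forward=lambda d: sorted(d.items()),
    backward=lambda items: dict(items),
)

# Compose the two mappings and check that the round trip is the identity.
pair_to_items = pair_to_dict.then(dict_to_items)
point = (37.77, -122.42)
encoded = pair_to_items.forward(point)
assert pair_to_items.backward(encoded) == point
assert pair_to_items.inverse().forward(encoded) == point
```

Keeping both directions in one object means the inverse of a composite comes for free, which is one reason to prefer mappings that compose.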
  • 14. Topology of mappings ○ Arbitrary / ad-hoc topology ○ Pros: seems easiest; just use the pairwise mappings you have ○ Cons: more indirection, and mappings may not compose well ○ Star topology ○ Pros: mappings compose well, and all paths are short ○ Cons: need to define/identify a central data model (the “dragon”)
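To make the star topology concrete, here is a small Python sketch in which every format registers one mapping into a hub representation and one mapping out of it, and any pairwise translation is derived by composition. The hub type, the format names, and the `register`/`translate` helpers are all assumptions for illustration, not Dragon's actual design.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical hub representation: a sorted list of (field, value) pairs.
Hub = List[Tuple[str, str]]

_to_hub: Dict[str, Callable[[Any], Hub]] = {}
_from_hub: Dict[str, Callable[[Hub], Any]] = {}


def register(name: str, to_hub: Callable[[Any], Hub],
             from_hub: Callable[[Hub], Any]) -> None:
    """Register a format by its two mappings to and from the hub."""
    _to_hub[name] = to_hub
    _from_hub[name] = from_hub


def translate(source: str, target: str, value: Any) -> Any:
    """Every path between two formats has length two: source -> hub -> target."""
    return _from_hub[target](_to_hub[source](value))


register("dict", lambda d: sorted(d.items()), dict)
register(
    "kv_string",
    lambda s: sorted(tuple(part.split("=", 1)) for part in s.split(",")),
    lambda pairs: ",".join(f"{k}={v}" for k, v in pairs),
)
register("tuples", lambda t: sorted(t), tuple)


def pairwise_mappings(n: int) -> int:
    """Directed mappings needed if every pair of n formats is mapped directly."""
    return n * (n - 1)


def star_mappings(n: int) -> int:
    """Directed mappings needed if n formats each map to and from one hub."""
    return 2 * n


as_dict = translate("kv_string", "dict", "lon=2,lat=1")
as_string = translate("dict", "kv_string", {"lat": "1", "lon": "2"})
```

With ten formats, the pairwise approach needs 90 directed mappings while the star needs 20, which is the practical argument for investing in a central model.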
  • 15. Transforming schemas and data ○ Schema-level mappings ○ Transformations on data types ○ Data-level mappings ○ Transformations on instances of data types ○ Schema and data-level mappings must be consistent ○ a :: A ⇒ F(a) :: F(A) ○ Use common intermediate representations
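The consistency condition a :: A ⇒ F(a) :: F(A) can be checked mechanically. The sketch below uses a toy source language with two primitive types and a toy target that stores values as lexical strings typed with XSD datatype names; the type tables and function names are assumptions for illustration only.

```python
# Schema-level mapping: source type names -> target type names (toy tables).
SOURCE_TO_TARGET_TYPE = {"int": "xsd:integer", "string": "xsd:string"}

# Conformance checks for values in the source and target languages.
SOURCE_CHECKS = {
    "int": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
}
TARGET_CHECKS = {
    "xsd:integer": lambda lit: isinstance(lit, str) and lit.lstrip("-").isdigit(),
    "xsd:string": lambda lit: isinstance(lit, str),
}


def map_schema(schema):
    """F on schemas: translate each field's type."""
    return {field: SOURCE_TO_TARGET_TYPE[t] for field, t in schema.items()}


def map_data(record):
    """F on data: encode each value as a lexical string."""
    return {field: str(value) for field, value in record.items()}


def conforms(record, schema, checks):
    """True if the record has exactly the schema's fields, each well-typed."""
    return record.keys() == schema.keys() and all(
        checks[schema[field]](value) for field, value in record.items()
    )


schema = {"id": "int", "name": "string"}
record = {"id": 42, "name": "Ada"}

# If the record conforms to the schema, its image must conform to the image schema.
assert conforms(record, schema, SOURCE_CHECKS)
assert conforms(map_data(record), map_schema(schema), TARGET_CHECKS)
```

A property-based test that generates random conforming records and asserts the second condition is one way to keep the two mappings from drifting apart.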
  • 16. Dragon ○ Framework for data and schema migration ○ Developed for Uber’s Data Standardization ○ Dragon data model ○ Extension of Algebraic Property Graphs ○ Covers the majority of schemas at Uber ○ Sources and targets ○ RPC / interface languages: Protocol Buffers, Apache Thrift, Apache Avro ○ RDF-based languages: SHACL, OWL ○ “Schemaless” formats: YAML, JSON ○ Programming languages: Haskell, Java, Scala ○ Implementations in Haskell and Java ○ Dragon generates over half of its own source code (from YAML)
  • 17. Console session:
    $ dragon transform -i Protobuf -o SHACL $SCHEMAS -d maps /tmp/shacl
    Dragon 0.3.4
    Loading configuration from dragon.yaml
    Reading from Protobuf starting at maps in base directory /Users/joshsh/projects/uber/example/idl
    Found 3 *.proto sources in /Users/joshsh/projects/uber/example/idl/maps
    Visiting 3 files (total of 0 so far):
      /Users/joshsh/projects/uber/example/idl/maps/coordinates.proto
      /Users/joshsh/projects/uber/example/idl/maps/geometry.proto
      /Users/joshsh/projects/uber/example/idl/maps/position.proto
    Visiting 3 files (total of 3 so far):
      /Users/joshsh/projects/uber/example/idl/physics/units.proto
      /Users/joshsh/projects/uber/example/idl/time/duration.proto
      /Users/joshsh/projects/uber/example/idl/time/epoch.proto
    Instantiated a graph of 12 schemas
    Note 1 alert:
      info: unsigned integers not supported for RDF targets. Using signed integers
    Writing 12 SHACL artifacts to /tmp/shacl
  • 19. JSON-to-Graph ○ Graph data formats are used nowhere in the enterprise ○ RDF serialization formats, GraphML, etc. are a hard sell ○ JSON and YAML are used everywhere ○ Give the JSON or YAML a schema, then treat it like a serialized graph ○ E.g. Databook [1] snapshots to RDF [1] https://eng.uber.com/metadata-insights-databook
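As a rough illustration of "give the JSON a schema, then treat it like a serialized graph", the Python sketch below walks a JSON document under a small hand-written schema and emits subject/predicate/object triples. The schema layout, the `Dataset`/`Column` types, and the identifier scheme are invented for this example; they do not reflect Databook's real metadata model.

```python
import json

# Hypothetical schema: which field identifies each type, which fields are
# plain properties, and which fields are edges to other typed objects.
SCHEMA = {
    "Dataset": {
        "id_field": "name",
        "properties": ["owner_team"],
        "edges": {"columns": "Column"},
    },
    "Column": {"id_field": "name", "properties": ["data_type"], "edges": {}},
}


def to_triples(type_name, obj, schema=SCHEMA):
    """Read a JSON object as a graph fragment, guided by the schema."""
    spec = schema[type_name]
    # Typed identifier: the type name disambiguates what the id refers to.
    subject = f"{type_name}:{obj[spec['id_field']]}"
    triples = [(subject, "rdf:type", type_name)]
    for prop in spec["properties"]:
        if prop in obj:
            triples.append((subject, prop, obj[prop]))
    for field, target_type in spec["edges"].items():
        for child in obj.get(field, []):
            child_id = f"{target_type}:{child[schema[target_type]['id_field']]}"
            triples.append((subject, field, child_id))
            triples.extend(to_triples(target_type, child, schema))
    return triples


doc = json.loads(
    '{"name": "trips", "owner_team": "maps",'
    ' "columns": [{"name": "city_id", "data_type": "int"}]}'
)
triples = to_triples("Dataset", doc)
```

In this toy version column identifiers are not qualified by their dataset, so a real mapping would need a more careful identifier scheme.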
  • 20. gRPC-to-Graph ○ Don’t let developers write individual edges / triples to the graph ○ Schema constraints are harder to enforce, errors are harder to understand ○ Transform graph schemas into developer-facing schemas ○ Use familiar schema languages and protocols like Protobuf + gRPC ○ Transform payload messages into graph data ○ Schema and data transformations guarantee constraint satisfaction
  • 21. Enriching domain schemas ○ Need a little more than “just Protobuf” / “just Thrift” to build a big graph ○ E.g. entity / identifier types need to be widely re-used ○ Start with a graph schema (e.g. in YAML) ○ Propagate logical data types into domain schema languages ○ At Uber: Protobuf, Thrift, Avro ○ Re-use the logical types in many domain schemas ○ Incentivize re-use, and use schema transformations for metrics ○ Now, graph-friendly types are everywhere
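Propagating logical types into a domain schema language can be as simple as code generation. The following sketch renders a few hypothetical identifier types as proto3 wrapper messages; the type names, the dictionary layout, and the `to_proto` function are assumptions, and the slides do not show how Dragon itself generates Protobuf.

```python
# Hypothetical logical types, e.g. as they might be loaded from a YAML file.
LOGICAL_TYPES = {
    "UserId": {"base": "string", "description": "Identifies a User"},
    "TripId": {"base": "string", "description": "Identifies a Trip"},
}


def to_proto(types):
    """Render each logical type as a single-field proto3 wrapper message."""
    lines = ['syntax = "proto3";', ""]
    for name, spec in sorted(types.items()):
        lines.append(f"// {spec['description']}")
        lines.append(f"message {name} {{")
        lines.append(f"  {spec['base']} value = 1;")
        lines.append("}")
        lines.append("")
    return "\n".join(lines)


proto_source = to_proto(LOGICAL_TYPES)
print(proto_source)
```

Because a field typed `UserId` says what the identifier refers to, domain schemas that reuse these messages answer the "what is this UUID field?" question at the schema level.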
  • 23. Dozens of graph systems ○ Alibaba Graph Database ○ Amazon Neptune ○ ArangoDB ○ Bitsy ○ Blazegraph ○ CosmosDB ○ ChronoGraph ○ DSEGraph ○ GRAKN.AI ○ Hadoop (Spark) ○ HGraphDB ○ Huawei Graph Engine Service ○ IBM Graph ○ JanusGraph ○ Neo4j ○ neo4j-gremlin-bolt ○ OrientDB ○ Apache S2Graph ○ Sqlg ○ Stardog ○ TinkerGraph ○ Titan ○ Titan + Tupl ○ Unipop
  • 24. Dozens of programming languages ○ Clojure: ogre ○ Cypher: cypher-for-gremlin ○ Elixir: gremlex ○ Go: grammes, gremgo ○ Haskell: greskell, gremlin-haskell ○ Java: Ferma, gremlin-objects, Peapod, spring-data-gremlin, gremlin-driver ○ JavaScript: gremlin-javascript, gremlin-orm, gremlin-template-string ○ Kotlin: kotlin-gremlin-ogm ○ .NET: Gremlin.Net, Gremlinq ○ PHP: gremlin-php ○ Python: Goblin, gremlin-python, gremlin-py, ipython-gremlin, gremlinclient, gremlin-python, JUGRI, gremlinrestclient, python-gremlin-rest ○ Ruby: gremlin_client ○ Rust: gremlin-rs ○ Scala: gremlin-scala, reactive-gremlin, scalajs-gremlin-client ○ SPARQL: sparql-gremlin ○ SQL: sql-gremlin ○ TypeScript: ts-tinkerpop
  • 25. Interoperability via mappings ○ We need parity across languages and systems ○ Make it easier to create new Gremlin language variants in a consistent way ○ “Escape from the JVM” ○ Code generation can help ○ Generate graph APIs consistently into many programming languages ○ Generate grammars for parsing/validation
  • 26. A unifying schema language ○ A few vendor-specific schema languages exist ○ Weak and disconnected from each other ○ Also note TinkerPop’s Graph.Features ○ Need a real, vendor-neutral schema language ○ Facilitate composability of data and queries ○ Enable inference for type safety and optimizations ○ Propagate logical schemas into multiple graph back-ends ○ Build vendor-agnostic tools for data migration
  • 27. Serialization formats for graphs ○ Currently serialization options are limited ○ GraphML (XML), GraphSON (JSON), GraphBinary, Gryo (Kryo) ○ Generic JSON and YAML are straightforward ○ What about common RPC formats? ○ Protobuf, Thrift, Avro, etc. ○ What about RDF serialization formats? ○ N-Triples, Turtle, JSON-LD, etc. ○ Others? ○ Parquet? CBOR? FlatBuffers? MessagePack? Etc.
  • 28. Stay tuned ○ “How to Build a Dragon” ○ https://www.meetup.com/Category-Theory ○ Release Dragon! ○ http://bit.ly/release_dragon ○ Gremlin-users ○ https://groups.google.com/g/gremlin-users @KGConference linkedin.com/company/the-knowldge-graph-conference/ youtube.com/playlist?list=PLAiy7NYe9U2Gjg-600CTV1HGypiF95d_D @joshsh
  • 29. Proprietary © 2021 Uber Technologies, Inc. All rights reserved. No part of this document may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval systems, without permission in writing from Uber. This document is intended only for the use of the individual or entity to whom it is addressed and contains information that is privileged, confidential or otherwise exempt from disclosure under applicable law. All recipients of this document are notified that the information contained herein includes proprietary and confidential information of Uber, and recipient may not make use of, disseminate, or in any way disclose this document or any of the enclosed information to any person other than employees of addressee to the extent necessary for consultations with authorized personnel of Uber.