SlideShare a Scribd company logo
Gremlins for Amundsen
08/13/2020
p3
Content
p6
p13
p16
Gremlin Introduction
Amundsen Gremlin Overview
Lessons Learned
Upstream Plan
2
Gremlin Introduction
Gremlin Introduction
Image
● Graph traversal language
● Apache Tinkerpop
● Vertexes and Edges
● Widely supported
4
● Sample queries
g.V().hasLabel('airport').has('code','DFW')
g.V().has(Table, ‘key’, table_uri)
.outE().inV().hasLabel(Column).as_('column')
● Curious to know more? See
Practical Gremlin
Gremlin Introduction
● Cypher equivalent
MATCH (Airport {code: DFW})
MATCH (Table {key: $table_uri})
-[:COLUMN]->(column:Column)
5
Amundsen Gremlin
Overview
Amundsen Gremlin Overview
Image
● Why build this?
○ Hosted graph
○ Online backups
○ Proxy is platform-agnostic*
7
● Amundsen
8
Postgres Hive Redshift ... Presto
Github
Source
File
Databuilder Crawler
AWS Neptune
Elastic
Search
Metadata Service Search Service
Frontend Service
Metadata Sources
Gremlin shared code
Amundsen Gremlin Overview
● Gremlin shared code
9
Postgres Hive Redshift ... Presto
Github
Source
File
Databuilder Crawler
AWS Neptune
Elastic
Search
Metadata Service Search Service
Frontend Service
Metadata Sources
Gremlin shared code
Amundsen Gremlin Overview
● Metadata service
○ Gremlin proxy
10
Postgres Hive Redshift ... Presto
Github
Source
File
Databuilder Crawler
AWS Neptune
Elastic
Search
Metadata Service Search Service
Frontend Service
Metadata Sources
Gremlin shared code
Amundsen Gremlin Overview
11
Postgres Hive Redshift ... Presto
Github
Source
File
Databuilder Crawler
AWS Neptune
Elastic
Search
Metadata Service Search Service
Frontend Service
Metadata Sources
Gremlin shared code
● Abstract proxy tests
○ Construct one case, test against
every* proxy
def test_rt_table(self) -> None:
expected = Fixtures.next_table()
self.get_proxy().put_table(table=expected)
actual: Table = self.get_proxy().get_table(table_uri=expected.key)
self.assertEqual(expected, actual)
Amundsen Gremlin Overview
● Databuilder
12
Postgres Hive Redshift ... Presto
Github
Source
File
Databuilder Crawler
AWS Neptune
Elastic
Search
Metadata Service Search Service
Frontend Service
Metadata Sources
Gremlin shared code
Amundsen Gremlin Overview
Lessons Learned
Lessons Learned
Image
● Failed experiments
○ Transactional gremlin for writes:
■ V only once - prefer V(id)
● g.V(id1).as_('one').V(id2).addE(label).from_('one')
■ Smaller traversals are better
■ Minimize coalesce() in write
14
Lessons Learned
Image
● Failed experiments (cont)
○ SessionedClient
○ AWS Lambda write from Kinesis
15
Upstream Plan
Upstream Plan
TODAY
Internal refactoring
Consolidation of gremlin code into new shared
amundsen-gremlin repository. Databuilder and
metadata service will utilize the shared code.
Approx. August 17
Stabilization
Improve stability/performance of existing gremlin
code
Approx. August 7
Ship to amundsen
Clean up square-specific bits of amundsen-gremlin,
publish. Publish proxy and proxy tests utilizing
amundsen-gremlin
Approx. August 21
17
Thank you
Kudos to the rest of the Privacy Engineering team
at Square who worked on this - Dan Simms, Alyssa
Ransbury, Sarah Harvey, and Kat Hawthorne
Questions?
See also: amundsen-io issue 526

More Related Content

What's hot

Easy Blogging With Emacs -- Cheatsheet
Easy Blogging With Emacs -- CheatsheetEasy Blogging With Emacs -- Cheatsheet
Easy Blogging With Emacs -- CheatsheetDashamir Hoxha
 
A* Algorithm
A* AlgorithmA* Algorithm
A* Algorithm
maharajdey
 
Mini project title prime number generator
Mini project title prime number generatorMini project title prime number generator
Mini project title prime number generator
Md. Asad Chowdhury Dipu
 
Bayesian learning
Bayesian learning Bayesian learning
Bayesian learning
EngReads
 
Two C++ Tools: Compiler Explorer and Cpp Insights
Two C++ Tools: Compiler Explorer and Cpp InsightsTwo C++ Tools: Compiler Explorer and Cpp Insights
Two C++ Tools: Compiler Explorer and Cpp Insights
Alison Chaiken
 
Asp.net Core 5 and C# 9 - #SnetTalks2
Asp.net Core 5 and C# 9 - #SnetTalks2Asp.net Core 5 and C# 9 - #SnetTalks2
Asp.net Core 5 and C# 9 - #SnetTalks2
André Agostinho
 
Lecture 1 python arithmetic (ewurc)
Lecture 1 python arithmetic (ewurc)Lecture 1 python arithmetic (ewurc)
Lecture 1 python arithmetic (ewurc)
Al-Mamun Riyadh (Mun)
 
Tilemill gwu-wboykinm
Tilemill gwu-wboykinmTilemill gwu-wboykinm
Tilemill gwu-wboykinm
Bill Morris
 
Tower of HANOI
Tower of HANOITower of HANOI
Tower of HANOI
Er. Ganesh Ram Suwal
 
Write a program to show the example of hybrid inheritance.
Write a program to show the example of hybrid inheritance.Write a program to show the example of hybrid inheritance.
Write a program to show the example of hybrid inheritance.
Mumbai B.Sc.IT Study
 
Presentation2
Presentation2Presentation2
Presentation2chian2208
 
Presentation2
Presentation2Presentation2
Presentation2chian2208
 
Bcsl 033 data and file structures lab s5-3
Bcsl 033 data and file structures lab s5-3Bcsl 033 data and file structures lab s5-3
Bcsl 033 data and file structures lab s5-3
Dr. Loganathan R
 
Write a program to perform translation
Write a program to perform translationWrite a program to perform translation
Write a program to perform translation
Shobhit Saxena
 
sin cos & tan (Plot using MATLAB)
sin cos  & tan (Plot using MATLAB)sin cos  & tan (Plot using MATLAB)
sin cos & tan (Plot using MATLAB)
shivam choubey
 

What's hot (16)

Easy Blogging With Emacs -- Cheatsheet
Easy Blogging With Emacs -- CheatsheetEasy Blogging With Emacs -- Cheatsheet
Easy Blogging With Emacs -- Cheatsheet
 
A* Algorithm
A* AlgorithmA* Algorithm
A* Algorithm
 
Mini project title prime number generator
Mini project title prime number generatorMini project title prime number generator
Mini project title prime number generator
 
Bayesian learning
Bayesian learning Bayesian learning
Bayesian learning
 
Two C++ Tools: Compiler Explorer and Cpp Insights
Two C++ Tools: Compiler Explorer and Cpp InsightsTwo C++ Tools: Compiler Explorer and Cpp Insights
Two C++ Tools: Compiler Explorer and Cpp Insights
 
Asp.net Core 5 and C# 9 - #SnetTalks2
Asp.net Core 5 and C# 9 - #SnetTalks2Asp.net Core 5 and C# 9 - #SnetTalks2
Asp.net Core 5 and C# 9 - #SnetTalks2
 
Lecture 1 python arithmetic (ewurc)
Lecture 1 python arithmetic (ewurc)Lecture 1 python arithmetic (ewurc)
Lecture 1 python arithmetic (ewurc)
 
Tilemill gwu-wboykinm
Tilemill gwu-wboykinmTilemill gwu-wboykinm
Tilemill gwu-wboykinm
 
Tower of HANOI
Tower of HANOITower of HANOI
Tower of HANOI
 
Write a program to show the example of hybrid inheritance.
Write a program to show the example of hybrid inheritance.Write a program to show the example of hybrid inheritance.
Write a program to show the example of hybrid inheritance.
 
Presentation2
Presentation2Presentation2
Presentation2
 
Presentation2
Presentation2Presentation2
Presentation2
 
Bcsl 033 data and file structures lab s5-3
Bcsl 033 data and file structures lab s5-3Bcsl 033 data and file structures lab s5-3
Bcsl 033 data and file structures lab s5-3
 
Write a program to perform translation
Write a program to perform translationWrite a program to perform translation
Write a program to perform translation
 
sin cos & tan (Plot using MATLAB)
sin cos  & tan (Plot using MATLAB)sin cos  & tan (Plot using MATLAB)
sin cos & tan (Plot using MATLAB)
 
Block diagram design
Block diagram designBlock diagram design
Block diagram design
 

Similar to Amundsen gremlin proxy design

AfterGlow
AfterGlowAfterGlow
AfterGlow
Raffael Marty
 
Cypher for Gremlin
Cypher for GremlinCypher for Gremlin
Cypher for Gremlin
openCypher
 
openCypher: Naming and Addressing Multiple Graphs
openCypher: Naming and Addressing Multiple GraphsopenCypher: Naming and Addressing Multiple Graphs
openCypher: Naming and Addressing Multiple Graphs
openCypher
 
Advanced Cartography for the Web
Advanced Cartography for the WebAdvanced Cartography for the Web
Advanced Cartography for the Web
FOSS4G 2011
 
Deep_dive_on_Amazon_Neptune_DAT361.pdf
Deep_dive_on_Amazon_Neptune_DAT361.pdfDeep_dive_on_Amazon_Neptune_DAT361.pdf
Deep_dive_on_Amazon_Neptune_DAT361.pdf
ShaikAsif83
 
Skia & Freetype - Android 2D Graphics Essentials
Skia & Freetype - Android 2D Graphics EssentialsSkia & Freetype - Android 2D Graphics Essentials
Skia & Freetype - Android 2D Graphics Essentials
Kyungmin Lee
 
Go. Why it goes
Go. Why it goesGo. Why it goes
Go. Why it goes
Sergey Pichkurov
 
Maintainable go
Maintainable goMaintainable go
Maintainable go
Yu-Shuan Hsieh
 
MapServer #ProTips 2015
MapServer #ProTips 2015MapServer #ProTips 2015
MapServer #ProTips 2015
Jeff McKenna
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Altinity Ltd
 
Multiple graphs in openCypher
Multiple graphs in openCypherMultiple graphs in openCypher
Multiple graphs in openCypher
openCypher
 
Loom & Functional Graphs in Clojure @ LambdaConf 2015
Loom & Functional Graphs in Clojure @ LambdaConf 2015Loom & Functional Graphs in Clojure @ LambdaConf 2015
Loom & Functional Graphs in Clojure @ LambdaConf 2015
Aysylu Greenberg
 
Loom and Graphs in Clojure
Loom and Graphs in ClojureLoom and Graphs in Clojure
Loom and Graphs in Clojure
Aysylu Greenberg
 
Loom at Clojure/West
Loom at Clojure/WestLoom at Clojure/West
Loom at Clojure/West
Aysylu Greenberg
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Altinity Ltd
 
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward
 
Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019
Thomas Weise
 
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward
 
Airframe Meetup #3: 2019 Updates & AirSpec
Airframe Meetup #3: 2019 Updates & AirSpecAirframe Meetup #3: 2019 Updates & AirSpec
Airframe Meetup #3: 2019 Updates & AirSpec
Taro L. Saito
 
不深不淺,帶你認識 LLVM (Found LLVM in your life)
不深不淺,帶你認識 LLVM (Found LLVM in your life)不深不淺,帶你認識 LLVM (Found LLVM in your life)
不深不淺,帶你認識 LLVM (Found LLVM in your life)
Douglas Chen
 

Similar to Amundsen gremlin proxy design (20)

AfterGlow
AfterGlowAfterGlow
AfterGlow
 
Cypher for Gremlin
Cypher for GremlinCypher for Gremlin
Cypher for Gremlin
 
openCypher: Naming and Addressing Multiple Graphs
openCypher: Naming and Addressing Multiple GraphsopenCypher: Naming and Addressing Multiple Graphs
openCypher: Naming and Addressing Multiple Graphs
 
Advanced Cartography for the Web
Advanced Cartography for the WebAdvanced Cartography for the Web
Advanced Cartography for the Web
 
Deep_dive_on_Amazon_Neptune_DAT361.pdf
Deep_dive_on_Amazon_Neptune_DAT361.pdfDeep_dive_on_Amazon_Neptune_DAT361.pdf
Deep_dive_on_Amazon_Neptune_DAT361.pdf
 
Skia & Freetype - Android 2D Graphics Essentials
Skia & Freetype - Android 2D Graphics EssentialsSkia & Freetype - Android 2D Graphics Essentials
Skia & Freetype - Android 2D Graphics Essentials
 
Go. Why it goes
Go. Why it goesGo. Why it goes
Go. Why it goes
 
Maintainable go
Maintainable goMaintainable go
Maintainable go
 
MapServer #ProTips 2015
MapServer #ProTips 2015MapServer #ProTips 2015
MapServer #ProTips 2015
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
 
Multiple graphs in openCypher
Multiple graphs in openCypherMultiple graphs in openCypher
Multiple graphs in openCypher
 
Loom & Functional Graphs in Clojure @ LambdaConf 2015
Loom & Functional Graphs in Clojure @ LambdaConf 2015Loom & Functional Graphs in Clojure @ LambdaConf 2015
Loom & Functional Graphs in Clojure @ LambdaConf 2015
 
Loom and Graphs in Clojure
Loom and Graphs in ClojureLoom and Graphs in Clojure
Loom and Graphs in Clojure
 
Loom at Clojure/West
Loom at Clojure/WestLoom at Clojure/West
Loom at Clojure/West
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
 
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
 
Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019Streaming your Lyft Ride Prices - Flink Forward SF 2019
Streaming your Lyft Ride Prices - Flink Forward SF 2019
 
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
Flink Forward San Francisco 2019: Streaming your Lyft Ride Prices - Thomas We...
 
Airframe Meetup #3: 2019 Updates & AirSpec
Airframe Meetup #3: 2019 Updates & AirSpecAirframe Meetup #3: 2019 Updates & AirSpec
Airframe Meetup #3: 2019 Updates & AirSpec
 
不深不淺,帶你認識 LLVM (Found LLVM in your life)
不深不淺,帶你認識 LLVM (Found LLVM in your life)不深不淺,帶你認識 LLVM (Found LLVM in your life)
不深不淺,帶你認識 LLVM (Found LLVM in your life)
 

More from markgrover

From discovering to trusting data
From discovering to trusting dataFrom discovering to trusting data
From discovering to trusting data
markgrover
 
Amundsen lineage designs - community meeting, Dec 2020
Amundsen lineage designs - community meeting, Dec 2020 Amundsen lineage designs - community meeting, Dec 2020
Amundsen lineage designs - community meeting, Dec 2020
markgrover
 
Amundsen at Brex and Looker integration
Amundsen at Brex and Looker integrationAmundsen at Brex and Looker integration
Amundsen at Brex and Looker integration
markgrover
 
REA Group's journey with Data Cataloging and Amundsen
REA Group's journey with Data Cataloging and AmundsenREA Group's journey with Data Cataloging and Amundsen
REA Group's journey with Data Cataloging and Amundsen
markgrover
 
Amundsen: From discovering to security data
Amundsen: From discovering to security dataAmundsen: From discovering to security data
Amundsen: From discovering to security data
markgrover
 
Amundsen: From discovering to security data
Amundsen: From discovering to security dataAmundsen: From discovering to security data
Amundsen: From discovering to security data
markgrover
 
Data Discovery & Trust through Metadata
Data Discovery & Trust through MetadataData Discovery & Trust through Metadata
Data Discovery & Trust through Metadata
markgrover
 
Data Discovery and Metadata
Data Discovery and MetadataData Discovery and Metadata
Data Discovery and Metadata
markgrover
 
The Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futureThe Lyft data platform: Now and in the future
The Lyft data platform: Now and in the future
markgrover
 
Disrupting Data Discovery
Disrupting Data DiscoveryDisrupting Data Discovery
Disrupting Data Discovery
markgrover
 
TensorFlow Extension (TFX) and Apache Beam
TensorFlow Extension (TFX) and Apache BeamTensorFlow Extension (TFX) and Apache Beam
TensorFlow Extension (TFX) and Apache Beam
markgrover
 
Big Data at Speed
Big Data at SpeedBig Data at Speed
Big Data at Speed
markgrover
 
Near real-time anomaly detection at Lyft
Near real-time anomaly detection at LyftNear real-time anomaly detection at Lyft
Near real-time anomaly detection at Lyft
markgrover
 
Dogfooding data at Lyft
Dogfooding data at LyftDogfooding data at Lyft
Dogfooding data at Lyft
markgrover
 
Fighting cybersecurity threats with Apache Spot
Fighting cybersecurity threats with Apache SpotFighting cybersecurity threats with Apache Spot
Fighting cybersecurity threats with Apache Spot
markgrover
 
Fraud Detection with Hadoop
Fraud Detection with HadoopFraud Detection with Hadoop
Fraud Detection with Hadoop
markgrover
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
markgrover
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
markgrover
 
Architecting Applications with Hadoop
Architecting Applications with HadoopArchitecting Applications with Hadoop
Architecting Applications with Hadoop
markgrover
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
markgrover
 

More from markgrover (20)

From discovering to trusting data
From discovering to trusting dataFrom discovering to trusting data
From discovering to trusting data
 
Amundsen lineage designs - community meeting, Dec 2020
Amundsen lineage designs - community meeting, Dec 2020 Amundsen lineage designs - community meeting, Dec 2020
Amundsen lineage designs - community meeting, Dec 2020
 
Amundsen at Brex and Looker integration
Amundsen at Brex and Looker integrationAmundsen at Brex and Looker integration
Amundsen at Brex and Looker integration
 
REA Group's journey with Data Cataloging and Amundsen
REA Group's journey with Data Cataloging and AmundsenREA Group's journey with Data Cataloging and Amundsen
REA Group's journey with Data Cataloging and Amundsen
 
Amundsen: From discovering to security data
Amundsen: From discovering to security dataAmundsen: From discovering to security data
Amundsen: From discovering to security data
 
Amundsen: From discovering to security data
Amundsen: From discovering to security dataAmundsen: From discovering to security data
Amundsen: From discovering to security data
 
Data Discovery & Trust through Metadata
Data Discovery & Trust through MetadataData Discovery & Trust through Metadata
Data Discovery & Trust through Metadata
 
Data Discovery and Metadata
Data Discovery and MetadataData Discovery and Metadata
Data Discovery and Metadata
 
The Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futureThe Lyft data platform: Now and in the future
The Lyft data platform: Now and in the future
 
Disrupting Data Discovery
Disrupting Data DiscoveryDisrupting Data Discovery
Disrupting Data Discovery
 
TensorFlow Extension (TFX) and Apache Beam
TensorFlow Extension (TFX) and Apache BeamTensorFlow Extension (TFX) and Apache Beam
TensorFlow Extension (TFX) and Apache Beam
 
Big Data at Speed
Big Data at SpeedBig Data at Speed
Big Data at Speed
 
Near real-time anomaly detection at Lyft
Near real-time anomaly detection at LyftNear real-time anomaly detection at Lyft
Near real-time anomaly detection at Lyft
 
Dogfooding data at Lyft
Dogfooding data at LyftDogfooding data at Lyft
Dogfooding data at Lyft
 
Fighting cybersecurity threats with Apache Spot
Fighting cybersecurity threats with Apache SpotFighting cybersecurity threats with Apache Spot
Fighting cybersecurity threats with Apache Spot
 
Fraud Detection with Hadoop
Fraud Detection with HadoopFraud Detection with Hadoop
Fraud Detection with Hadoop
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
 
Architecting Applications with Hadoop
Architecting Applications with HadoopArchitecting Applications with Hadoop
Architecting Applications with Hadoop
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 

Recently uploaded

Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
StarCompliance.io
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
James Polillo
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
theahmadsaood
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Subhajit Sahu
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Linda486226
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
AlejandraGmez176757
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
enxupq
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
alex933524
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
ewymefz
 

Recently uploaded (20)

Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
Algorithmic optimizations for Dynamic Levelwise PageRank (from STICD) : SHORT...
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 

Amundsen gremlin proxy design

  • 2. p3 Content p6 p13 p16 Gremlin Introduction Amundsen Gremlin Overview Lessons Learned Upstream Plan 2
  • 4. Gremlin Introduction Image ● Graph traversal language ● Apache Tinkerpop ● Vertexes and Edges ● Widely supported 4
  • 5. ● Sample queries g.V().hasLabel('airport').has('code','DFW') g.V().has(Table, ‘key’, table_uri) .outE().inV().hasLabel(Column).as_('column') ● Curious to know more? See Practical Gremlin Gremlin Introduction ● Cypher equivalent MATCH (Airport {code: DFW}) MATCH (Table {key: $table_uri}) -[:COLUMN]->(column:Column) 5
  • 7. Amundsen Gremlin Overview Image ● Why build this? ○ Hosted graph ○ Online backups ○ Proxy is platform-agnostic* 7
  • 8. ● Amundsen 8 Postgres Hive Redshift ... Presto Github Source File Databuilder Crawler AWS Neptune Elastic Search Metadata Service Search Service Frontend Service Metadata Sources Gremlin shared code Amundsen Gremlin Overview
  • 9. ● Gremlin shared code 9 Postgres Hive Redshift ... Presto Github Source File Databuilder Crawler AWS Neptune Elastic Search Metadata Service Search Service Frontend Service Metadata Sources Gremlin shared code Amundsen Gremlin Overview
  • 10. ● Metadata service ○ Gremlin proxy 10 Postgres Hive Redshift ... Presto Github Source File Databuilder Crawler AWS Neptune Elastic Search Metadata Service Search Service Frontend Service Metadata Sources Gremlin shared code Amundsen Gremlin Overview
  • 11. 11 Postgres Hive Redshift ... Presto Github Source File Databuilder Crawler AWS Neptune Elastic Search Metadata Service Search Service Frontend Service Metadata Sources Gremlin shared code ● Abstract proxy tests ○ Construct one case, test against every* proxy def test_rt_table(self) -> None: expected = Fixtures.next_table() self.get_proxy().put_table(table=expected) actual: Table = self.get_proxy().get_table(table_uri=expected.key) self.assertEqual(expected, actual) Amundsen Gremlin Overview
  • 12. ● Databuilder 12 Postgres Hive Redshift ... Presto Github Source File Databuilder Crawler AWS Neptune Elastic Search Metadata Service Search Service Frontend Service Metadata Sources Gremlin shared code Amundsen Gremlin Overview
  • 14. Lessons Learned Image ● Failed experiments ○ Transactional gremlin for writes: ■ V only once - prefer V(id) ● g.V(id1).as_('one').V(id2).addE(label).from_('one') ■ Smaller traversals are better ■ Minimize coalesce() in write 14
  • 15. Lessons Learned Image ● Failed experiments (cont) ○ SessionedClient ○ AWS Lambda write from Kinesis 15
  • 17. Upstream Plan TODAY Internal refactoring Consolidation of gremlin code into new shared amundsen-gremlin repository. Databuilder and metadata service will utilize the shared code. Approx. August 17 Stabilization Improve stability/performance of existing gremlin code Approx. August 7 Ship to amundsen Clean up square-specific bits of amundsen-gremlin, publish. Publish proxy and proxy tests utilizing amundsen-gremlin Approx. August 21 17
  • 18. Thank you Kudos to the rest of the Privacy Engineering team at Square who worked on this - Dan Simms, Alyssa Ransbury, Sarah Harvey, and Kat Hawthorne