Amundsen gremlin proxy design
8. ● Amundsen
[Diagram: Amundsen Gremlin Overview. Components shown: Metadata Sources (Postgres, Hive, Redshift, ..., Presto, Github Source File), Databuilder Crawler, AWS Neptune, Elastic Search, Metadata Service, Search Service, Frontend Service, Gremlin shared code]
9. ● Gremlin shared code
10. ● Metadata service
○ Gremlin proxy
11. ● Abstract proxy tests
○ Construct one case, test against every* proxy
    def test_rt_table(self) -> None:
        # Round trip: write a table through the proxy, read it back, compare.
        expected = Fixtures.next_table()
        self.get_proxy().put_table(table=expected)
        actual: Table = self.get_proxy().get_table(table_uri=expected.key)
        self.assertEqual(expected, actual)
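The "one case, every proxy" idea can be sketched as a shared mixin of test cases plus one small subclass per proxy. This is a minimal illustration under stated assumptions, not the deck's actual test suite: the `Table` dataclass, `InMemoryProxy`, `AbstractProxyTest`, and the example table key are all hypothetical stand-ins, and only the `get_proxy()`, `put_table()`, and `get_table()` names come from the slide.

```python
import abc
import unittest
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for the real table model."""
    key: str
    description: str


class InMemoryProxy:
    """Hypothetical proxy that exists only to make the sketch runnable."""

    def __init__(self) -> None:
        self._tables: Dict[str, Table] = {}

    def put_table(self, *, table: Table) -> None:
        self._tables[table.key] = table

    def get_table(self, *, table_uri: str) -> Table:
        return self._tables[table_uri]


class AbstractProxyTest(abc.ABC):
    """Test cases written once; each subclass only supplies get_proxy().

    This class is deliberately not a unittest.TestCase, so test discovery
    never tries to run it without a concrete proxy.
    """

    @abc.abstractmethod
    def get_proxy(self):
        """Return the proxy under test (the same instance on every call)."""

    def test_rt_table(self) -> None:
        # Example key and description are made up for illustration.
        expected = Table(key="hive://gold.test_schema/test_table",
                         description="example table")
        self.get_proxy().put_table(table=expected)
        actual = self.get_proxy().get_table(table_uri=expected.key)
        self.assertEqual(expected, actual)


class InMemoryProxyTest(AbstractProxyTest, unittest.TestCase):
    """Concrete test class: inherits every shared case for one proxy."""

    def setUp(self) -> None:
        self._proxy = InMemoryProxy()

    def get_proxy(self) -> InMemoryProxy:
        return self._proxy
```

Supporting another proxy would then mean adding one more subclass with its own `get_proxy()`, while the shared cases stay untouched.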
12. ● Databuilder
14. Lessons Learned
● Failed experiments
○ Transactional gremlin for writes:
■ V only once - prefer V(id)
● g.V(id1).as_('one').V(id2).addE(label).from_('one')
■ Smaller traversals are better
■ Minimize coalesce() in write
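One possible reading of "smaller traversals are better" is to split a large batch of edge writes into short traversals, each of which uses the V(id) pattern from the slide. The sketch below assumes that reading. The `Edge` tuple shape, `chunked`, and `render_add_edges` are hypothetical helpers, and the traversal is rendered as Gremlin text only to make the shape visible; real code would send it through a Gremlin client with bound parameters rather than string formatting.

```python
from typing import Iterator, List, Sequence, Tuple

# Hypothetical edge shape: (from_vertex_id, to_vertex_id, edge_label).
Edge = Tuple[str, str, str]


def chunked(edges: Sequence[Edge], size: int) -> Iterator[List[Edge]]:
    """Yield consecutive slices of at most `size` edges."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(edges), size):
        yield list(edges[start:start + size])


def render_add_edges(batch: Sequence[Edge]) -> str:
    """Render one traversal, as Gremlin text, that adds every edge in `batch`.

    Each edge follows the slide's pattern: both endpoints are looked up
    directly with V(id), so no step scans all vertices.
    """
    steps = ["g"]
    for i, (from_id, to_id, label) in enumerate(batch):
        alias = f"v{i}"
        steps.append(
            f"V({from_id!r}).as({alias!r})"
            f".V({to_id!r}).addE({label!r}).from({alias!r})"
        )
    return ".".join(steps)


# Usage: three edges, at most two per traversal, gives two short traversals.
edges = [("t1", "c1", "COLUMN"), ("t1", "c2", "COLUMN"), ("t2", "c3", "COLUMN")]
traversals = [render_add_edges(batch) for batch in chunked(edges, 2)]
```

The chunk size here is arbitrary; the deck does not state what traversal size worked best.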
17. Upstream Plan
TODAY
Internal refactoring
Consolidation of Gremlin code into a new shared amundsen-gremlin repository. Databuilder and the metadata service will use the shared code.
Approx. August 17
Stabilization
Improve stability and performance of the existing Gremlin code.
Approx. August 7
Ship to Amundsen
Clean up Square-specific bits of amundsen-gremlin and publish it. Publish the proxy and proxy tests that use amundsen-gremlin.
Approx. August 21
18. Thank you
Kudos to the rest of the Privacy Engineering team at Square who worked on this: Dan Simms, Alyssa Ransbury, Sarah Harvey, and Kat Hawthorne