
On-Ramp to Graph Databases and Amazon Neptune (DAT335) - AWS re:Invent 2018

How do you get started with a graph database? Learn how to quickly and easily spin up a graph database and get started writing traversals over your connected data.

  1. On-Ramp to Graph Databases and Amazon Neptune. Ian Robinson, Data Architect, AWS. DAT335. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  2. Agenda: Introduction to Amazon Neptune; Creating and querying graph data; Operations; Getting started.
  3. Related breakouts. Tuesday, November 27: Migrating to Amazon Neptune (DAT338), 11:30 a.m., Venetian, Level 4, Lando 4305. Wednesday, November 28: Getting Started with Amazon Neptune and Amazon SageMaker Jupyter Notebooks (DAT359), 2:30 p.m., Aria West, Level 3, Starvine 10, Table 7.
  4. Amazon Neptune: a fast, reliable graph database, optimized for storing and querying highly connected data.
  5. Do I have a graph workload? Signs: complex domain model; variable schema; connected queries; large dataset. In detail: many different entities; similar entities may have different properties; highly connected entities connected in many different ways; navigate connected structure; take account of strength, weight, or quality of relationships; variable structure. Typical use cases: social networking, recommendations, knowledge graphs, fraud detection, life sciences, network and IT operations.
  6. “Our customers are increasingly required to navigate a complex web of global tax policies and regulations. We need an approach to model the sophisticated corporate structures of our largest clients and deliver an end-to-end tax solution. We use a microservices architecture approach for our platforms and are beginning to leverage Amazon Neptune as a graph-based system to quickly create links within the data.” Tim Vanderham, Chief Technology Officer, Thomson Reuters Tax & Accounting
  7. Fully managed graph database. Fast: query billions of relationships with millisecond latency. Reliable: six replicas of your data across three AZs, with full backup and restore. Easy: build powerful queries easily with Gremlin and SPARQL. Open: supports Apache TinkerPop and W3C RDF graph models.
  8. Two graph data models and query languages. Property graph: open-source Apache TinkerPop, Gremlin traversal language. Resource Description Framework (RDF): W3C standard, SPARQL query language.
  9. Property graph. Data model: vertices, edges, properties, labels. Gremlin 3.3.2: imperative traversal language.
  10. RDF. Data model: triples (subject-predicate-object). SPARQL 1.1: declarative pattern matching language.
  11. Amazon Neptune high-level architecture. [Diagram labels: bulk load from Amazon S3; database management.]
  13. Creating property graph data with Gremlin

        g.addV('User').
            property(id, 'p-1').
            property('name','Bob').
          addV('User').
            property(id, 'p-2').
            property('name','Alice').
          addV('User').
            property(id, 'p-3').
            property('name','Dan').
          V('p-1').addE('FOLLOWS').to(V('p-2')).
          V('p-1').addE('FOLLOWS').to(V('p-3')).
          next()
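To make the property graph model concrete, the graph built by the traversal above can be mirrored in plain Python. This is only an in-memory illustration under stated assumptions, not Neptune or TinkerPop code: the `Vertex` and `Edge` classes and the `build_social_graph` and `followed_by` helpers are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    out_v: str  # id of the vertex the edge leaves
    in_v: str   # id of the vertex the edge points to


def build_social_graph():
    """Mirror the slide's addV/addE steps with plain Python objects."""
    vertices = {
        "p-1": Vertex("p-1", "User", {"name": "Bob"}),
        "p-2": Vertex("p-2", "User", {"name": "Alice"}),
        "p-3": Vertex("p-3", "User", {"name": "Dan"}),
    }
    edges = [
        Edge("FOLLOWS", "p-1", "p-2"),
        Edge("FOLLOWS", "p-1", "p-3"),
    ]
    return vertices, edges


def followed_by(vertices, edges, user_id):
    """Names of the users that `user_id` follows (outgoing FOLLOWS edges)."""
    return [
        vertices[e.in_v].properties["name"]
        for e in edges
        if e.label == "FOLLOWS" and e.out_v == user_id
    ]


vertices, edges = build_social_graph()
print(followed_by(vertices, edges, "p-1"))
```

Each vertex carries an id, a label, and a property map, and each edge carries a label and a direction, which are the four elements the deck lists for the property graph data model.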
  18. Creating RDF data with SPARQL

        PREFIX c: <http://www.example.com/social#>
        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

        INSERT {
          c:p-1 c:name "Bob"; rdf:type c:User .
          c:p-2 c:name "Alice"; rdf:type c:User .
          c:p-3 c:name "Dan"; rdf:type c:User .
          c:p-1 c:FOLLOWS c:p-2; c:FOLLOWS c:p-3 .
        }
        WHERE {}
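The `;` shorthand in the update above groups several predicate-object pairs under one subject. As a rough sketch of what the statement amounts to, the following Python expands it into individual subject-predicate-object triples. The `expand` helper is a hypothetical name, and IRIs and literals are both represented as plain strings here, which a real RDF store would not do.

```python
C = "http://www.example.com/social#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def expand(subject, predicate_objects):
    """Expand one 'subject p1 o1 ; p2 o2 .' group into separate triples."""
    return [(subject, p, o) for p, o in predicate_objects]


# Simplification: literals ("Bob") and IRIs are both plain strings in this sketch.
triples = (
    expand(C + "p-1", [(C + "name", "Bob"), (RDF_TYPE, C + "User")])
    + expand(C + "p-2", [(C + "name", "Alice"), (RDF_TYPE, C + "User")])
    + expand(C + "p-3", [(C + "name", "Dan"), (RDF_TYPE, C + "User")])
    + expand(C + "p-1", [(C + "FOLLOWS", C + "p-2"), (C + "FOLLOWS", C + "p-3")])
)

for triple in triples:
    print(triple)
```

The four grouped statements become eight triples: a name and a type for each of the three users, plus two FOLLOWS relationships.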
  24. Querying graph data. Example domain: universities. Institutions; organizational structure; roles and job titles; research, teaching, and administrative staff; subjects and course catalogs; timetables; undergraduate and graduate populations.
  25. Find graduate students working at the same university from which they received their undergraduate degree. [Diagram: GraduateStudent -memberOf-> Department -subOrganisationOf-> University, and GraduateStudent -undergraduateDegreeFrom-> University; each vertex has a name property.]
  26. Gremlin

        g.V().hasLabel('GraduateStudent').as('student').
          out('memberOf').
          out('subOrganisationOf').
          in('undergraduateDegreeFrom').
          where(eq('student'))
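The traversal above walks from each graduate student out to a department, out to its university, back in along `undergraduateDegreeFrom`, and keeps only the paths that return to the starting student. A minimal Python sketch of the same walk follows. The sample vertices (`s1`, `s2`, `d1`, `d2`, `u1`, `u2`) and the helpers `out`, `in_`, and `same_university_students` are made up for illustration; none of this is TinkerPop code.

```python
# Hypothetical sample data: (source vertex, edge label, target vertex).
labels = {
    "s1": "GraduateStudent",
    "s2": "GraduateStudent",
    "d1": "Department",
    "d2": "Department",
    "u1": "University",
    "u2": "University",
}
edges = [
    ("s1", "memberOf", "d1"),
    ("s2", "memberOf", "d2"),
    ("d1", "subOrganisationOf", "u1"),
    ("d2", "subOrganisationOf", "u2"),
    ("s1", "undergraduateDegreeFrom", "u1"),
    ("s2", "undergraduateDegreeFrom", "u1"),
]


def out(vertex, label):
    """Follow outgoing edges with the given label."""
    return [t for s, l, t in edges if s == vertex and l == label]


def in_(vertex, label):
    """Follow incoming edges with the given label."""
    return [s for s, l, t in edges if t == vertex and l == label]


def same_university_students():
    """Step-by-step equivalent of the slide's traversal."""
    results = []
    students = [v for v, lab in labels.items() if lab == "GraduateStudent"]
    for student in students:                      # hasLabel(...).as('student')
        for dept in out(student, "memberOf"):     # out('memberOf')
            for univ in out(dept, "subOrganisationOf"):
                for alum in in_(univ, "undergraduateDegreeFrom"):
                    if alum == student:           # where(eq('student'))
                        results.append(alum)
    return results


print(same_university_students())
```

In this sample data `s2` also holds an undergraduate degree from `u1` but works in a department of `u2`, so only `s1` survives the final filter.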
  32. SPARQL

        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
        PREFIX ub: <http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl#>

        SELECT ?student
        WHERE {
          ?student rdf:type ub:GraduateStudent .
          ?univ rdf:type ub:University .
          ?dept rdf:type ub:Department .
          ?student ub:memberOf ?dept .
          ?dept ub:subOrganizationOf ?univ .
          ?student ub:undergraduateDegreeFrom ?univ
        }
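Where the Gremlin version describes a walk, the SPARQL query above declares a set of triple patterns that must all hold for the same variable bindings. The following Python sketch shows that idea with a naive pattern matcher over a small, made-up triple set. The `match` function and the sample identifiers are hypothetical, prefixed names are kept as plain strings, and a real SPARQL engine evaluates and optimizes such patterns very differently.

```python
# Hypothetical sample triples; prefixed names are kept as plain strings.
triples = [
    ("s1", "rdf:type", "ub:GraduateStudent"),
    ("s2", "rdf:type", "ub:GraduateStudent"),
    ("u1", "rdf:type", "ub:University"),
    ("u2", "rdf:type", "ub:University"),
    ("d1", "rdf:type", "ub:Department"),
    ("d2", "rdf:type", "ub:Department"),
    ("s1", "ub:memberOf", "d1"),
    ("s2", "ub:memberOf", "d2"),
    ("d1", "ub:subOrganizationOf", "u1"),
    ("d2", "ub:subOrganizationOf", "u2"),
    ("s1", "ub:undergraduateDegreeFrom", "u1"),
    ("s2", "ub:undergraduateDegreeFrom", "u1"),
]

patterns = [
    ("?student", "rdf:type", "ub:GraduateStudent"),
    ("?univ", "rdf:type", "ub:University"),
    ("?dept", "rdf:type", "ub:Department"),
    ("?student", "ub:memberOf", "?dept"),
    ("?dept", "ub:subOrganizationOf", "?univ"),
    ("?student", "ub:undergraduateDegreeFrom", "?univ"),
]


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all patterns (naive join)."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        consistent = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its existing binding.
                if candidate.setdefault(term, value) != value:
                    consistent = False
                    break
            elif term != value:
                consistent = False
                break
        if consistent:
            yield from match(rest, triples, candidate)


students = [b["?student"] for b in match(patterns, triples)]
print(students)
```

Because `?univ` appears in both the `subOrganizationOf` and `undergraduateDegreeFrom` patterns, a solution exists only when both refer to the same university.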
  38. Cloud-native storage. Data replicated six times across three AZs. Continuous backup to Amazon Simple Storage Service (Amazon S3). Built for 11 nines of durability. Continuous monitoring of nodes. Quorum system for read/write; latency tolerant. Storage volume automatically grows up to 64 TB. 10 GB segments as unit of repair or hot spot rebalance. [Diagram: Amazon Neptune with six storage nodes spread across AZ 1, AZ 2, and AZ 3, storage monitoring, and backup to Amazon S3.]
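The slide mentions six copies and a quorum system but does not give the quorum sizes. The sketch below checks the general overlap rules for any quorum scheme and then applies them to a write quorum of 4 and a read quorum of 3. Those two numbers are an assumption borrowed from the published Amazon Aurora storage design, not something stated in this deck, and the function names are hypothetical.

```python
def is_valid_quorum(copies, write_quorum, read_quorum):
    """Check the two classic quorum overlap rules."""
    # Every read set must intersect every write set, so reads see the latest write.
    reads_see_latest_write = read_quorum + write_quorum > copies
    # Any two write sets must intersect, so conflicting writes are detected.
    writes_overlap = 2 * write_quorum > copies
    return reads_see_latest_write and writes_overlap


def tolerated_failures(copies, quorum):
    """How many copies can be unavailable while a quorum is still reachable."""
    return copies - quorum


COPIES = 6        # from the slide: data replicated six times
WRITE_QUORUM = 4  # assumption, not stated in the slide
READ_QUORUM = 3   # assumption, not stated in the slide

print(is_valid_quorum(COPIES, WRITE_QUORUM, READ_QUORUM))
print(tolerated_failures(COPIES, WRITE_QUORUM))
print(tolerated_failures(COPIES, READ_QUORUM))
```

Under these assumed sizes, and with two copies per AZ, losing a whole AZ (two copies) would still leave a write quorum, and losing an AZ plus one more copy would still leave a read quorum.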
  39. Amazon Neptune read replicas. Availability: failing database nodes automatically detected and replaced; failing database processes automatically detected and recycled; replicas automatically promoted to primary during failover; customer-specifiable failover order. Performance: customer applications can scale out read traffic across read replicas; reader endpoint balances connections across read replicas. [Diagram: primary node and read replicas across AZ 1, AZ 2, and AZ 3, with cluster and instance monitoring.]
  40. Security. Network isolation via virtual private cloud: use security groups to control ingress. Encryption at rest using AWS Key Management Service (AWS KMS): encrypts underlying storage, automated backups, snapshots, and replicas in the same cluster. AWS Identity and Access Management (IAM) policies to secure resources. IAM database authentication: IAM policies for database access; each request must be signed using AWS Signature Version 4; libraries for Gremlin and SPARQL clients.
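Signature Version 4 signs each request with a key derived from the secret access key through a chain of HMAC-SHA256 operations. The sketch below shows only that key-derivation step, using the Python standard library. It does not build the canonical request or the final signature, and in practice the client libraries the slide mentions handle all of this. The secret key is a dummy value, and the service name `neptune-db` is an assumption that should be checked against the Neptune documentation.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """One HMAC-SHA256 step of the derivation chain."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive a SigV4 signing key scoped to a date, Region, and service."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


# Dummy credentials for illustration only; "neptune-db" is an assumed service name.
key = derive_signing_key("EXAMPLE-SECRET-KEY", "20181127", "us-east-1", "neptune-db")
print(key.hex())
```

Because the date, Region, and service are folded into the key, a derived key is only usable for requests with the same scope.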
  41. Monitoring. AWS CloudTrail: log all Neptune API calls to an Amazon S3 bucket. Event notifications: create an SNS subscription via CLI or SDK; sources: db-instance | db-cluster | db-parameter-group | db-security-group | db-snapshot | db-cluster-snapshot. Amazon CloudWatch metrics: CPUUtilization, ClusterReplicaLag, ClusterReplicaLagMaximum, ClusterReplicaLagMinimum, EngineUptime, FreeableMemory, GremlinErrors, GremlinRequests, GremlinRequestsPerSec, Http100, Http101, Http200, Http400, Http403, Http405, Http413, Http429, Http500, Http501, LoaderErrors, LoaderRequests, NetworkReceiveThroughput, NetworkThroughput, NetworkTransmitThroughput, SparqlErrors, SparqlRequests, SparqlRequestsPerSec, StatusErrors, StatusRequests, VolumeBytesUsed, VolumeReadIOPs, VolumeWriteIOPs.
  43. Quick start. AWS CloudFormation stack: VPC, security group, Neptune cluster (with read replica), Amazon EC2 client, IAM role, Amazon S3 endpoint.
  44. Accessing the graph. REPL environments: Gremlin console, RDF4J console. Application clients: open-source Gremlin and SPARQL clients for most languages. Development environments: AWS Cloud9, Amazon SageMaker Jupyter notebooks.
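Besides the consoles and client libraries listed above, a cluster can also be queried over HTTPS. The sketch below only constructs the requests and never sends them. The host name is a made-up placeholder, and the port 8182, the `/gremlin` and `/sparql` paths, and the request body shapes reflect my understanding of Neptune's HTTP endpoints, so treat them as assumptions to verify. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical cluster endpoint; replace with a real one reachable from inside the VPC.
NEPTUNE_HOST = "my-cluster.cluster-example.us-east-1.neptune.amazonaws.com"
NEPTUNE_PORT = 8182  # assumed default port


def gremlin_request(query: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying a Gremlin query as JSON."""
    url = f"https://{NEPTUNE_HOST}:{NEPTUNE_PORT}/gremlin"
    body = json.dumps({"gremlin": query}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def sparql_request(query: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying a form-encoded SPARQL query."""
    url = f"https://{NEPTUNE_HOST}:{NEPTUNE_PORT}/sparql"
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


g_req = gremlin_request("g.V().limit(1)")
s_req = sparql_request("SELECT ?s WHERE { ?s ?p ?o } LIMIT 1")
print(g_req.get_method(), g_req.full_url)
print(s_req.get_method(), s_req.full_url)
```

Passing either object to `urllib.request.urlopen` would send it, which only works from a client inside the cluster's VPC, such as the Amazon EC2 client that the quick start stack creates. With IAM database authentication enabled, the request would also need a Signature Version 4 signature.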
  45. Thank you! Ian Robinson, ianrob@amazon.com