Neptune
@laclefyoshi / ysaeki@r.recruit.co.jp
@laclefyoshi
• 2011/04
• 2015/09
•
• Kafka AWS Kinesis (Apache Kafka Meetup Japan #1; 2016)
• (FutureOfData; 2016)
• Queryable State for Kafka Streams (Apache Kafka Meetup Japan #2; 2016)
• Apache Spark (Geek Night #11; 2016)
• (BigData-JAWS #6; 2017)
• Apache Kafka 0.11 Exactly Once Semantics (Apache Kafka Meetup Japan #3; 2017)
• Apache Kafka (Apache Kafka Meetup Japan #4; 2018)
• StackStorm (Tech Night @ Shiodome #8; 2018)
1/3
2/3
• Node (Vertex) and Edge
• SNS
• JOIN
• Key-Value
• Titan
http://s3.thinkaurelius.com/docs/titan/current/data-model.html
3/3
• Property
• Triple
[Figure: Alice (Name: Alice) and Bob (Name: Bob) connected by an edge (Label: Knows), shown as a property graph and as triples]
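The Alice/Knows/Bob figure can be sketched in Go to contrast the two models. This is only an illustration: the type names, the vertex IDs `v1` and `v2`, and the `toTriples` helper are assumptions made for this example, not part of the original slides or of any graph database API.

```go
package main

import (
	"fmt"
	"sort"
)

// Vertex is a node in a property graph; it carries key-value properties.
type Vertex struct {
	ID         string
	Properties map[string]string
}

// Edge is a labeled, directed connection between two vertices.
type Edge struct {
	Label    string
	From, To *Vertex
}

// Triple is one subject-predicate-object statement.
type Triple struct {
	Subject, Predicate, Object string
}

// toTriples flattens an edge and the properties of its endpoints into triples.
func toTriples(e Edge) []Triple {
	var out []Triple
	for _, v := range []*Vertex{e.From, e.To} {
		keys := make([]string, 0, len(v.Properties))
		for k := range v.Properties {
			keys = append(keys, k)
		}
		sort.Strings(keys) // map iteration order is random, so sort for stable output
		for _, k := range keys {
			out = append(out, Triple{v.ID, k, v.Properties[k]})
		}
	}
	return append(out, Triple{e.From.ID, e.Label, e.To.ID})
}

func main() {
	// "v1" and "v2" are hypothetical IDs chosen for this sketch.
	alice := &Vertex{ID: "v1", Properties: map[string]string{"Name": "Alice"}}
	bob := &Vertex{ID: "v2", Properties: map[string]string{"Name": "Bob"}}
	for _, t := range toTriples(Edge{Label: "Knows", From: alice, To: bob}) {
		fmt.Printf("(%s, %s, %s)\n", t.Subject, t.Predicate, t.Object)
	}
}
```

In the property-graph form, the names live on the vertices as properties; in the triple form, each property and the edge itself become separate statements.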
• DB
•
• SQL
•
• Property graph: Neo4j, DataStax Enterprise Graph
• Triple store: AllegroGraph, Virtuoso
Apache Kafka
(Apache Kafka Meetup Japan #4; 2018)
: Neptune
Neptune
https://aws.amazon.com/jp/neptune/
5/31 GA!
Neptune
•
• Azure Cosmos DB GCP
•
• 3AZ S3
•
•
•
Neptune
1. CPU
• / GB/
GB/
2.
• AZ
3. S3
4.
5.
Neptune
• Gremlin
• For
• Apache TinkerPop
• https://github.com/tinkerpop/gremlin
• Neptune differences: https://docs.aws.amazon.com/neptune/latest/userguide/access-graph-gremlin-differences.html
• IBM Google
• SQLGraph: An Efficient Relational-Based Property Graph Store, https://dl.acm.org/citation.cfm?id=2723732
• SPARQL
• For
• W3C
• https://www.w3.org/TR/sparql11-overview/
• Gremlin SQL
Neptune
• Athena Web
• EC2 API
• Gremlin/SPARQL
• S3 API
JanusGraph
JanusGraph
• http://janusgraph.org/
• Titan
• Linux Foundation
• IBM Google
•
• Gremlin
•
• C* (Apache Cassandra), HBase, Bigtable
Neptune JanusGraph
• Gremlin
• HTTP
• WebSocket
• Neptune SPARQL HTTP
• Neptune 2
• Write-Read
• Read-only
• JanusGraph Read-only
• Neptune
• VPC
•
• IAM
•
• Gremlin/SPARQL
• JanusGraph
• Gremlin
•
Gremlin API 1/2
• Neptune: db.r4.xlarge (4 vCPU, 30.5GB RAM)
• JanusGraph: m5.xlarge (4 vCPU, 16GB RAM) x 2
• {JanusGraph + Elasticsearch (Indexing)} + C* x 3
• Stanford Large Network Dataset Collection
• com-Amazon (Nodes: 334,863, Edges: 925,872)
• https://snap.stanford.edu/data/com-Amazon.html
• Go: https://github.com/go-gremlin/gremlin
Gremlin API 2/2
• Neptune
query := `
g.addV("` + uuid.New().String() + `").property(id, "` + source + `").next()
g.addV("` + uuid.New().String() + `").property(id, "` + target + `").next()
g.V("` + source + `").addE("purchase").to(g.V("` + target + `")).next()
`
real 101m58.320s
user 0m51.129s
sys 1m17.198s
• JanusGraph
query := `
sv = g.V(` + source + `).tryNext().orElseGet{g.addV(T.id, ` + source + `).next()}
tv = g.V(` + target + `).tryNext().orElseGet{g.addV(T.id, ` + target + `).next()}
sv.addEdge("purchase", tv)
`
real 106m24.994s
user 1m9.838s
sys 1m8.390s …
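The query with `tryNext().orElseGet{...}` relies on Groovy variables and a closure, which Neptune's Gremlin implementation does not accept. A commonly used closure-free alternative is the TinkerPop `fold().coalesce(unfold(), addV(...))` get-or-create idiom. The sketch below only builds such query strings in Go; the `product` label, the helper names, and the quoting scheme are assumptions for illustration, and the generated queries have not been run against Neptune or JanusGraph here.

```go
package main

import (
	"fmt"
	"strings"
)

// quote renders s as a single-quoted Gremlin string literal,
// escaping backslashes and single quotes.
func quote(s string) string {
	return "'" + strings.NewReplacer(`\`, `\\`, `'`, `\'`).Replace(s) + "'"
}

// upsertVertex returns a traversal that yields the vertex with the given ID,
// creating it with the given label only if it does not exist yet.
func upsertVertex(label, id string) string {
	return fmt.Sprintf("g.V(%s).fold().coalesce(unfold(), addV(%s).property(id, %s))",
		quote(id), quote(label), quote(id))
}

// addEdge returns a traversal that adds a labeled edge between two existing
// vertices, in the same shape as the addE query in the slides.
func addEdge(label, source, target string) string {
	return fmt.Sprintf("g.V(%s).addE(%s).to(g.V(%s))",
		quote(source), quote(label), quote(target))
}

func main() {
	// "product" is a hypothetical vertex label; "1" and "2" are sample IDs.
	fmt.Println(upsertVertex("product", "1"))
	fmt.Println(upsertVertex("product", "2"))
	fmt.Println(addEdge("purchase", "1", "2"))
}
```

Because the upsert is a single traversal with no variables, each statement can be sent on its own, and re-running the load does not fail on vertices that already exist.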
Neptune
• JanusGraph Gremlin
•
• https://docs.aws.amazon.com/neptune/latest/userguide/access-graph-gremlin-differences.html
• JanusGraph Java Neptune Java
• ID ID
::
• S3 Neptune
• https://docs.aws.amazon.com/neptune/latest/userguide/bulk-load-tutorial-format.html
• SPARQL Gremlin
• CSV Glue ETL
• Neptune
•
• Neptune
•
• Athena SQL DB
•
DB
• Kinesis → Firehose → Neptune

Trying Out the Graph Database Neptune
