Graph Gurus Episode 13: Visualizing Bitcoin Blockchain with Tiger Graph
- 2. © 2019 TigerGraph. All Rights Reserved
Welcome
● Attendees are muted
● If you have any Zoom issues please contact the panelists via chat
● We will have 10 min for Q&A at the end so please send your questions
at any time using the Q&A tab in the Zoom menu
● The webinar will be recorded and sent via email
- 3. © 2019 TigerGraph. All Rights Reserved
Developer Edition Available
We now offer Docker versions and VirtualBox versions of the TigerGraph
Developer Edition, so you can now run on
● MacOS
● Windows 10
● Linux
Developer Edition Download https://www.tigergraph.com/developer/
Version 2.3 Available Now
- 4. © 2019 TigerGraph. All Rights Reserved
Today's Gurus
Sai Sameer Pusapaty
Sophomore, MIT
● Majoring in Computer Science and Engineering (6-3)
● Past projects in topics including Machine Learning,
Parallel Processing, Big Data
● Extern at TigerGraph (Winter 2019). Worked previously
at Visa (Summer 2018)
Benyue (Emma) Liu
Senior Product Manager
● BS in Engineering from Harvey Mudd College, MS in
Engineering Systems from MIT
● Prior work experience at Oracle and MarkLogic
● Focus - Cloud, Containers, Enterprise Infra, Monitoring,
Management, Connectors
- 5. © 2019 TigerGraph. All Rights Reserved
Agenda
● How to Load Bitcoin Dataset into TigerGraph
● Demo Bitcoin Blockchain in TigerGraph
● TigerGraph 2.3 Kafka Loader for Future Integration
- 6. © 2019 TigerGraph. All Rights Reserved
MIT Externship
● 4-week internship during MIT's winter Independent Activities Period (IAP)
- 8. © 2019 TigerGraph. All Rights Reserved
Briefly on Blockchain
• Past: central authority for money management
• E.g., banks, Visa, PayPal
• These systems require “trust”
• New idea: trustless system
• Not a completely accurate name
• Rather than a “human” system, a network of computers divides the “trust”
• Public-private key cryptography
• Private key used to make a signature
• Public key used to confirm the signature
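The sign-with-private-key, confirm-with-public-key idea can be sketched with textbook RSA on tiny numbers. This is only an illustration under loud assumptions: Bitcoin itself uses ECDSA over the secp256k1 curve, not RSA, and the primes, exponent, and function names below are hypothetical choices made for readability, not security.

```python
import hashlib

# Toy RSA parameters: tiny primes chosen for readability, NOT security.
# (Assumption: RSA stands in here for Bitcoin's real scheme, ECDSA.)
P, Q = 61, 53
N = P * Q                           # public modulus
E = 17                              # public exponent; public key is (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent; needs Python 3.8+


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range [0, N)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Make a signature with the private key."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Confirm a signature using only the public key."""
    return pow(signature, E, N) == digest(message)


msg = b"send 3 BTC"
sig = sign(msg)
print(verify(msg, sig))             # True
print(verify(msg, (sig + 1) % N))   # False: the signature was altered
```

Anyone holding the public key can run `verify`, but only the holder of the private exponent can produce a signature that passes.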
- 10. © 2019 TigerGraph. All Rights Reserved
Bitcoin transaction
[Figure: two example transactions (TXN) with input and output amounts of 2, 3, and 4 BTC; one is marked FALSE and the other TRUE]
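One general Bitcoin rule that a TRUE/FALSE check like this can express is that a transaction's outputs may not exceed its inputs. The sketch below shows only that rule. The amounts are illustrative and are not a reconstruction of the slide's figure, the function name is hypothetical, and real validation also checks signatures and that each input is still unspent.

```python
def outputs_covered(input_amounts, output_amounts):
    """Return True when the inputs fund the outputs.

    Any surplus of inputs over outputs is left to the miner as a fee.
    Amounts are whole BTC here purely to keep the example readable.
    """
    return sum(input_amounts) >= sum(output_amounts)


print(outputs_covered([3], [3, 2]))        # False: 3 BTC cannot fund 5 BTC
print(outputs_covered([4, 4], [3, 3, 2]))  # True: 8 BTC funds 8 BTC
```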
- 11. © 2019 TigerGraph. All Rights Reserved
Briefly on Blockchain
[Figure: numeric labels from the original image; not recoverable as text]
• Transaction is sent to pool of
unverified txns
• Set of transactions put into a block
• Verification is done across a network
of computers by solving a complex
math problem -- mining
• Goal of mining is to generate a
proper block hash
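The mining step can be sketched as a brute-force search for a nonce. This is a simplified illustration, not Bitcoin's exact procedure: real mining double-hashes an 80-byte block header and compares the result against a numeric target, while this sketch hashes an arbitrary byte string once and asks for leading zero hex digits. The function name and sample data are hypothetical.

```python
import hashlib


def mine(block_data: bytes, difficulty: int = 2):
    """Try nonces until the SHA-256 hex digest starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        block_hash = hashlib.sha256(block_data + str(nonce).encode()).hexdigest()
        if block_hash.startswith(prefix):
            return nonce, block_hash   # a "proper" hash under this toy rule
        nonce += 1


nonce, block_hash = mine(b"prev_hash|txn1,txn2,txn3")
print(nonce, block_hash)
```

Raising `difficulty` by one multiplies the expected number of attempts by 16, which is the sense in which the problem is hard to solve but cheap to check.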
- 12. © 2019 TigerGraph. All Rights Reserved
Looking at the Data
● Transactions
● Blocks
● Addresses
- 13. © 2019 TigerGraph. All Rights Reserved
Past Attempts with Graph DBs
• Report 1:
• Neo4j, 600GB db to store 100GB raw data
• 60+ days of exporting
• Report 2:
• used 400-core computer to parse 80GB
• used a 12-core VM with 64 GB RAM to analyze resulting graph
• Other:
• Graph Blockchain -> Apache Cassandra
• GraphSense -> AgensGraph Engine
- 14. © 2019 TigerGraph. All Rights Reserved
Niche of TigerGraph
• Able to scale with the exponentially growing size of the blockchain
• Can perform more complex queries as the network grows
• Relatively lightweight (especially compared with other graph DB alternatives)
- 15. © 2019 TigerGraph. All Rights Reserved
Within the Data
File
● Block (repeated; one file holds many blocks)
Block
● Block Header
● Transaction List
Block Header
● Block_hash
● Previous hash
● Nonce
● Merkle_root
● Timestamp
● ...
Transaction
● Txn_hash
● Coinbase
● ...
● Input List
● Output List
Inputs (outputs of previous txns)
● Previous hash
● output_id
Outputs
● Address
● Amount
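The nesting described on this slide can be written down as a handful of record types. This is a minimal sketch: the field names follow the slide, but the class names, the Python types, and the sample values are assumptions, and the fields the slide elides with "..." are left out rather than guessed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TxnInput:
    previous_hash: str   # hash of the earlier transaction being spent from
    output_id: int       # which output of that transaction is spent


@dataclass
class TxnOutput:
    address: str
    amount: int          # assumed unit: satoshis


@dataclass
class Transaction:
    txn_hash: str
    coinbase: bool       # assumed boolean flag for the block-reward transaction
    inputs: List[TxnInput] = field(default_factory=list)
    outputs: List[TxnOutput] = field(default_factory=list)


@dataclass
class BlockHeader:
    block_hash: str
    previous_hash: str
    nonce: int
    merkle_root: str
    timestamp: int


@dataclass
class Block:
    header: BlockHeader
    transactions: List[Transaction] = field(default_factory=list)


# Placeholder values only; none of these are real chain data.
reward = Transaction("t0", True, outputs=[TxnOutput("addr_a", 50)])
block = Block(BlockHeader("b1", "b0", 0, "m1", 0), [reward])
print(len(block.transactions), block.transactions[0].outputs[0].address)
```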
- 16. © 2019 TigerGraph. All Rights Reserved
Structure of the Graph
Block
Txn
Address
Output
- 17. © 2019 TigerGraph. All Rights Reserved
Structure of the Graph
Block
Address
Transaction
Output
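One way to picture the four vertex types is to turn a parsed block into vertex and edge sets. The vertex types come from the slide. The edge names, the `txn_hash:index` identifier for outputs, and the dictionary layout of the input are all hypothetical, since the slide does not spell out the schema.

```python
def to_graph(block):
    """Turn one parsed block (plain dicts) into vertex and edge sets."""
    vertices, edges = set(), set()
    vertices.add(("Block", block["block_hash"]))
    for txn in block["txns"]:
        vertices.add(("Transaction", txn["txn_hash"]))
        edges.add(("block_has_txn", block["block_hash"], txn["txn_hash"]))
        for i, out in enumerate(txn["outputs"]):
            out_id = f'{txn["txn_hash"]}:{i}'   # assumed output identifier
            vertices.add(("Output", out_id))
            vertices.add(("Address", out["address"]))
            edges.add(("txn_creates_output", txn["txn_hash"], out_id))
            edges.add(("output_to_address", out_id, out["address"]))
        for inp in txn["inputs"]:
            # An input is an output of a previous txn, so it becomes an edge
            # from that earlier Output vertex rather than a new vertex.
            spent = f'{inp["previous_hash"]}:{inp["output_id"]}'
            edges.add(("output_spent_by_txn", spent, txn["txn_hash"]))
    return vertices, edges


sample = {"block_hash": "b1", "txns": [
    {"txn_hash": "t1", "inputs": [],
     "outputs": [{"address": "a1", "amount": 50}]},
    {"txn_hash": "t2", "inputs": [{"previous_hash": "t1", "output_id": 0}],
     "outputs": [{"address": "a2", "amount": 50}]},
]}
vertices, edges = to_graph(sample)
print(len(vertices), len(edges))   # 7 7
```

Modeling inputs as edges keeps the money trail walkable: following `output_spent_by_txn` and `txn_creates_output` alternately traces coins from address to address.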
- 18. © 2019 TigerGraph. All Rights Reserved
Program Structure: Golang
[Figure: Go loader pipeline. Labels: folder containing .dat files; batch of blocks; Block Channel; spawned goroutines; Txn Channel; txns, inputs, outputs; Wait Group]
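The channel-and-worker layout named on this slide can be approximated in Python with queues and threads. This is an analogue, not the original program: the loader itself is written in Go with channels, goroutines, and a WaitGroup, and the `parse_block` helper and the shape of the block records below are hypothetical stand-ins for real .dat parsing.

```python
import queue
import threading


def parse_block(raw_block):
    """Hypothetical stand-in for real .dat parsing: returns the block's txns."""
    return raw_block["txns"]


def run_pipeline(raw_blocks, workers=4):
    block_q = queue.Queue()   # plays the role of the Block Channel
    txn_q = queue.Queue()     # plays the role of the Txn Channel

    def worker():
        while True:
            raw = block_q.get()
            if raw is None:            # sentinel: no more blocks
                break
            for txn in parse_block(raw):
                txn_q.put(txn)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for raw in raw_blocks:
        block_q.put(raw)
    for _ in threads:
        block_q.put(None)              # one sentinel per worker
    for t in threads:                  # joining is the WaitGroup analogue
        t.join()

    txns = []
    while not txn_q.empty():           # safe: all producers have finished
        txns.append(txn_q.get())
    return txns


blocks = [{"txns": [f"b{i}-t{j}" for j in range(3)]} for i in range(5)]
print(len(run_pipeline(blocks)))   # 15
```

The order of the returned transactions depends on thread scheduling, so downstream code should not rely on it.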
- 20. © 2019 TigerGraph. All Rights Reserved
Demo Machine and Loading Stats
• Hardware:
• AWS EC2 r5.24xlarge (96 vCPUs and 768 GB memory)
• Original Bitcoin Data Size: 569 GB
• Loading Time: 2h 50min
• TigerGraph Graph Data Size: 262 GB
• Memory Usage During Loading: ~200 GB
GitHub: https://github.com/tigergraph/bitcoin-to-tigergraph
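The loading figures above imply a throughput and a storage ratio that are easy to work out. The arithmetic below uses only the numbers on this slide and assumes the sizes are in gigabytes; the variable names are illustrative.

```python
raw_gb = 569                 # original Bitcoin data size
graph_gb = 262               # TigerGraph graph data size
load_minutes = 2 * 60 + 50   # loading time of 2h 50min

throughput_gb_per_hour = raw_gb / (load_minutes / 60)
size_ratio = graph_gb / raw_gb

print(f"{throughput_gb_per_hour:.0f} GB/hour")   # 201 GB/hour
print(f"{size_ratio:.2f}")                        # 0.46
```

In other words, the stored graph is a little under half the size of the raw data, loaded at roughly 200 GB per hour on this machine.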
- 21. © 2019 TigerGraph. All Rights Reserved
Let’s explore!
Alleged address of Satoshi Nakamoto: 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
Pizza transaction: cca7507897abc89628f450e8b1e0c6fca4ec3f7b34cccf55f3f531c659ff4d79
Address of the Silk Road guy: 1Ez69SnzzmePmZX3WpEzMKTrcBF2gpNQ55
Largest Bitcoin Transaction Address: 1M8s2S5bgAzSSzVTeL7zruvMPLvzSkEAuv
More addresses: http://www.theopenledger.com/9-most-famous-bitcoin-addresses/
- 23. © 2019 TigerGraph. All Rights Reserved
Source: https://people.csail.mit.edu/spillai/data/papers/bitcoin-transaction-graph-analysis.pdf
- 24. © 2019 TigerGraph. All Rights Reserved
Extended Ideas
• Each block is a frozen state in time
• Acts as time series data
• Examine behavior of addresses over time
• Can be used to determine black market addresses
• Can potentially be used to predict the USD price of cryptocurrencies
• https://medium.com/@clearblocks/valuing-bitcoin-and-ethereum-with-metcalfes-law-aaa743f469f6
• Metcalfe’s Law: value of a network is proportional to the number of active users squared
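Metcalfe's Law as stated here is a one-line formula, value = k × n², where n is the number of active users. The sketch below shows the scaling behavior only; the constant `k` and the function name are assumptions, and nothing here is a price model.

```python
def metcalfe_value(active_users, k=1.0):
    """Network value under Metcalfe's Law: proportional to users squared.

    `k` is a hypothetical proportionality constant; the law fixes only
    the shape of the curve, not its scale.
    """
    return k * active_users ** 2


# Doubling the number of active users quadruples the modeled value.
print(metcalfe_value(200) / metcalfe_value(100))   # 4.0
```

On the graph, the count of active addresses per block or per day would be one possible stand-in for `active_users`.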
- 26. © 2019 TigerGraph. All Rights Reserved
NEW in TigerGraph 2.3: Kafka Loader
● Increase Data Availability and Accelerate Time to Value
• Load streaming and batched data from user's Kafka server
• Consistent with GSQL file loading syntax and MultiGraph support
● Embrace Benefits of Kafka Ecosystem
• Scalable data loading and Built-in fault tolerance
• Data buffer - Kafka is in a separate cluster
• Extensible - open up data pipeline from many other data sources
Kafka Loader
- 27. © 2019 TigerGraph. All Rights Reserved
Kafka and TigerGraph Data Pipeline
[Figure: pipeline diagram. Labels: Static Data Sources; Streaming Data Sources; Kafka Loader]
- 28. © 2019 TigerGraph. All Rights Reserved
Bitcoin Streaming Data Pipeline
[Figure: pipeline diagram. Labels: Preprocessed Bitcoin Live Data; Kafka Loader]
- 29. © 2019 TigerGraph. All Rights Reserved
Summary
• How to Load Bitcoin Dataset into TigerGraph
• Demo Bitcoin Blockchain in TigerGraph
• Future Bitcoin Data Pipeline Integration with Kafka Loader from TigerGraph 2.3 (See Graph Gurus Episode 12)
GitHub: https://github.com/tigergraph/bitcoin-to-tigergraph
Version 2.3
Available Now
- 31. © 2019 TigerGraph. All Rights Reserved
NEW! Graph Gurus Developer Office Hours
Every Thursday at 11:00 am Pacific
Talk directly with our engineers every week. During office hours, you get answers to any questions pertaining to graph modeling and GSQL programming.
https://info.tigergraph.com/officehours
Catch up on previous episodes of Graph Gurus:
https://www.tigergraph.com/webinars-and-events/
- 32. © 2019 TigerGraph. All Rights Reserved
Additional Resources
New Developer Portal
https://www.tigergraph.com/developers/
Download the Developer Edition or Enterprise Free Trial
https://www.tigergraph.com/download/
Guru Scripts
https://github.com/tigergraph/ecosys/tree/master/guru_scripts
Join our Developer Forum
https://groups.google.com/a/opengsql.org/forum/#!forum/gsql-users
@TigerGraphDB youtube.com/tigergraph facebook.com/TigerGraphDB linkedin.com/company/TigerGraph
- 34. © 2019 TigerGraph. All Rights Reserved
Kafka Loader High Level Architecture
• Connect to External Kafka Cluster
• User Commands Through GSQL Server
• Configuration Files:
• Config 1: Kafka Cluster Configuration
• Config 2: Topic/Partition/Offset Info
- 35. © 2019 TigerGraph. All Rights Reserved
Kafka Loader: Three Steps
Consistent with GSQL Data Loading Steps
Step 1: Define the Data Source
Step 2: Create a Loading Job
Step 3: Run the Loading Job
- 36. © 2019 TigerGraph. All Rights Reserved
Prerequisites: Kafka Configuration Files
● Connect to External Kafka Data Source Through Kafka Cluster Configuration File in Step 1 (Kafka broker's domain name and port)
● Define Kafka Data Source Structure Through Kafka Topic/Partition Configuration File in Step 2 (Kafka topic, partition list, and start offset)
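The two configuration files can be pictured as small JSON documents. The sketch below builds them in Python and prints them. The slide names only what each file carries (broker domain name and port; topic, partition list, start offset), so the exact key names, the broker address, and the topic name here are assumptions and should be checked against the TigerGraph Kafka Loader documentation before use.

```python
import json

# Config 1 (used in Step 1): where the external Kafka cluster lives.
# Key name "broker" and the host:port value are assumptions.
kafka_cluster_config = {"broker": "kafka.example.com:9092"}

# Config 2 (used in Step 2): which topic and partitions to read, and from where.
# Key names and the topic name are assumptions.
topic_partition_config = {
    "topic": "bitcoin_txns",
    "partition_list": [
        {"partition": 0, "start_offset": 0},
        {"partition": 1, "start_offset": 0},
    ],
}

print(json.dumps(kafka_cluster_config, indent=2))
print(json.dumps(topic_partition_config, indent=2))
```

Keeping the cluster address separate from the topic and offset details means one data source definition can be reused by several loading jobs that read different topics.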
- 37. © 2019 TigerGraph. All Rights Reserved
Manage Loading Job
• SHOW LOADING STATUS
• ABORT LOADING JOB
• RESUME LOADING JOB
Consistent with GSQL Data Loading Syntax