Datafying Bitcoin
Tariq B. Ahmad
https://github.com/tariq786/datafying_bitcoin
Motivation
● Bitcoin is a virtual peer-to-peer cryptocurrency.
● All bitcoin transactions are publicly available (who sent, who received and how much), but pseudonymous.
● This publicly available data is called the “blockchain distributed ledger”. Its current size is around 70 GB (binary data), and it has been growing every day since 2009.
BlockChain Size
Bitcoin Transaction types
One-to-one transaction
Many-to-many transaction
Block
A block contains bitcoin transactions. There are almost 400,000 blocks today. The blockchain contains all these blocks, linked together like a linked list (each block stores the hash of the previous block).
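The linking of blocks can be sketched as a small hash chain. This is a toy illustration only: the field names (`height`, `prev_hash`, `tx`) and the JSON-based hashing are assumptions made for the example, whereas real Bitcoin applies double SHA-256 to an 80-byte binary block header.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Toy hash: double SHA-256 over a canonical JSON dump of the block.

    Assumption for illustration; real Bitcoin hashes a binary header.
    """
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(hashlib.sha256(payload).digest()).hexdigest()


def append_block(chain: list, transactions: list) -> dict:
    """Append a block that points back to the current tip of the chain."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    block = {"height": len(chain), "prev_hash": prev_hash, "tx": transactions}
    chain.append(block)
    return block


def is_linked(chain: list) -> bool:
    """Check that every block stores the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


# Hypothetical three-block chain with placeholder transaction labels.
chain = []
append_block(chain, ["coinbase"])
append_block(chain, ["coinbase", "alice->bob"])
append_block(chain, ["coinbase", "bob->carol"])
print(is_linked(chain))
```

Because each block commits to the hash of its predecessor, changing an old block breaks the link held by every block after it.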
Data
● Historical Data
○ Almost 400,000 blocks (new bitcoins)
○ More than 104 million transactions so far
● Live Data
○ 2 transactions per second
○ Propagates through the peer-to-peer network
69 GB (2009-2016)
Query
The evolution of bitcoin transaction fee per block.
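A minimal sketch of the quantity behind this query, assuming the usual definition of a fee: the sum of a transaction's input values minus the sum of its output values, with the coinbase transaction paying no fee. The dictionary layout is a simplified, hypothetical version of the node's transaction JSON. Values are assumed to be already converted to integer satoshis, and `prev_outputs` is an assumed lookup table from (txid, output index) to the value being spent.

```python
def tx_fee(tx: dict, prev_outputs: dict) -> int:
    """Fee of one transaction in satoshis: inputs minus outputs."""
    # A coinbase transaction creates new coins and pays no fee.
    if any("coinbase" in vin for vin in tx["vin"]):
        return 0
    total_in = sum(prev_outputs[(vin["txid"], vin["vout"])] for vin in tx["vin"])
    total_out = sum(out["value"] for out in tx["vout"])
    return total_in - total_out


def block_fee(transactions: list, prev_outputs: dict) -> int:
    """Total fee collected in a block: the sum of its transaction fees."""
    return sum(tx_fee(tx, prev_outputs) for tx in transactions)


# Hypothetical data: two spendable outputs and a three-transaction block.
prev_outputs = {("aa", 0): 50_000, ("bb", 1): 30_000}
block_txs = [
    {"vin": [{"coinbase": "04ffff"}], "vout": [{"value": 2_500_010_000}]},
    {"vin": [{"txid": "aa", "vout": 0}], "vout": [{"value": 45_000}]},
    {"vin": [{"txid": "bb", "vout": 1}],
     "vout": [{"value": 10_000}, {"value": 15_000}]},
]
print(block_fee(block_txs, prev_outputs))
```

Looking up each input's value requires the transaction that created it, which is one reason the per-transaction lookups dominate the cost of this query.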
Working with Data
● Run a full node locally on AWS => store the entire blockchain ledger on AWS.
● Query the blockchain via JSON-RPC in Python
● Two RPC calls per block (number of relevant blocks ~ 200,000, and 6.5 GB of text storage)
○ Average time per RPC call = 1.45 sec (a huge performance bottleneck; the workaround is to reduce the two RPC calls to one by storing all blocks in JSON format on disk/HDFS)
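The JSON-RPC querying described above can be sketched with the Python standard library alone. The URL, credentials and request `id` are placeholders, and mapping the two per-block calls to Bitcoin Core's `getblock` and `getrawtransaction` methods is an assumption, since the slides only name them as "get block" and "get transaction". The `fake_opener` stand-in exists only so the sketch runs without a live node.

```python
import base64
import io
import json
import urllib.request


def make_request(url, user, password, method, params):
    """Build (but do not send) a JSON-RPC HTTP request for a Bitcoin node."""
    body = json.dumps(
        {"jsonrpc": "1.0", "id": "datafying", "method": method, "params": params}
    ).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def rpc_call(request, opener=urllib.request.urlopen):
    """Send the request and return the 'result' field of the reply."""
    with opener(request) as response:
        reply = json.loads(response.read())
    if reply.get("error"):
        raise RuntimeError(reply["error"])
    return reply["result"]


def fake_opener(request):
    """Offline stand-in for a node: echoes the requested method name."""
    sent = json.loads(request.data)
    reply = {"result": sent["method"], "error": None, "id": sent["id"]}
    return io.BytesIO(json.dumps(reply).encode("utf-8"))


# Placeholder endpoint and credentials; 8332 is bitcoind's default RPC port.
req = make_request("http://127.0.0.1:8332", "user", "password",
                   "getblockhash", [1000])
result = rpc_call(req, opener=fake_opener)
print(result)
```

Against a real node, dropping the `opener` argument sends the request over HTTP. Every such call pays the round-trip cost noted above, which is why caching the block JSON on disk or HDFS helps.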
[Diagram: the app sends (1) a get block RPC call to the Bitcoin node and receives the block JSON, then (2) a get transaction RPC call and receives the transaction JSON.]
Data Pipeline
[Pipeline diagram. Stages: Ingestion, File System, Batch processing, Database, Visualization. Components shown: Bitcoin Node, (Local Disk), Stream processing, Netcat Relay.]
Accomplishments and Challenges
● Complex query (bitcoin transaction fee evolution) working end to end
● Working with a sea of JSON files (2 per block) in Apache Spark is complex. It takes time to scale the results.
● Ideally, compare three modes (batch, streaming and API) for throughput, latency and cost
● Public APIs have rate limits. After a lot of searching, found the Toshi API (https://toshi.io), which has no rate limits
Comparison
Mode            # of processed blocks   Time (minutes)   Storage
RPC Batch       186,846                 162              Local File System
RPC Batch       186,846                 69               HDFS
RPC Streaming   187,990                 177              -
API Streaming   187,990                 222              -
API Batch       187,990                 3.1              HDFS
Storing data on HDFS pays off: Spark processing takes only 3.1 minutes in API mode and 69 minutes in RPC mode (62 of those 69 minutes are RPC call overhead for get transaction).
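The timings can be turned into throughput figures with a few lines of arithmetic. The numbers below are copied from the comparison table. The blocks-per-minute metric and the variable names are choices made for this sketch, not something the slides report.

```python
# (mode, storage, processed blocks, minutes) as listed in the comparison table.
runs = [
    ("RPC Batch", "Local File System", 186_846, 162.0),
    ("RPC Batch", "HDFS", 186_846, 69.0),
    ("RPC Streaming", "-", 187_990, 177.0),
    ("API Streaming", "-", 187_990, 222.0),
    ("API Batch", "HDFS", 187_990, 3.1),
]

# Throughput in blocks per minute for each run.
throughput = {
    (mode, storage): blocks / minutes for mode, storage, blocks, minutes in runs
}

for (mode, storage), rate in throughput.items():
    print(f"{mode:<14} {storage:<18} {rate:>10.0f} blocks/min")

# RPC batch on HDFS: time left once the reported 62 minutes of
# get transaction RPC overhead are subtracted.
rpc_hdfs_minutes = 69.0
rpc_overhead_minutes = 62.0
remaining_minutes = rpc_hdfs_minutes - rpc_overhead_minutes
print(f"Non-RPC time in RPC batch (HDFS): {remaining_minutes:.0f} minutes")
```

Seen this way, the RPC batch run on HDFS spends most of its time waiting on the node rather than processing in Spark.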
Visualization
Zooming in to check discontinuity
About Me
PhD in Computer Engineering
Parallel Computing & Computer Security.
In Love with Linux
Likes disruptive technology
Thank you + Q&A
