SlideShare a Scribd company logo
“Breaking Bad: De-Anonymising
Entity Types on the Bitcoin
Blockchain Using Supervised
Machine Learning ”
HICSS–51, January 2018
Internet and the Digital Economy
Distributed Ledger Technology: The Blockchain Minitrack
Awa Sun Yin
@awasunyin
About Me
Academic
Background &
Current Position
BSc. in Business & Statistics
MSc. in Information Systems
Data Scientist at Chainalysis,
spec. in ML/DL & Blockchain
Research director at the
Interchain Foundation*
Research Areas of
Interest
Application of ML/DL to
Blockchain data for clustering,
de-anonymization, etc.
2nd and 3rd Generation
Blockchains: Ethereum,
Cøsmos*
Privacy Coins &
Cryptography: Monero,
ZCash
Haohua (Awa) Sun Yin
Agenda
1
2
3
Background & Motivations
Problem Formulation
Research Question
4
5
6
Basic Concepts
Methodology
Final Outcomes & Reflections
7 Future Research
8 Q&A
1 Background & Motivations
Adoption of
Cryptocurrencies
2.9 to 5.8 million unique users
(mostly Bitcoin), and increasing
Accepted as a payment method
by over 100,000 merchants
(~2014)
Affiliation with Illicit
Activities
Used for: Money laundering,
scamming, terror financing
Used as payment method for:
cyber-extortion (ransom
payments), thievery, trading
illegal goods in the Darknet
Need for Investigation
& Compliance Tools
Businesses: Required by AML
and KYC regulations, need tools
to assess the risk of each of their
customers
Law Enforcement: Need for
domain specific analysis and
investigation tools
1
2 Background & Motivations
Problem Formulation
BTC
“Account”
Other E-
Payment
Methods
Public Key / Address
Personal
ID
Email Address
Phone Number
Physical
Address
Transaction Hash
Receiving
Address
Amount
BTC
Anyone can create
anytime a BTC
“account”
Issued by centralized
organizations
1
2
3 Background & Motivations
Problem Formulation
Research Question
To what extent can we predict the category of
a yet-unidentified cluster on the Bitcoin
Blockchain?
1
2
3 Background & Motivations
Problem Formulation
Research Question & Objectives
4a Basic Concepts
X Y
12c6DSiU4Rq3P4ZxziKxzrL5LmMBrzjrJX 1DGoMT2uz6Dg59JbwtDSp8KyjiTPR7RnVQ
1 BTC
69cea94f19a47ea0d118e80bef56fbd1e8
5d134472f18214ed8e6b236d2701f8
Category: Unknown Category: Unknown
1
2
3 Background & Motivations
Problem Formulation
Research Question & Objectives
4b Basic Concepts
Alice Kraken
B13HjAyJyAX6SYc6hrEqLf4naoKCWdeS
1FaEn6fGeDfDkHZvvSLsVRVxqgQt95JeyD
1JQFWqx9VFMagd8UryqiJriLmCvCuJLyf5
1PxoZtvm3sb6LnYS5iNERuJBNR5oQZxuXJ
1K4EqXxSVd6NndkmVAMsSBd1n8VJNJLwr2
10 BTC
72cme94f19a47ea0d118e80bef56fbd1e
85d134472f18214ed8e6b236d2701f8
Category: Personal Wallet Category: Exchange
1
2
3 Background & Motivations
Problem Formulation
Research Question & Objectives
4c Basic Concepts
The Bitcoin Network I
1
2
3 Background & Motivations
Problem Formulation
Research Question & Objectives
4c Basic Concepts
Gambling
Scam
Mixing
Merchant
Services
Exchange
Darknet
Market
The Bitcoin Network II
1
2
3 Background & Motivations
Problem Formulation
Research Question & Objectives
5a Methodology: Dataset
Dataset Description
§ 434 Observations
§ 10 categories
§ + 200 M transactions
§ Period: January 2009 –
May 2017
1
2
3 Background & Motivations
Problem Formulation
Research Question & Objectives
5b Data Preparation
Data Preparation
(Python & Numpy)
Enriched data Preprocessing
Feature
engineering
Input data
Raw data
Data Provider’s Tool
Clustering &
manual
labelling
API
BTC Node or
Core APIs
Co-spend,
custom
heuristics, etc
Cleaning &
Dimensionality
Reduction
76 Features [434, 77]
1
2
3 Background & Motivations
Problem Formulation
Research Question & Objectives
5c Analysis: Supervised Learning
Data Analysis
(Python & Scikit-Learn)
Selection of
algorithms
Train each model
with default
parameters
Assess & compare
results
Find optimal
parameters
Train each model
with optimal
parameters
Oversample two
minority classes
Input data
Find optimal
parameters
1
2
3
kNN
**RF
ET
AB
DT
**BG
***GB
SMOTE
CV-Random Search
Conf,
ROC &
Class.
Report
1
2
3 Background & Motivations
Problem Formulation
Research Question & Objectives
5d Results
ROC Curve and Classification Report
from Gradient Boosted Trees classifier,
average confidence of 77%
1
2
3 Background & Motivations
Problem Formulation
Research Question & Objectives
5
6b Methodology
Reflections
Implications
§ It is possible to categorize unidentified
clusters on Bitcoin using supervised
learning
§ Further challenging Bitcoin’s true level of
anonymity
§ Applicability to compliance, investigation
tools
Limitations
§ Dataset limited to 434 observations
§ Low performance with under-sampled
categories
§ Features not reflecting all available data
§ Lack of test set
1
2
3 Background & Motivations
Problem Formulation
Research Question & Objectives
5
6b Methodology
Conclusion
Multiclass Classification on
the Bitcoin Blockchain
Goal: Predict the category of unidentified
clusters
Methodology: Using a dataset of already
identified clusters (a total of 434
observations across 10 categories)
Results: It is possible to classify with a
confidence of 77% using Gradient Boosted
Trees
Implications of the Research
A degree of de-anonymization can be
achieved using this approach
Considering the limitations: Paving the way
for future research
1
2
3 Background & Motivations
Problem Formulation
Research Question & Objectives
5
6 Methodology
Final Outcomes & Reflections
7a Future Research
Expanding the Dataset
of Identified Clusters
Number of observations per
category
Number of categories
Refining & Testing
Alternative
Methodologies
Automatic feature engineering
and extraction
Testing more classification
algorithms
Binary Classification
Applying the Model
Use the tested methodology to
uncover the Bitcoin Blockchain
for multiple purposes: cybercrime
investigations, compliance tools,
etc.
1
2
3 Background & Motivations
Problem Formulation
Research Question & Objectives
5
7b Methodology
Future Research
Haohua Sun Yin & Ravi Vatrapu, A First Estimation of
the Proportion of Cybercriminal Entities in the Bitcoin
Ecosystem using Supervised Machine Learning, IEEE
Big Data for Cybercrime Prevention, 2017, Boston MA
1
2
3 Background & Motivations
Problem Formulation
Research Question & Objectives
5
6 Methodology
Final Outcomes & Reflections
8 Q&A
Proposed Questions
§ How can we increase the dataset in both number of observations and categories?
§ If Bitcoin is not truly anonymous, why has it been used for nefarious activities?
§ Are there alternatives to Bitcoin that offer higher privacy?
Get my pub PGP key here:
awasunyin.com
Let’s stay in touch!
awasunyin@gmail.com

More Related Content

Similar to CSIG_presentation_updated.awa_.sun_.yin_.breakingbad.20180512.pdf

Machine Learning and Blockchain by Director of Product at Target
Machine Learning and Blockchain by Director of Product at TargetMachine Learning and Blockchain by Director of Product at Target
Machine Learning and Blockchain by Director of Product at Target
Product School
 
ADPP: A NOVEL ANOMALY DETECTION AND PRIVACY-PRESERVING FRAMEWORK USING BLOCKC...
ADPP: A NOVEL ANOMALY DETECTION AND PRIVACY-PRESERVING FRAMEWORK USING BLOCKC...ADPP: A NOVEL ANOMALY DETECTION AND PRIVACY-PRESERVING FRAMEWORK USING BLOCKC...
ADPP: A NOVEL ANOMALY DETECTION AND PRIVACY-PRESERVING FRAMEWORK USING BLOCKC...
ijaia
 
Blockchain technology overview
Blockchain technology overviewBlockchain technology overview
Blockchain technology overview
RishabhMalik32
 
Types of Blockchain, AI and its future
Types of Blockchain, AI and its futureTypes of Blockchain, AI and its future
Types of Blockchain, AI and its future
Aarthi Srinivasan
 
Risk & challenges in virtual currencies - Cocon 2013
Risk & challenges in virtual currencies - Cocon 2013Risk & challenges in virtual currencies - Cocon 2013
Risk & challenges in virtual currencies - Cocon 2013
Harsh Patel
 
Improved Particle Swarm Optimization Based on Blockchain Mechanism for Flexib...
Improved Particle Swarm Optimization Based on Blockchain Mechanism for Flexib...Improved Particle Swarm Optimization Based on Blockchain Mechanism for Flexib...
Improved Particle Swarm Optimization Based on Blockchain Mechanism for Flexib...
BRNSSPublicationHubI
 
O Bitcoin Where Art Thou? An Introduction to Cryptocurrency Analytics
O Bitcoin Where Art Thou? An Introduction to Cryptocurrency AnalyticsO Bitcoin Where Art Thou? An Introduction to Cryptocurrency Analytics
O Bitcoin Where Art Thou? An Introduction to Cryptocurrency Analytics
Bernhard Haslhofer
 
Data mining is the statistical technique of processing raw data in a structur...
Data mining is the statistical technique of processing raw data in a structur...Data mining is the statistical technique of processing raw data in a structur...
Data mining is the statistical technique of processing raw data in a structur...
ssuser6478a8
 
01-pengantar.pdf
01-pengantar.pdf01-pengantar.pdf
01-pengantar.pdf
ssuseradaf5f
 
blockchain.pptx
blockchain.pptxblockchain.pptx
blockchain.pptx
19MEB302SahilAli
 
E-Voting using Blockchain Technology
E-Voting using Blockchain Technology E-Voting using Blockchain Technology
E-Voting using Blockchain Technology
Associate Professor in VSB Coimbatore
 
E-Voting using Blockchain Technology
E-Voting using Blockchain Technology E-Voting using Blockchain Technology
E-Voting using Blockchain Technology
Associate Professor in VSB Coimbatore
 
China Digital Economy
China Digital EconomyChina Digital Economy
China Digital Economy
Melanie Swan
 
SMART Seminar Series: "Blockchain and its Applications". Presented by Prof Wi...
SMART Seminar Series: "Blockchain and its Applications". Presented by Prof Wi...SMART Seminar Series: "Blockchain and its Applications". Presented by Prof Wi...
SMART Seminar Series: "Blockchain and its Applications". Presented by Prof Wi...
SMART Infrastructure Facility
 
Blockchain Technology | Bitcoin | Ethereum Coin | Cryptocurrency
Blockchain Technology | Bitcoin | Ethereum Coin | CryptocurrencyBlockchain Technology | Bitcoin | Ethereum Coin | Cryptocurrency
Blockchain Technology | Bitcoin | Ethereum Coin | Cryptocurrency
Unbiased Technolab
 
Blockchain technology in (life) sciences
Blockchain technology in (life) sciencesBlockchain technology in (life) sciences
Blockchain technology in (life) sciences
Barbera van Schaik
 
Blockchain Technology and its effect on Environment: A Comparative Study betw...
Blockchain Technology and its effect on Environment: A Comparative Study betw...Blockchain Technology and its effect on Environment: A Comparative Study betw...
Blockchain Technology and its effect on Environment: A Comparative Study betw...
AI Publications
 
Blockchain Technology And Cryptocurrency
Blockchain Technology And CryptocurrencyBlockchain Technology And Cryptocurrency
Blockchain Technology And Cryptocurrency
Eno Bassey
 
Blockchain Interview Questions and Answers | Blockchain Technology | Blockcha...
Blockchain Interview Questions and Answers | Blockchain Technology | Blockcha...Blockchain Interview Questions and Answers | Blockchain Technology | Blockcha...
Blockchain Interview Questions and Answers | Blockchain Technology | Blockcha...
Edureka!
 
Fraud Detection: A Review on Blockchain
Fraud Detection: A Review on BlockchainFraud Detection: A Review on Blockchain
Fraud Detection: A Review on Blockchain
IRJET Journal
 

Similar to CSIG_presentation_updated.awa_.sun_.yin_.breakingbad.20180512.pdf (20)

Machine Learning and Blockchain by Director of Product at Target
Machine Learning and Blockchain by Director of Product at TargetMachine Learning and Blockchain by Director of Product at Target
Machine Learning and Blockchain by Director of Product at Target
 
ADPP: A NOVEL ANOMALY DETECTION AND PRIVACY-PRESERVING FRAMEWORK USING BLOCKC...
ADPP: A NOVEL ANOMALY DETECTION AND PRIVACY-PRESERVING FRAMEWORK USING BLOCKC...ADPP: A NOVEL ANOMALY DETECTION AND PRIVACY-PRESERVING FRAMEWORK USING BLOCKC...
ADPP: A NOVEL ANOMALY DETECTION AND PRIVACY-PRESERVING FRAMEWORK USING BLOCKC...
 
Blockchain technology overview
Blockchain technology overviewBlockchain technology overview
Blockchain technology overview
 
Types of Blockchain, AI and its future
Types of Blockchain, AI and its futureTypes of Blockchain, AI and its future
Types of Blockchain, AI and its future
 
Risk & challenges in virtual currencies - Cocon 2013
Risk & challenges in virtual currencies - Cocon 2013Risk & challenges in virtual currencies - Cocon 2013
Risk & challenges in virtual currencies - Cocon 2013
 
Improved Particle Swarm Optimization Based on Blockchain Mechanism for Flexib...
Improved Particle Swarm Optimization Based on Blockchain Mechanism for Flexib...Improved Particle Swarm Optimization Based on Blockchain Mechanism for Flexib...
Improved Particle Swarm Optimization Based on Blockchain Mechanism for Flexib...
 
O Bitcoin Where Art Thou? An Introduction to Cryptocurrency Analytics
O Bitcoin Where Art Thou? An Introduction to Cryptocurrency AnalyticsO Bitcoin Where Art Thou? An Introduction to Cryptocurrency Analytics
O Bitcoin Where Art Thou? An Introduction to Cryptocurrency Analytics
 
Data mining is the statistical technique of processing raw data in a structur...
Data mining is the statistical technique of processing raw data in a structur...Data mining is the statistical technique of processing raw data in a structur...
Data mining is the statistical technique of processing raw data in a structur...
 
01-pengantar.pdf
01-pengantar.pdf01-pengantar.pdf
01-pengantar.pdf
 
blockchain.pptx
blockchain.pptxblockchain.pptx
blockchain.pptx
 
E-Voting using Blockchain Technology
E-Voting using Blockchain Technology E-Voting using Blockchain Technology
E-Voting using Blockchain Technology
 
E-Voting using Blockchain Technology
E-Voting using Blockchain Technology E-Voting using Blockchain Technology
E-Voting using Blockchain Technology
 
China Digital Economy
China Digital EconomyChina Digital Economy
China Digital Economy
 
SMART Seminar Series: "Blockchain and its Applications". Presented by Prof Wi...
SMART Seminar Series: "Blockchain and its Applications". Presented by Prof Wi...SMART Seminar Series: "Blockchain and its Applications". Presented by Prof Wi...
SMART Seminar Series: "Blockchain and its Applications". Presented by Prof Wi...
 
Blockchain Technology | Bitcoin | Ethereum Coin | Cryptocurrency
Blockchain Technology | Bitcoin | Ethereum Coin | CryptocurrencyBlockchain Technology | Bitcoin | Ethereum Coin | Cryptocurrency
Blockchain Technology | Bitcoin | Ethereum Coin | Cryptocurrency
 
Blockchain technology in (life) sciences
Blockchain technology in (life) sciencesBlockchain technology in (life) sciences
Blockchain technology in (life) sciences
 
Blockchain Technology and its effect on Environment: A Comparative Study betw...
Blockchain Technology and its effect on Environment: A Comparative Study betw...Blockchain Technology and its effect on Environment: A Comparative Study betw...
Blockchain Technology and its effect on Environment: A Comparative Study betw...
 
Blockchain Technology And Cryptocurrency
Blockchain Technology And CryptocurrencyBlockchain Technology And Cryptocurrency
Blockchain Technology And Cryptocurrency
 
Blockchain Interview Questions and Answers | Blockchain Technology | Blockcha...
Blockchain Interview Questions and Answers | Blockchain Technology | Blockcha...Blockchain Interview Questions and Answers | Blockchain Technology | Blockcha...
Blockchain Interview Questions and Answers | Blockchain Technology | Blockcha...
 
Fraud Detection: A Review on Blockchain
Fraud Detection: A Review on BlockchainFraud Detection: A Review on Blockchain
Fraud Detection: A Review on Blockchain
 

Recently uploaded

Which Crypto to Buy Today for Short-Term in May-June 2024.pdf
Which Crypto to Buy Today for Short-Term in May-June 2024.pdfWhich Crypto to Buy Today for Short-Term in May-June 2024.pdf
Which Crypto to Buy Today for Short-Term in May-June 2024.pdf
Kezex (KZX)
 
Commercial Bank Economic Capsule - May 2024
Commercial Bank Economic Capsule - May 2024Commercial Bank Economic Capsule - May 2024
Commercial Bank Economic Capsule - May 2024
Commercial Bank of Ceylon PLC
 
when will pi network coin be available on crypto exchange.
when will pi network coin be available on crypto exchange.when will pi network coin be available on crypto exchange.
when will pi network coin be available on crypto exchange.
DOT TECH
 
how can I sell/buy bulk pi coins securely
how can I sell/buy bulk pi coins securelyhow can I sell/buy bulk pi coins securely
how can I sell/buy bulk pi coins securely
DOT TECH
 
Webinar Exploring DORA for Fintechs - Simont Braun
Webinar Exploring DORA for Fintechs - Simont BraunWebinar Exploring DORA for Fintechs - Simont Braun
Webinar Exploring DORA for Fintechs - Simont Braun
FinTech Belgium
 
Introduction to Value Added Tax System.ppt
Introduction to Value Added Tax System.pptIntroduction to Value Added Tax System.ppt
Introduction to Value Added Tax System.ppt
VishnuVenugopal84
 
APP I Lecture Notes to students 0f 4the year
APP I  Lecture Notes  to students 0f 4the yearAPP I  Lecture Notes  to students 0f 4the year
APP I Lecture Notes to students 0f 4the year
telilaalilemlem
 
The European Unemployment Puzzle: implications from population aging
The European Unemployment Puzzle: implications from population agingThe European Unemployment Puzzle: implications from population aging
The European Unemployment Puzzle: implications from population aging
GRAPE
 
how can i use my minded pi coins I need some funds.
how can i use my minded pi coins I need some funds.how can i use my minded pi coins I need some funds.
how can i use my minded pi coins I need some funds.
DOT TECH
 
Introduction to Indian Financial System ()
Introduction to Indian Financial System ()Introduction to Indian Financial System ()
Introduction to Indian Financial System ()
Avanish Goel
 
The secret way to sell pi coins effortlessly.
The secret way to sell pi coins effortlessly.The secret way to sell pi coins effortlessly.
The secret way to sell pi coins effortlessly.
DOT TECH
 
how can I sell pi coins after successfully completing KYC
how can I sell pi coins after successfully completing KYChow can I sell pi coins after successfully completing KYC
how can I sell pi coins after successfully completing KYC
DOT TECH
 
Exploring Abhay Bhutada’s Views After Poonawalla Fincorp’s Collaboration With...
Exploring Abhay Bhutada’s Views After Poonawalla Fincorp’s Collaboration With...Exploring Abhay Bhutada’s Views After Poonawalla Fincorp’s Collaboration With...
Exploring Abhay Bhutada’s Views After Poonawalla Fincorp’s Collaboration With...
beulahfernandes8
 
The new type of smart, sustainable entrepreneurship and the next day | Europe...
The new type of smart, sustainable entrepreneurship and the next day | Europe...The new type of smart, sustainable entrepreneurship and the next day | Europe...
The new type of smart, sustainable entrepreneurship and the next day | Europe...
Antonis Zairis
 
what is the future of Pi Network currency.
what is the future of Pi Network currency.what is the future of Pi Network currency.
what is the future of Pi Network currency.
DOT TECH
 
where can I find a legit pi merchant online
where can I find a legit pi merchant onlinewhere can I find a legit pi merchant online
where can I find a legit pi merchant online
DOT TECH
 
Financial Assets: Debit vs Equity Securities.pptx
Financial Assets: Debit vs Equity Securities.pptxFinancial Assets: Debit vs Equity Securities.pptx
Financial Assets: Debit vs Equity Securities.pptx
Writo-Finance
 
US Economic Outlook - Being Decided - M Capital Group August 2021.pdf
US Economic Outlook - Being Decided - M Capital Group August 2021.pdfUS Economic Outlook - Being Decided - M Capital Group August 2021.pdf
US Economic Outlook - Being Decided - M Capital Group August 2021.pdf
pchutichetpong
 
Summary of financial results for 1Q2024
Summary of financial  results for 1Q2024Summary of financial  results for 1Q2024
Summary of financial results for 1Q2024
InterCars
 
Poonawalla Fincorp and IndusInd Bank Introduce New Co-Branded Credit Card
Poonawalla Fincorp and IndusInd Bank Introduce New Co-Branded Credit CardPoonawalla Fincorp and IndusInd Bank Introduce New Co-Branded Credit Card
Poonawalla Fincorp and IndusInd Bank Introduce New Co-Branded Credit Card
nickysharmasucks
 

Recently uploaded (20)

Which Crypto to Buy Today for Short-Term in May-June 2024.pdf
Which Crypto to Buy Today for Short-Term in May-June 2024.pdfWhich Crypto to Buy Today for Short-Term in May-June 2024.pdf
Which Crypto to Buy Today for Short-Term in May-June 2024.pdf
 
Commercial Bank Economic Capsule - May 2024
Commercial Bank Economic Capsule - May 2024Commercial Bank Economic Capsule - May 2024
Commercial Bank Economic Capsule - May 2024
 
when will pi network coin be available on crypto exchange.
when will pi network coin be available on crypto exchange.when will pi network coin be available on crypto exchange.
when will pi network coin be available on crypto exchange.
 
how can I sell/buy bulk pi coins securely
how can I sell/buy bulk pi coins securelyhow can I sell/buy bulk pi coins securely
how can I sell/buy bulk pi coins securely
 
Webinar Exploring DORA for Fintechs - Simont Braun
Webinar Exploring DORA for Fintechs - Simont BraunWebinar Exploring DORA for Fintechs - Simont Braun
Webinar Exploring DORA for Fintechs - Simont Braun
 
Introduction to Value Added Tax System.ppt
Introduction to Value Added Tax System.pptIntroduction to Value Added Tax System.ppt
Introduction to Value Added Tax System.ppt
 
APP I Lecture Notes to students 0f 4the year
APP I  Lecture Notes  to students 0f 4the yearAPP I  Lecture Notes  to students 0f 4the year
APP I Lecture Notes to students 0f 4the year
 
The European Unemployment Puzzle: implications from population aging
The European Unemployment Puzzle: implications from population agingThe European Unemployment Puzzle: implications from population aging
The European Unemployment Puzzle: implications from population aging
 
how can i use my minded pi coins I need some funds.
how can i use my minded pi coins I need some funds.how can i use my minded pi coins I need some funds.
how can i use my minded pi coins I need some funds.
 
Introduction to Indian Financial System ()
Introduction to Indian Financial System ()Introduction to Indian Financial System ()
Introduction to Indian Financial System ()
 
The secret way to sell pi coins effortlessly.
The secret way to sell pi coins effortlessly.The secret way to sell pi coins effortlessly.
The secret way to sell pi coins effortlessly.
 
how can I sell pi coins after successfully completing KYC
how can I sell pi coins after successfully completing KYChow can I sell pi coins after successfully completing KYC
how can I sell pi coins after successfully completing KYC
 
Exploring Abhay Bhutada’s Views After Poonawalla Fincorp’s Collaboration With...
Exploring Abhay Bhutada’s Views After Poonawalla Fincorp’s Collaboration With...Exploring Abhay Bhutada’s Views After Poonawalla Fincorp’s Collaboration With...
Exploring Abhay Bhutada’s Views After Poonawalla Fincorp’s Collaboration With...
 
The new type of smart, sustainable entrepreneurship and the next day | Europe...
The new type of smart, sustainable entrepreneurship and the next day | Europe...The new type of smart, sustainable entrepreneurship and the next day | Europe...
The new type of smart, sustainable entrepreneurship and the next day | Europe...
 
what is the future of Pi Network currency.
what is the future of Pi Network currency.what is the future of Pi Network currency.
what is the future of Pi Network currency.
 
where can I find a legit pi merchant online
where can I find a legit pi merchant onlinewhere can I find a legit pi merchant online
where can I find a legit pi merchant online
 
Financial Assets: Debit vs Equity Securities.pptx
Financial Assets: Debit vs Equity Securities.pptxFinancial Assets: Debit vs Equity Securities.pptx
Financial Assets: Debit vs Equity Securities.pptx
 
US Economic Outlook - Being Decided - M Capital Group August 2021.pdf
US Economic Outlook - Being Decided - M Capital Group August 2021.pdfUS Economic Outlook - Being Decided - M Capital Group August 2021.pdf
US Economic Outlook - Being Decided - M Capital Group August 2021.pdf
 
Summary of financial results for 1Q2024
Summary of financial  results for 1Q2024Summary of financial  results for 1Q2024
Summary of financial results for 1Q2024
 
Poonawalla Fincorp and IndusInd Bank Introduce New Co-Branded Credit Card
Poonawalla Fincorp and IndusInd Bank Introduce New Co-Branded Credit CardPoonawalla Fincorp and IndusInd Bank Introduce New Co-Branded Credit Card
Poonawalla Fincorp and IndusInd Bank Introduce New Co-Branded Credit Card
 

CSIG_presentation_updated.awa_.sun_.yin_.breakingbad.20180512.pdf

  • 1. “Breaking Bad: De-Anonymising Entity Types on the Bitcoin Blockchain Using Supervised Machine Learning ” HICSS–51, January 2018 Internet and the Digital Economy Distributed Ledger Technology: The Blockchain Minitrack Awa Sun Yin @awasunyin
  • 2. About Me Academic Background & Current Position BSc. in Business & Statistics MSc. in Information Systems Data Scientist at Chainalysis, spec. in ML/DL & Blockchain Research director at the Interchain Foundation* Research Areas of Interest Application of ML/DL to Blockchain data for clustering, de-anonymization, etc. 2nd and 3rd Generation Blockchains: Ethereum, Cøsmos* Privacy Coins & Cryptography: Monero, ZCash Haohua (Awa) Sun Yin
  • 3. Agenda 1 2 3 Background & Motivations Problem Formulation Research Question 4 5 6 Basic Concepts Methodology Final Outcomes & Reflections 7 Future Research 8 Q&A
  • 4. 1 Background & Motivations Adoption of Cryptocurrencies 2.9 to 5.8 million unique users (mostly Bitcoin), and increasing Accepted as a payment method by over 100,000 merchants (~2014) Affiliation with Illicit Activities Used for: Money laundering, scamming, terror financing Used as payment method for: cyber-extortion (ransom payments), thievery, trading illegal goods in the Darknet Need for Investigation & Compliance Tools Businesses: Required by AML and KYC regulations, need tools to assess the risk of each of their customers Law Enforcement: Need for domain specific analysis and investigation tools
  • 5. 1 2 Background & Motivations Problem Formulation BTC “Account” Other E- Payment Methods Public Key / Address Personal ID Email Address Phone Number Physical Address Transaction Hash Receiving Address Amount BTC Anyone can create anytime a BTC “account” Issued by centralized organizations
  • 6. 1 2 3 Background & Motivations Problem Formulation Research Question To what extent can we predict the category of a yet-unidentified cluster on the Bitcoin Blockchain?
  • 7. 1 2 3 Background & Motivations Problem Formulation Research Question & Objectives 4a Basic Concepts X Y 12c6DSiU4Rq3P4ZxziKxzrL5LmMBrzjrJX 1DGoMT2uz6Dg59JbwtDSp8KyjiTPR7RnVQ 1 BTC 69cea94f19a47ea0d118e80bef56fbd1e8 5d134472f18214ed8e6b236d2701f8 Category: Unknown Category: Unknown
  • 8. 1 2 3 Background & Motivations Problem Formulation Research Question & Objectives 4b Basic Concepts Alice Kraken B13HjAyJyAX6SYc6hrEqLf4naoKCWdeS 1FaEn6fGeDfDkHZvvSLsVRVxqgQt95JeyD 1JQFWqx9VFMagd8UryqiJriLmCvCuJLyf5 1PxoZtvm3sb6LnYS5iNERuJBNR5oQZxuXJ 1K4EqXxSVd6NndkmVAMsSBd1n8VJNJLwr2 10 BTC 72cme94f19a47ea0d118e80bef56fbd1e 85d134472f18214ed8e6b236d2701f8 Category: Personal Wallet Category: Exchange
  • 9. 1 2 3 Background & Motivations Problem Formulation Research Question & Objectives 4c Basic Concepts The Bitcoin Network I
  • 10. 1 2 3 Background & Motivations Problem Formulation Research Question & Objectives 4c Basic Concepts Gambling Scam Mixing Merchant Services Exchange Darknet Market The Bitcoin Network II
  • 11. 1 2 3 Background & Motivations Problem Formulation Research Question & Objectives 5a Methodology: Dataset Dataset Description § 434 Observations § 10 categories § + 200 M transactions § Period: January 2009 – May 2017
  • 12. 1 2 3 Background & Motivations Problem Formulation Research Question & Objectives 5b Data Preparation Data Preparation (Python & Numpy) Enriched data Preprocessing Feature engineering Input data Raw data Data Provider’s Tool Clustering & manual labelling API BTC Node or Core APIs Co-spend, custom heuristics, etc Cleaning & Dimensionality Reduction 76 Features [434, 77]
  • 13. 1 2 3 Background & Motivations Problem Formulation Research Question & Objectives 5c Analysis: Supervised Learning Data Analysis (Python & Scikit-Learn) Selection of algorithms Train each model with default parameters Assess & compare results Find optimal parameters Train each model with optimal parameters Oversample two minority classes Input data Find optimal parameters 1 2 3 kNN **RF ET AB DT **BG ***GB SMOTE CV-Random Search Conf, ROC & Class. Report
  • 14. 1 2 3 Background & Motivations Problem Formulation Research Question & Objectives 5d Results ROC Curve and Classification Report from Gradient Boosted Trees classifier, average confidence of 77%
  • 15. 1 2 3 Background & Motivations Problem Formulation Research Question & Objectives 5 6b Methodology Reflections Implications § It is possible to categorize unidentified clusters on Bitcoin using supervised learning § Further challenging Bitcoin’s true level of anonymity § Applicability to compliance, investigation tools Limitations § Dataset limited to 434 observations § Low performance with under-sampled categories § Features not reflecting all available data § Lack of test set
  • 16. 1 2 3 Background & Motivations Problem Formulation Research Question & Objectives 5 6b Methodology Conclusion Multiclass Classification on the Bitcoin Blockchain Goal: Predict the category of unidentified clusters Methodology: Using a dataset of already identified clusters (a total of 434 observations across 10 categories) Results: It is possible to classify with a confidence of 77% using Gradient Boosted Trees Implications of the Research A degree of de-anonymization can be achieved using this approach Considering the limitations: Paving the way for future research
  • 17. 1 2 3 Background & Motivations Problem Formulation Research Question & Objectives 5 6 Methodology Final Outcomes & Reflections 7a Future Research Expanding the Dataset of Identified Clusters Number of observations per category Number of categories Refining & Testing Alternative Methodologies Automatic feature engineering and extraction Testing more classification algorithms Binary Classification Applying the Model Use the tested methodology to uncover the Bitcoin Blockchain for multiple purposes: cybercrime investigations, compliance tools, etc.
  • 18. 1 2 3 Background & Motivations Problem Formulation Research Question & Objectives 5 7b Methodology Future Research Haohua Sun Yin & Ravi Vatrapu, A First Estimation of the Proportion of Cybercriminal Entities in the Bitcoin Ecosystem using Supervised Machine Learning, IEEE Big Data for Cybercrime Prevention, 2017, Boston MA
  • 19. 1 2 3 Background & Motivations Problem Formulation Research Question & Objectives 5 6 Methodology Final Outcomes & Reflections 8 Q&A Proposed Questions § How can we increase the dataset in both number of observations and categories? § If Bitcoin is not truly anonymous, why has it been used for nefarious activities? § Are there alternatives to Bitcoin that offer higher privacy?
  • 20. Get my pub PGP key here: awasunyin.com Let’s stay in touch! awasunyin@gmail.com