IBM Blockchain Platform - Architectural Good Practices v1.0
1. Architectural Good Practices
How to create blockchains that work well
V1.0, 4 October 2019
IBM Blockchain Platform Technical Series
Architectural Good Practices
Modeling Blockchain Applications
What’s New in Technology
Using IBM Blockchain Platform
Technical Introduction
4. A simple analogy to start with…
How would you maximize the throughput of a Ferris wheel?
https://www.maxpixel.net/Folk-Festival-Ferris-Wheel-Height-1024217
People = transactions
Stalls = client apps
Gondolas/Pods = organizations
5. Recap on Hyperledger Fabric transaction process
[Diagram: a Client Application uses the SDK to interact with a Hyperledger Fabric Network made up of Membership Services (Fabric-CA and External-CA, labelled optional), Peers (Endorser, Committer, Ledger, Chaincode, Events) and the Ordering-Service, plus an Admin.]
6. So, what do we mean by performance?
Performance can mean anything… and in a decentralized network the concept is even more vague.
• Response time or system latency?
• Endorsement & validation throughput? (for each peer organization)
• Block throughput? (for the orderer)
• Network latency?
• And what about the breaking point? When transactions start timing out…
[Diagram: Client App, PeerOrg 1, PeerOrg 2, … PeerOrg N and Orderer, annotated with example timings. Response time: 75 ms? Endorsement: 10 ms. Ordering: 20 ms. Committing: 5 ms. Other labels on the diagram: 20 ms, 10 ms, 10 ms, 5 ms.]
Note: Response time figures are not representative.
7. What are the Hyperledger Fabric performance levers?
[Diagram: a Peer (Ledger with Blockchain and World State, Chaincode, Events, Channels, Local MSP) with numbered callouts for the levers below.]
World State
• Indexing
• Size of payload
Chaincode
• World state queries
• Query aggregation (e.g. Sum)
• Business logic complexity
• Query to external sources (oracles)
Orderers / Channels (configtx.yaml)
• "absolute_max_bytes"
• "max_message_count"
• "timeout"
Endorsing peers
• ESCC Transaction signing
• Peer’s resources
Committing peers
• VSCC Transaction validation
• Peer’s resources
8. What are the Client Application performance levers?
[Diagram: Client Application with SDK (HFC), Local MSP, Channels and Events, with numbered callouts covering Fabric components, Fabric SDK / Client Application and Channels.]
• gRPC connection reuse
• Endorsement load balancing
– Don’t always hit the same peer
• Complex endorsement policies
– More endorsers = more invocations
• Channel Event Hubs impact
– How many events should you wait for?
– The more peers an organization has on a channel, the more events there are to listen to
• Network latency
– To the orderer
– To the peers of every org
9. Keeping up with the community…
• 2018 has been an active year for Blockchain and Hyperledger Fabric.
• Lots of work in the field of performance has been delivered, providing a constant stream of optimization opportunities and new ideas…
• As versions of the software evolve, those reports are markers of a vibrant community; however, not every tuning still applies…
https://arxiv.org/pdf/1808.08406.pdf
https://arxiv.org/pdf/1801.10228.pdf
https://arxiv.org/pdf/1805.11390.pdf
https://arxiv.org/pdf/1901.00910.pdf
10. The context of blockchain performance
Performance must be seen in context of the wider system architecture
• What external applications submit transactions?
• Are there any SLAs?
• Can they process events successfully?
12. Performance testing requires a systemic approach involving:
• Many test cycles
• High volume of data to process
• Multiple configurations to support
Pick the right tool! Without the proper tools, progress will be slow...
Things to look for in a Blockchain performance tool:
• Deployment Tool: agility in deployment
• Load Generator Tool: efficient load generation
• System Monitoring Tool: capturing the right data
• Data Analysis: from insight to action
13. What are your options?
Each option is assessed for Deployment, Load Generator, Monitoring and Data analysis (+ strength, - weakness).
Hyperledger Performance Traffic Engine (PTE)
https://github.com/hyperledger/fabric-test/tree/master/tools/PTE
• Deployment: + Multiple deployment options (K8S, Cello, Cloud); - Complex to manage and configure
• Load Generator: + Aligned with Fabric release; - Does not use your client application
• Monitoring: + Basic metrics are provided out of the box; - Missing advanced metrics (ESCC, VSCC)
• Data analysis: + All the analysis you can think of; - Upfront investment
Hyperledger Caliper
https://hyperledger.github.io/caliper/
• Deployment: + Pre-defined topologies; - Work required for distributed network topologies
• Load Generator: + Supports distributed clients; - Does not use your client application
• Monitoring: + Basic metrics are provided out of the box; - Limited support for remote topologies (Docker stats)
• Data analysis: + All the analysis you can think of; - Upfront investment
Custom
• Deployment: + Perfect for existing project topologies; - Upfront time investment to create topology
• Load Generator: + Complete flexibility in generating the load; - Upfront time investment to ensure your tool is accurate
• Monitoring: + All the metrics you can think of; - Upfront investment
• Data analysis: + All the analysis you can think of; - Upfront investment
14. Overview of Hyperledger Caliper
• Generic framework that supports many blockchain technologies
• Benchmark layer defines the test cases to invoke
• Blockchain north-bound interfaces (NBIs) define the integration point with the underlying blockchain (e.g. chaincode deploy)
• Resource Monitor currently supports:
– Local processes (e.g. client app)
– Remote/local Docker containers (e.g. peers)
https://hyperledger.github.io/caliper/docs/2_Architecture.html
15. Structure of a Hyperledger Caliper project
The two things to define:
1. Network configuration
2. Benchmark tests
16. Output of a Hyperledger Caliper test run
https://hyperledger.github.io/caliper/assets/img/report.png
17. Use of custom performance test harness
• What if you want to test your Hyperledger Fabric client app?
• This is where a simple test harness can be useful!
– Creates a series of promises invoking your Node.js client library
– Captures execution time and measures throughput
– Uses Node.js child processes to create as many clients as required
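The harness idea can be sketched as follows. This is a minimal illustration in Python, with asyncio tasks standing in for the promise-based Node.js approach described on the slide. `submit_transaction` is a hypothetical placeholder for a call into your own client library, and the 10 ms sleep is an assumed response time, not a measured one.

```python
import asyncio
import time


async def submit_transaction(tx_id: int) -> int:
    """Hypothetical stand-in for one call into your client library."""
    await asyncio.sleep(0.01)  # assumed end-to-end response time
    return tx_id


async def run_load(total_tx: int, concurrency: int) -> dict:
    """Submit total_tx transactions with at most `concurrency` in flight."""
    limiter = asyncio.Semaphore(concurrency)
    latencies = []

    async def timed_call(tx_id: int) -> None:
        async with limiter:
            started = time.perf_counter()
            await submit_transaction(tx_id)
            latencies.append(time.perf_counter() - started)

    started = time.perf_counter()
    await asyncio.gather(*(timed_call(i) for i in range(total_tx)))
    elapsed = time.perf_counter() - started
    return {
        "completed": len(latencies),
        "elapsed_s": elapsed,
        "throughput_tps": len(latencies) / elapsed,
        "avg_latency_s": sum(latencies) / len(latencies),
    }


if __name__ == "__main__":
    print(asyncio.run(run_load(total_tx=200, concurrency=20)))
```

To approximate the "many clients" step, several copies of this script could be started as separate processes, each reporting its own throughput.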
18. Leveraging the HL Fabric v1.4 metrics
• In Fabric v1.4 the peer and the orderer host an HTTP server that offers a RESTful “operations” API that includes two ways to expose metrics:
– a pull model based on Prometheus (open-source monitoring solution)
– a push model based on StatsD (statistics aggregation daemon)
https://hyperledger-fabric.readthedocs.io/en/release-1.4/metrics_reference.html
It includes many metrics that are useful in performance testing.
Summary of some of the key metrics in performance testing:
Metric Name | Description
endorser_proposal_duration | The time to complete a proposal
broadcast_validate_duration | The time to validate a transaction in seconds
blockcutter_block_fill_duration | The time from first transaction enqueuing to the block being cut in seconds
ledger_blockstorage_commit_time | Time for committing the block and private data to storage
ledger_statedb_commit_time | Time for committing block changes to state db
couchdb_processing_time | Time for the function to complete request to CouchDB
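As an illustration of how the pull model can feed a test report, the sketch below computes a mean duration per metric from Prometheus text-format samples, using the `_sum` and `_count` series that Prometheus exposes for histograms. The sample text, label names and values are made up for the example and are not real peer output. The parser also assumes samples carry no trailing timestamp.

```python
def parse_histogram_means(exposition_text: str) -> dict:
    """Return the mean observed value per histogram from *_sum / *_count samples."""
    sums, counts = {}, {}
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]
        if name.endswith("_sum"):
            key = name[: -len("_sum")]
            sums[key] = sums.get(key, 0.0) + float(value)
        elif name.endswith("_count"):
            key = name[: -len("_count")]
            counts[key] = counts.get(key, 0.0) + float(value)
    # Aggregate across label sets; ignore metrics with no observations.
    return {k: sums[k] / counts[k] for k in sums if counts.get(k)}


# Illustrative sample only: labels and values are assumptions.
SAMPLE = """\
# TYPE endorser_proposal_duration histogram
endorser_proposal_duration_sum{channel="mychannel",chaincode="mycc"} 1.5
endorser_proposal_duration_count{channel="mychannel",chaincode="mycc"} 100
ledger_statedb_commit_time_sum{channel="mychannel"} 0.8
ledger_statedb_commit_time_count{channel="mychannel"} 40
"""

if __name__ == "__main__":
    for metric, mean in parse_histogram_means(SAMPLE).items():
        print(f"{metric}: mean {mean:.4f} s")
```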
19. What we learned so far from real projects and things to watch for…
20. Optimizing the client application
1. Make sure you reuse your connections
2. Ensure your connection profile is properly configured
3. Load balance endorsement requests across multiple peers
4. Sample code is not a guarantee of scalability
5. If starting a new project, leverage the new HL Fabric v1.4 programming model!
[Figure 1: Latency per tx when not reusing client connections]
[Figure 2: Enabling the peers that need to endorse]
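Connection reuse and endorsement load balancing can be pictured with the sketch below. It is not Fabric SDK code: `PeerConnection` and `EndorserPool` are hypothetical classes, and the peer addresses are invented. The sketch shows only the pattern of opening each connection once and rotating requests across peers.

```python
import itertools


class PeerConnection:
    """Hypothetical stand-in for an expensive-to-open connection to one peer."""

    opened = 0  # how many connections were actually created

    def __init__(self, address: str) -> None:
        self.address = address
        PeerConnection.opened += 1

    def endorse(self, proposal: str) -> str:
        return f"{self.address} endorsed {proposal}"


class EndorserPool:
    """Caches one connection per peer and spreads proposals round-robin."""

    def __init__(self, addresses) -> None:
        self._connections = {}
        self._next_address = itertools.cycle(list(addresses))

    def _connection_for(self, address: str) -> PeerConnection:
        # Reuse an existing connection instead of opening one per transaction.
        if address not in self._connections:
            self._connections[address] = PeerConnection(address)
        return self._connections[address]

    def endorse(self, proposal: str) -> str:
        return self._connection_for(next(self._next_address)).endorse(proposal)


if __name__ == "__main__":
    pool = EndorserPool(["peer0.org1:7051", "peer1.org1:7051"])
    for i in range(4):
        print(pool.endorse(f"tx{i}"))
```

A real endorsement policy may require signatures from several organizations, so in practice the rotation would apply among the peers of each required organization rather than across all peers.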
21. Transactions and the Orderer
1. Understanding the effect of (configtx.yaml):
– Preferred Max Bytes: represents the threshold at which a block will be cut
– Block size
– Timeout
2. The peer and its dual personality
– Commit versus endorsement
– State validation is the most expensive operation
• Driven primarily by payload size and frequency
3. Increasing the number of endorsing peers helps throughput
[Chart series: Block commit, State commit, State validation]
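To build intuition for how the batch settings interact, here is a simplified model of block cutting. It is a sketch under stated assumptions, not the orderer's implementation. A block is cut when the message count is reached, when the next transaction would push the pending bytes past the preferred size, or when the timeout has elapsed since the first pending transaction. The timeout is checked only when the next transaction arrives, and the hard per-message limit is ignored. `BatchConfig` and `cut_blocks` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class BatchConfig:
    max_message_count: int    # cut once this many transactions are pending
    preferred_max_bytes: int  # cut before pending bytes would exceed this
    timeout_s: float          # cut this long after the first pending tx


def cut_blocks(txs, cfg):
    """txs: (arrival_time_s, size_bytes) tuples sorted by arrival time.

    Returns a list of (reason, [sizes]) tuples, one per block cut.
    """
    blocks, pending, pending_bytes, first_arrival = [], [], 0, 0.0

    def cut(reason):
        nonlocal pending, pending_bytes
        blocks.append((reason, pending))
        pending, pending_bytes = [], 0

    for arrival, size in txs:
        if pending and arrival - first_arrival >= cfg.timeout_s:
            cut("timeout")
        if pending and pending_bytes + size > cfg.preferred_max_bytes:
            cut("bytes")
        if not pending:
            first_arrival = arrival
        pending.append(size)
        pending_bytes += size
        if len(pending) >= cfg.max_message_count:
            cut("count")
    if pending:
        cut("timeout")  # nothing else arrives, so the timer cuts the rest
    return blocks


if __name__ == "__main__":
    cfg = BatchConfig(max_message_count=3, preferred_max_bytes=100, timeout_s=2.0)
    txs = [(0.0, 10), (0.1, 10), (0.2, 10), (0.3, 60), (0.4, 60), (5.0, 10)]
    for reason, sizes in cut_blocks(txs, cfg):
        print(reason, sizes)
```

Running variations of the workload through a model like this shows why small, frequent transactions tend to be cut by count while large payloads are cut by bytes, which is one reason there is no universal "best" block size.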
22. Considerations for queries
1. Test your queries against a separate CouchDB first
– Use _explain and execution_stats to confirm the index is used
2. Avoid $or or $regex in selectors
– They ignore the index and force CouchDB to perform a full scan
3. Use pagination to improve performance
– Pagination is supported since Hyperledger Fabric v1.3
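The query checks above can be sketched as follows, assuming a standalone CouchDB that accepts Mango queries on `/{db}/_find` and `/{db}/_explain`. The helper names, the selector fields and the index names are hypothetical. The example responses are abbreviated illustrations of the `index` section of an `_explain` reply, not captured output.

```python
import json


def build_find_request(selector, index, page_size, bookmark=None):
    """Build a Mango query body for POST /{db}/_find."""
    body = {
        "selector": selector,
        "use_index": index,       # [design_doc, index_name]
        "limit": page_size,       # one page of results
        "execution_stats": True,  # ask _find to report how many docs were examined
    }
    if bookmark:
        body["bookmark"] = bookmark  # value returned with the previous page
    return body


def uses_index(explain_response) -> bool:
    """True when _explain picked a real index rather than the _all_docs fallback."""
    index = explain_response.get("index") or {}
    return bool(index) and index.get("name") != "_all_docs"


if __name__ == "__main__":
    query = build_find_request(
        {"docType": "asset", "owner": "alice"},  # equality only, no $or or $regex
        ["indexOwnerDoc", "indexOwner"],
        page_size=25,
    )
    print(json.dumps(query, indent=2))
    # Abbreviated, illustrative _explain replies:
    print(uses_index({"index": {"name": "indexOwner", "type": "json"}}))
    print(uses_index({"index": {"name": "_all_docs", "type": "special"}}))
```

Inside chaincode, the same page size and bookmark values are what the paginated rich-query call expects, so a query tuned this way against a standalone CouchDB carries over directly.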
23. It’s all about the use-case
• Consider your non-functional requirements to drive the network topology
• There is no recipe about the “best” block size or the ideal number of peers
• Leverage the new fabric programming model to simplify your client application tuning
25. Integrating with Existing Systems – Possibilities
[Diagram: a blockchain network and existing systems, linked through a Transform step, with four integration paths:]
1. System events
2. Blockchain events
3. Call into blockchain network from existing systems
4. Call out to existing systems
26. Integrating with Existing Systems – Using Middleware
• Blockchain is a network system of record
• Two-way exchange
– Events from blockchain network create actions in existing systems
– Cumulative actions in existing systems result in Blockchain interaction
• Transformation between blockchain and existing systems’ formats
– GBO, ASBO is most likely approach
– Standard approach will be for gateway products to bridge these formats
– Gateway connects to peer in blockchain network and existing systems
• Smart contracts can call out to existing systems
– Query is most likely interaction for smart decisions
• e.g. all payments made before asset transfer?
• Warning: Take care over predictability: transaction must provide same outputs each time it executes…
27. Non-determinism in blockchain
• Blockchain is a distributed processing system
– Smart contracts are run multiple times and in multiple places
– As we will see, smart contracts need to run deterministically in order for consensus to work
• Particularly when updating the world state
• It’s particularly difficult to achieve determinism with off-chain processing
– Implement oracle services that are guaranteed to be consistent for a given transaction, or
– Detect duplicates for a transaction in the blockchain, middleware or external system
Examples of non-deterministic calls: getDateTime(), getExchangeRate(), getTemperature(), random(), incrementValueInExternalSystem(…)
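The effect of non-determinism on endorsement can be shown with a toy simulation. This is a sketch, not chaincode. Each "endorser" is simply a separate execution of the same function, the write set is a plain dictionary, and the function names are hypothetical.

```python
import random


def endorse(contract, args, endorsers=3):
    """Run the contract once per endorser and collect each proposed write set."""
    return [contract(*args) for _ in range(endorsers)]


def write_sets_match(write_sets) -> bool:
    """Endorsements are only usable together if every write set is identical."""
    return all(ws == write_sets[0] for ws in write_sets)


def deterministic_transfer(balance, amount):
    # Output depends only on the inputs, so every endorser agrees.
    return {"balance": balance - amount}


def non_deterministic_transfer(balance, amount):
    # A per-execution random fee: every endorser computes a different value.
    fee = random.random()
    return {"balance": balance - amount - fee}


if __name__ == "__main__":
    print(write_sets_match(endorse(deterministic_transfer, (100, 30))))
    print(write_sets_match(endorse(non_deterministic_transfer, (100, 30))))
```

The same mismatch arises with a clock read or an exchange-rate lookup performed inside the contract, which is why the slide suggests an oracle that returns a consistent answer for a given transaction.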
28. Business Considerations
• As a B2B system, blockchain adds a number of aspects that are not typical in other projects:
– Who pays for the development and operation of the network?
– Where are the blockchain peers hosted?
– When and how do new participants join the network?
– What are the rules of confidentiality in the network?
– Who is liable for bugs in (for example) shared smart contracts?
– For private networks, what are the trusted forms of identity?
• Remember that each business network participant may have different requirements (e.g. trust)
– Evaluate the incentives of potential participants to work out a viable business model
• Mutual benefit → shared cost (e.g. sharing reference information)
• Asymmetric benefit → money as leveler (e.g. pay for access to KYC)
29. Trade-offs Between Non-Functional Requirements
Consider the trade-offs between performance, security and resiliency!
Performance
• The amount of data being shared
• Number and location of peers
• Latency and throughput
• Batching characteristics
Security
• Type of data being shared, and with whom
• How is identity achieved
• Confidentiality of transaction queries
• Who verifies (endorses) transactions
Resiliency
• Resource failure
• Malicious activity
• Non-determinism
30. Non-Functional Requirements as a slider
Each slider runs from a lower-cost setting ($) to a higher-cost setting ($$$):
• Isolation: Multi-tenant Cloud to Dedicated
• Security: Low Controls to High Controls
• Resilient Design: Single Site Development to High Availability & DR
• Dev Options: SaaS to On-Premise
• Sized For Growth: Constrained to Ready For Growth
• Componentisation: Monolithic to Highly Modular
• Re-Use: Higher re-use to Custom Build
• Production Readiness: Proven to Leading Edge
• Blockchain Network Complexity: Low to High
• Blockchain Network Security/Privacy Complexity: Low to High
• Blockchain SC Complexity: Low to High
Adjust the sliders with the client early in the project so all parties are aligned on the expectations of robustness, isolation, security controls etc., as all these factors have material impact on the cost and complexity of the solution.
31. Privacy & Confidentiality: Toolbox and Patterns
• IBM Blockchain Platform delivers the most comprehensive privacy toolbox of any blockchain technology provider.
• IBM Research is the industry-leading expert behind these technology advancements.
• Tools are often combined in patterns to achieve the P&C requirements for given industry data.
• Many patterns are in use today in production blockchain solutions: TradeLens, IBM Food Trust and other IBM-delivered solutions.
• All blockchain solutions have an accompanying application stack. At scale, P&C has been a key focus of IBM’s large program deliveries for decades.
• Sharing needs of your industry include public, dynamic, and secret bilateral.
Recommended pattern for unobservable secret bilateral sharing
All parties ensure verifiability without knowing:
• if activity is real: Activity Obfuscation (Noise)
• who is acting: Identity Obfuscation (IdMixer)
• transaction details: Data Isolation (Private Data Collections)
32. Summary
• Identify key operational considerations
• Consensus is the art of maintaining a consistent ledger
• It is possible to integrate with existing systems, but take care over determinism
• Security requirements solved through techniques such as encryption and signing
• Also consider business and non-functional requirements
34. GDPR Topics
• Summary of GDPR
• “Right to Erasure” and Blockchain
• Hyperledger Fabric Block
• Solution
• Non-Compliant Solutions
• Further Information
35. General Data Protection Regulation (GDPR) – Overview
• GDPR became enforceable on 25th May 2018
• Applies to organisations operating within the EU, and also to organisations outside the EU who offer goods/services to EU individuals
• Fines of up to €20m or 4% of global annual revenue, whichever is higher
Personal Data (PD)
• Any information relating to an identified or identifiable natural person
Data Subject
• Employee or Customer
• New rights include:
– Information about processing
– Obtain access to PD
– Request corrections to PD
– Data be erased
– Object to being used for marketing
– Restriction of processing
– Data portability
– Automated processing of PD
Data Controller
• Determines the purpose and means by which personal data is processed
• Contracts with processor must be GDPR compliant
Data Processor
• Processes data on behalf of data controller
• Legal Contract specifies GDPR duties to the controller
Organisations can be either/both.
The processor must provide sufficient guarantees they will implement measures to meet GDPR.
36. General Data Protection Regulation (GDPR) – Links
• Rights for citizens
– https://ec.europa.eu/info/law/law-topic/data-protection/reform/rights-citizens/my-rights/what-are-my-rights_en
• What is Personal Data?
– https://ec.europa.eu/info/law/law-topic/data-protection/reform/what-personal-data_en
– Examples: Names, Address, Email, ID Card number, Location Data, IP Address, Cookie ID,
Advertising ID
– https://ico.org.uk/media/for-organisations/documents/1554/determining-what-is-personal-data.pdf
• What is a data controller or processor?
– https://ec.europa.eu/info/law/law-topic/data-protection/reform/rules-business-and-organisations/obligations/controller-processor/what-data-controller-or-data-processor_en
• Official Journal
– http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32016R0679&from=EN
37. Further Reading
• UK Information Commissioner’s Office (ICO) Data Protection Act (DPA) document that helps determine if data is considered Personal Data. DPA is being superseded by GDPR:
– https://ico.org.uk/media/for-organisations/documents/1554/determining-what-is-personal-data.pdf
• DPA and GDPR:
– https://www.itgovernance.co.uk/data-protection
• Anonymisation:
– https://ico.org.uk/media/for-organisations/documents/1061/anonymisation-code.pdf
• NHS in the UK document on Anonymisation and Pseudonymisation Standard:
– https://www.nhsbsa.nhs.uk/sites/default/files/2018-03/NHSBSA%20Anonymisation%20and%20Pseudonymisation%20Standard.pdf
• WP216 opinion 05/2014 on Anonymisation Techniques (often cited in GDPR information):
– https://ec.europa.eu/justice/article-29/documentation/opinion-recommendation/files/2014/wp216_en.pdf
39. The “Right to Erasure” and blockchain
Enables an individual to request the deletion or removal of their personal data
[Diagram: a blockchain of Config Block 0 (Genesis), Config Block 1, Transaction Block 2 and Transaction Block 3, alongside the world state.]
• Transactions in the blockchain are immutable
• Data in the world state is mutable
40. Further Reading
• UK Information Commissioner’s Office (ICO) “Right to Erasure” guidance:
– https://ico.org.uk/for-organisations/guide-to-the-general-data-protection-regulation-gdpr/individual-rights/right-to-erasure
42. Summary of a Hyperledger Fabric Block
Block
• Header
– Number
– Previous hash (Merkle root)
– Data hash (Merkle root)
• Block Data (repeated per transaction)
– Signature: Client signature of transaction payload
– Payload Header: Channel Header (timestamp, tx_id, type, …), Signature Header (client certificate)
– Payload Data:
• Client certificate
• MspId
• Proposal payload
• Proposal hash
• Array of keys read (key and version of value)
• Array of keys written (key/value and delete flag)
• Array of range_queries (start/end key and list)
• Events (chaincode_id, tx_id, name, payload)
• Chaincode response (status, message, payload)
• Chaincode ID (path, name, version)
• Array of endorsers (mspid, certificate, signature)
• Block Metadata
– Array of orderers (mspid, certificate, signature)
– Array of transaction valid flags
Fields highlighted in red could potentially contain Personal Data recorded on the blockchain.
43. Block Fields and Personal Data
This slide describes the potential areas where Personal Data could be included in the block:
• Proposal payload: These are the transaction input arguments.
• Client certificate: If the client certificate is individual to a Data Subject then it could be
considered PD.
• Key: The key written to or read from the world state. If PD is used in the key name then it will be
included here.
• Value: The value of any keys written to the world state. If PD is used in the key value then it will
be included here.
• Events: Events emitted from chaincode could include PD.
• Chaincode response: Any response from invoking the chaincode could include PD.
It is assumed that endorser and orderer certificates are issued to an organization and do not
include Personal Data.
Remember that Fabric configuration blocks also contain certificates!
45. Solution – Store data off-chain
[Diagram: a blockchain of Config Block 0 (Genesis), Config Block 1, Transaction Block 2 and Transaction Block 3 with its world state, alongside an off-chain data store holding Personal Data, salt and salted hash. The salted hash stored on-chain matches the hash in the data store.]
• What
– Store Personal Data (PD) in an off-chain mutable data storage
• How
– Store only proofs as a salted hash of the PD on-chain
– The salt must be stored securely, and should be unique for each PD
– The salted hash can be stored in an off-chain data store and linked to the PD
– Salt and salted hash can be in different data stores to the PD
– Warning: Hashes and salted hashes are considered pseudonymous data and still fall under the scope of GDPR.
• Why
– PD, salted hash and salt can be deleted at any point from the data store(s), causing the hash on-chain to be anonymized
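The off-chain pattern can be sketched as below. This is a minimal illustration and has not been reviewed for compliance. `OffChainStore` is a hypothetical in-memory stand-in for the mutable data store, a plain dictionary stands in for the ledger, and SHA-256 over salt plus data is an assumed construction for the salted hash.

```python
import hashlib
import hmac
import secrets


def salted_hash(personal_data: bytes, salt: bytes) -> str:
    """Proof to store on-chain: hash of a per-record salt plus the PD."""
    return hashlib.sha256(salt + personal_data).hexdigest()


class OffChainStore:
    """Hypothetical mutable store holding the PD, its salt and the proof."""

    def __init__(self) -> None:
        self._records = {}

    def put(self, record_id: str, personal_data: bytes) -> str:
        salt = secrets.token_bytes(32)  # unique salt for each PD record
        proof = salted_hash(personal_data, salt)
        self._records[record_id] = {"pd": personal_data, "salt": salt, "proof": proof}
        return proof  # only this value is written on-chain

    def verify(self, record_id: str, on_chain_proof: str) -> bool:
        record = self._records.get(record_id)
        if record is None:
            return False  # erased: the on-chain hash can no longer be linked
        recomputed = salted_hash(record["pd"], record["salt"])
        return hmac.compare_digest(recomputed, on_chain_proof)

    def erase(self, record_id: str) -> None:
        """Right to erasure: drop PD, salt and proof together."""
        self._records.pop(record_id, None)


if __name__ == "__main__":
    store = OffChainStore()
    ledger = {"alice": store.put("alice", b"alice@example.com")}
    print(store.verify("alice", ledger["alice"]))
    store.erase("alice")
    print(store.verify("alice", ledger["alice"]))
```

After `erase`, the ledger still contains the hash, but without the salt and the PD nothing remains off-chain to recompute it from.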
46. Notes – Store data off-chain
• Alternative solutions
– There are alternative solutions that balance the right-to-erasure with checking the integrity of the PD. For example, a crypto random number (CRN) is linked with the PD, both of which are stored in an off-chain data store. The CRN is then stored on-chain. The CRN is not computed from the value of the PD (as in the case of a hash) and is perhaps easier to anonymise, but doesn’t provide proof of the actual value of PD.
– It is best practice to associate a CRN with a single PD record
• Reidentification of data
– Check the risk of reidentification of the data stored on-chain. For example, transaction correlation or improper controls that allow someone to associate the data stored on-chain with the off-chain PD.
• It is important to consult with IBM Security counsel on any solution.
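The CRN alternative described above can be sketched in a few lines. The function names and the dictionary used as the off-chain store are hypothetical, and the 32-byte length is an assumption rather than a recommendation from the source.

```python
import secrets


def register_pd(off_chain_store: dict, personal_data: str) -> str:
    """Link a fresh crypto random number (CRN) to one PD record.

    Returns the CRN, which is the only value that would be written on-chain.
    """
    crn = secrets.token_hex(32)  # random, so it reveals nothing about the PD
    off_chain_store[crn] = personal_data
    return crn


def erase_pd(off_chain_store: dict, crn: str) -> None:
    """Remove the off-chain link; the on-chain CRN then points at nothing."""
    off_chain_store.pop(crn, None)


if __name__ == "__main__":
    store = {}
    on_chain_crn = register_pd(store, "alice@example.com")
    print(on_chain_crn in store)
    erase_pd(store, on_chain_crn)
    print(on_chain_crn in store)
```

Unlike the salted hash, the CRN cannot be used to prove what the PD value was, which matches the trade-off noted above.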
48. Store Personal Data in Private Data Collections
[Diagram: a blockchain of Config Block 0 (Genesis), Config Block 1, Transaction Block 2 and Transaction Block 3 with its world state, plus a private state. PD is passed in the transient data field and stored in the private world state and private writeset storage. Hashes of PD are stored on the blockchain.]
• What
– Send Personal Data in the transient part of a blockchain transaction, save PD in private state and hashes on the blockchain
• How
– PD is sent in the transient data field to the Smart Contract (FAB-2450), data is stored in private state, hashes are stored on the channel and the writeset is stored in the private writeset storage (FAB-1151). Available in Fabric v1.2.
• Why not compliant
– FAB-1151 stores the private data writeset in storage similar to a blockchain. This private writeset storage cannot be deleted (other than through a blockToLive policy on the collection) and therefore a request to erase PD cannot be processed.
49. Private Data Collections: Further Information
• Further enhancements to Private Data Collections in Fabric will enable data in the private
writeset and world state to be deleted on demand. This may make it suitable for storing
personal data within the bounds of GDPR regulation. Currently there is no timeframe for
this additional feature. Further information can be found here:
– https://jira.hyperledger.org/browse/FAB-5097
52. Further Information
• GDPR consideration for blockchain solution architects
– https://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=84018584GBEN&
• Blockchain and GDPR
– https://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=61014461USEN