SlideShare a Scribd company logo
From CRUD to Event Sourcing
an Investible Stock Universe
O'Reilly Software Architecture Conference March 18, 2015 #oreillysacon
Marc Siegel
Team Lead (@ms_yd)
Brian Roberts
Senior Developer / Team Lead (@flicken)
TIM Group
What's in it for you?
● Answer the Unanswerable
○ both now and future
Vocabulary
1. Domain-Driven Design (DDD)
○ Event
○ Command
○ Aggregate
2. Event Sourcing (ES)
○ Projection
○ Read Model
How does Event Sourcing Work?
(Image credit: Wikipedia)
How does Event Sourcing Work?
(Quotes from Greg Young)
"State transitions are an important part of our problem
space and should be modeled within our domain"
How does Event Sourcing Work?
(Quotes from Greg Young)
"State transitions are an important part of our problem
space and should be modeled within our domain"
"Event Sourcing says all state is transient and you only
store facts."
Vocabulary
1. Domain-Driven Design (DDD)
○ Event: something that happened in the past; a fact; a state transition
○ Command
○ Aggregate
2. Event Sourcing (ES)
○ Projection
○ Read Model
Lifecycle of a Listing (Simplified)
t
1995 2000 2005 2010 2015
IPET (Pets.com)
MSFT (Microsoft)
Q: How to represent the meaningful state
transitions of the domain as events?
FB (Facebook)
Listing Lifecycle Events
IPET (Pets.com)
TickerListed
ticker: MSFT
date: 1986
TickerListed
ticker: IPET
date: 2000
TickerDelisted
ticker: IPET
date: 2000
TickerListed
ticker: FB
date: 2012
Events:
FB (Facebook)
t
1995 2000 2005 2010 2015
MSFT (Microsoft)
Determine Current State
t
1995 2000 2005 2010 2015
IPET (Pets.com)
MSFT (Microsoft)
Q: How do we
determine the state as
of a given year? Or as
of today?
FB (Facebook)
Determine Current State
t
1995 2000 2005 2010 2015
IPET (Pets.com)
MSFT (Microsoft)
Q: How do we
determine the state as
of a given year? Or as
of today?
A: "When we talk about Event
Sourcing, current state is a
left-fold of previous behaviors"
-- Greg Young
FB (Facebook)
Current State is a Left Fold of Events
● FP: Left Fold aggregates a collection via a function and an initial value
○ Ruby: [1, 2, 3].inject(0, :+) == 6 # symbol fn name
○ Scala: List(1, 2, 3).foldLeft(0)(_ + _) == 6 // anon function
● Provide an initial state s0
and a function f : (S, E) => S
● Current State after event e3
is:
○ = leftFold( [e1
, e2
, e3
], s0
, f )
○ = f( f( f( s0
, e1
), e2
), e3
)
Replay to Earlier State: 1998
t
1995 2000 2005 2010 2015
IPET (Pets.com)
MSFT (Microsoft)
State in 1998 =
[].apply(
TickerListed
ticker: MSFT
date: 1986
)
FB (Facebook)
Replay to Earlier State: 1998
t
1995 2000 2005 2010 2015
IPET (Pets.com)
MSFT (Microsoft)
MSFT
Listed
State in 1998 =
[ ]
FB (Facebook)
Replay to Earlier State: 2000
IPET (Pets.com)
MSFT
Listed
State in 2000 =
[ ] TickerListed
ticker: IPET
date: 2000
.apply( )
FB (Facebook)
MSFT (Microsoft)
t
1995 2000 2005 2010 2015
Replay to Earlier State: 2000
t
1995 2000 2005 2010 2015
IPET (Pets.com)
MSFT (Microsoft)
MSFT
Listed
IPET
Listed
State in 2000 =
[ ]
FB (Facebook)
Replay to Earlier State: 2001
t
1995 2000 2005 2010 2015
IPET (Pets.com)
MSFT (Microsoft)
MSFT
Listed
IPET
Listed
State in 2001 =
[ ] TickerDelisted
ticker: IPET
date: 2000
.apply( )
FB (Facebook)
Replay to Earlier State: 2001
t
1995 2000 2005 2010 2015
IPET (Pets.com)
MSFT (Microsoft)
MSFT
Listed
IPET
Delisted
State in 2001 =
[ ]
FB (Facebook)
Replay to Earlier State: 2012
t
1995 2000 2005 2010 2015
IPET (Pets.com)
MSFT (Microsoft)
FB (Facebook)
MSFT
Listed
IPET
Delisted
State in 2012 =
[ ] TickerListed
ticker: FB
date: 2012
.apply( )
Replay to Earlier State: 2012
t
1995 2000 2005 2010 2015
IPET (Pets.com)
MSFT (Microsoft)
MSFT
Listed
IPET
Delisted
FB
Listed
State in 2012 =
[ ]
FB (Facebook)
Review - Only the Events are Stored
t
1995 2000 2005 2010 2015
IPET (Pets.com)
MSFT (Microsoft)
TickerListed
ticker: MSFT
date: 1986
TickerListed
ticker: IPET
date: 2000
TickerDelisted
ticker: IPET
date: 2000
TickerListed
ticker: FB
date: 2012
Events:
FB (Facebook)
Potential Benefits
● Answer the unanswerable (via history replay)
● Debugging of historical states deterministically (via history replay)
● Never Lose Information (write-only store)
● Edit the Past (via new events effective at older times)
● Optimize reads (purpose-built read models)
● Enhanced Analytics (analyze all history as it occurred)
Potential Drawbacks
● Eventual Consistency
● No built-in querying of domain models (SELECT name WHERE …)
● Risks of using a new architectural pattern
● Lack of agreement on tools and infrastructure
● Increased storage requirements
What is different from CRUD?
CRUD Micro-Service
REST API
Data Store
Controllers
ORM Model ORM Model
Background
Process
Create / Update
External
Market
Data
APIs
Client #3Client #2Client #1
Update / DeleteRead #1 Read #2
ORM Model
Event-sourced Micro-Service
Read Model #1
(files in S3)
Read Model #2
(specific database)
Write
Events
Projection #1 Projection #2
Command
Processor
Domain Models (pure) Background
Process
External
Market
Data
APIs
Client #1
Read #1
Event Store
Read
Events Read
Events
Client #2
Read #2
Client #3
Update / Delete
Commands
Create / Update
Commands
Vocabulary
1. Domain-Driven Design (DDD)
○ Event: something that happened in the past; a fact; a state transition
○ Command: a request for a state transition in the domain
○ Aggregate
2. Event Sourcing (ES)
○ Projection
○ Read Model
Aside: Domain Model is Pure?
● FP: A "pure" function doesn't cause any side effects
○ No reads or writes that modify the world
○ No altering a mutable data structure
○ Substitute f(x) for its result without changing meaning of program
● An event-sourced domain can be two pure functions
○ process(currentState, command) => e1, e2, e3
○ apply (currentState, event) => nextState
● Separates the logic of the model from interactions with any changing state
in the world
What is different from CRUD?
(Quotes from Greg Young)
"The model that a client needs for the data in a distributed
system is screen-based and different than the domain
model."
Trade-offs
● ACID vs. Eventual Consistency
○ CRUD w/ ACID database
■ Once a row is written, subsequent reads reflect it
■ But: no help with domain-level consistency!
○ Event Sourced
■ Once event is written, subsequent reads reflect it
■ Projections eventually consistent
● Up-front costs
○ Domain modeling is hard!
CRUD Models of Market Data
CRUD Models of Market Data
Listing
- ticker
- exchange
- trading status
- ....
*
Fundamentals
- earnings per share
- market cap
- ...
primary
listing
Org
- name
- ...
Price
- date
- value
- currency
*
Universe
- name
- ...
**
other
listings
merged
with
spin-off
from
Answerable?
Q: Was AOL in Investible Universe in 1995?
Known
Equities
Active
Also:
● Market Cap > X
● Price > Y
● Avg Turnover > Z
● ...
Investible
2001 - Merger of AOL/Time Warner
Listings
.findByTicker("AOL")
.setTicker("TWX")
t
1995 2000 2005 2010 2015
Time Warner
Listing
ticker: AOL
trading status: Listed
....
Listing
ticker: TWX
trading status: Listed
....
Org
name: Time Warner
Org
name: America Online
Orgs
.findByName("Time Warner")
.mergeWith("America Online")
Orgs
.findByName("America Online")
.setName("AOL Time Warner)
.addListing("TWX")
Listings
.findByTicker("TWX")
.setTradingStatus("Delisted")
America Online
AOL Time
Warner Time Warner
AOL
Time Warner
2001 - Merger of AOL/Time Warner
Listings
.findByTicker("AOL")
.setTicker("TWX")
t
1995 2000 2005 2010 2015
Listing
ticker: AOL
trading status: Listed
....
Listing
ticker: TWX
trading status: Delisted
...
Org
name: Time Warner
Org
name: AOL Time Warner
Orgs
.findByName("Time Warner")
.mergeWith("America Online")
Orgs
.findByName("America Online")
.setName("AOL Time Warner)
.addListing("TWX")
Listings
.findByTicker("TWX")
.setTradingStatus("Delisted")
merged
with
America Online
Time Warner
AOL Time
Warner Time Warner
AOL
Time Warner
2003 - Name / Ticker Change
Listings
.findByTicker("AOL")
.setTicker("TWX")
t
1995 2000 2005 2010 2015
Listing
ticker: TWX (old)
trading status: Delisted
...
Org
name: Time Warner
Org
name: AOL Time Warner
Listing
ticker: AOL
trading status: Listed
....
Org
.findByName("AOL Time Warner")
.setName("Time Warner")
Listings
.findByTicker("AOL")
.setTicker("TWX")
merged
with
Time Warner
America Online
AOL Time
Warner Time Warner
AOL
Time Warner
2003 - Name / Ticker Change
Listings
.findByTicker("AOL")
.setTicker("TWX")
Listing
ticker: TWX (old)
trading status: Delisted
...
Org
name: Time Warner
Org
name: Time Warner
Listing
ticker: TWX
old ticker: AOL
trading status: Listed
Org
.findByName("AOL Time Warner")
.setName("Time Warner")
Listings
.findByTicker("AOL")
.setTicker("TWX")
merged
with
t
1995 2000 2005 2010 2015
Time Warner
AOL Time
Warner Time WarnerAmerica Online
AOL
Time Warner
2009 - AOL Spinoff
Listing
ticker: TWX (old)
trading status:
Delisted
...
Org
name: Time Warner
Org
name: Time Warner
Listing
ticker: TWX
old ticker: AOL
trading status:
Listed
merged
with
Org
.findByName("Time Warner")
.spinOff("AOL")
.addListing("AOL")
Listings
.newListing("AOL")
t
1995 2000 2005 2010 2015
Time Warner
AOL Time
Warner Time WarnerAmerica Online
AOL
Time Warner
2009 - AOL Spinoff
Listing
ticker: TWX (old)
trading status:
Delisted
...
Org
name: Time Warner
Org
name: Time Warner
Listing
ticker: TWX
old ticker: AOL
trading status:
Listed
merged
with
Org
name: AOL
Listing
ticker: AOL
trading status:
Listed
...
spin-off
from
Org
.findByName("Time Warner")
.spinOff("AOL")
.addListing("AOL")
Listings
.newListing("AOL")
t
1995 2000 2005 2010 2015
Time Warner
AOL Time
Warner Time WarnerAmerica Online
AOL
Time Warner
Answerable with CRUD Models?
Q: Was AOL in Investible Universe in 1995?
A: No and Yes
○ No current AOL org (didn't exist at time)
○ Yes former America Online (now Time Warner)
Complexity in query, requires previous states
○ Query against version columns with date ranges?
○ Query against previous versions tables?
CRUD Models - How to Update?
● Update price of TWX on 1995-01-03
○ Original: $56.22 Correct: $52.62
● Complexity in query
○ Wrong: Listings.findByTicker("TWX") // this is AOL!
○ Right: Complex historical query...
● Complexity in update
○ Org Primary Listing - any change?
○ Org Universe membership - any changes?
○ Support Two-Dimensional Time aka as-of query?
Problems
Main Problem
Unanswerable questions
● Time travel intractable
● Past not always reproducible
More Problems
● Correctness
○ Divergent interpretations of data
● Availability
○ How often can data be unavailable to clients?
● Performance
○ How fast must operations complete?
● Determinism
○ Reproducing prior states for reporting, debugging, etc.
● Auditability
○ Who changed what when and why?
Problems - Correctness
● Need a new definition of e.g. adjusted price
○ Old: unadjusted * splits * spin-offs
○ New: unadjusted * splits * spin-offs * dividends
● But...
○ Some client systems still need the old definition
○ CRUD data store didn't store the individual factors
● Common Solutions
○ Add past to relational model? Reprocess?
Problems - Availability
● How long can data be unavailable?
○ Not long - End-user client
○ Hoursto Days - Reporting
● But...
○ Most stringent of client requirements applies to all
○ Cascading failures: unavailability propagates
● Common solutions
○ bulk-heading, circuit-breakers, more servers
○ more complex than necessary?
Problems - Performance
● How fast must operations complete?
○ Writes need to keep up with input
○ Reads have varying requirements
● But..
○ Due to contention on Shared Mutable State, badly
performing Reads can impact everything else
● Common Solutions
○ Caching, sharding, more server resources
○ Trade-off with ACID consistency
Problems - Determinism
● Reproducing prior states
○ Reporting Consistently on a Past period
■ Apply adjustments only to end of the period
○ Debugging
■ Reproduce state of data in past
● But..
○ Not easy with Shared Mutable State!
● Common Solutions
○ Versioned rows, audit tables, database snapshots
Problems - Auditability
● Why did data change?
○ Attribution (source of data)
○ Security (who did it)
○ History (what and when was previous value)
● But..
○ Not easy with Shared Mutable State!
● Common Solutions
○ Versioned rows, audit tables
Event Sourced Models of Market Data
Event Sourced Listings
Listing
Aggregate
Other Reference Data...
ListingListed
ListingDelisted
ListingDeleted
Ticker ISIN
Org Name
Reference Data
ListingInitialized
ListingChanged
ListingIsinChanged
ListingTickerChanged
Listing Reference
Data Processor
Our Listing Id
Vendor Listing Id
ListingUndeleted
ListingSourceOrgIdChanged
Exch
External
Market
Data
APIs
Background
Process
Vendor Org Id
Event Sourced Orgs
Org
Aggregate
ListingListed
ListingDelisted
ListingDeleted
ListingIsinChanged
ListingRicChanged
Org Lifecycle
Data Processor
Our Org Id
Reference
Data
Reference Data
Org Name
Vendor Org Id
ListingUndeleted
ListingSourceOrgIdChanged
Listing
Reference
Data
Processor
(state…)
OrgListed
OrgDelisted
OrgDeleted
OrgUndeleted
OrgListingAdded
OrgListingRemoved
OrgEnteredUniverse
OrgLeftUniverse
OrgPrimaryListingChanged
Listing
1..n
primary
Vocabulary
1. Domain-Driven Design (DDD)
○ Event: something that happened in the past; a fact; a state transition
○ Command: a request for a state transition in the domain
○ Aggregate: domain objects in a transactionally-consistent unit
2. Event Sourcing (ES)
○ Projection
○ Read Model
Resulting Read Models
Stock Universes - Read Model
● Generate as CSV files in S3, clients retrieve via REST API
GET /universes/1995-01-03
Org Id,Known Universe?,Active Universe?,Investible Universe?,Primary Listing Id, … Ticker
"49498",true,true,true,"0x00100b000b569402","US8873173038",..."AOL"
● Problems?
○ Correctness: Interpretation only for this use case
○ Availability: No impact on other use cases
○ Performance: No read-side calculations
○ Determinism: Can re-generate from event source data
○ Auditability: Available in event source data
● Consistency is at domain level -- entire history of universes in this case
○ Generate entire history to S3 bucket, API switches buckets atomically
Answerable via Event Sourcing?
Q: Was AOL in Investible Universe in 1995?
GET /universes/1995-01-03
Org Id,Known Universe?,Active Universe?,Investible Universe?,Primary Listing Id, …, Ticker
"49498",true,true,true,"0x00100b000b569402","US8873173038",...,"AOL"
A: Yes! former America Online (now Time Warner)
Trivial query against purpose-built Read Model
Org History - Read Model
● Build directly from indexed event stream, clients retrieve via REST API
GET /orgs?ticker=TRW
[ { eventType: "ListingListed", listing_id: "1", ticker: "AOL", processedAt: "...", effectiveAt: "...", …}
{ eventType: "OrgListed", … },
{ eventType: "ListingAdded", listing_id: "1", …}, …
{ eventType: "ListingTickerChanged", listing_id: "1", old_ticker: "AOL", ticker: "TWX" …}, ... ]
● Problems?
○ Correctness: Interpretation only for this use case
○ Availability: No impact on other use cases
○ Performance: No read-side calculations
○ Determinism: Reads directly from event source data
○ Auditability: Available in event source data
● Consistency is at domain level -- entire history of single org in this case
○ Generate entire history on-the-fly directly from source (indexed!)
Vocabulary
1. Domain-Driven Design (DDD)
○ Event: something that happened in the past; a fact; a state transition
○ Command: a request for a state transition in the domain
○ Aggregate: domain objects in a transactionally-consistent unit
2. Event Sourcing (ES)
○ Projection: to derive current state from the stream of events
○ Read Model: a model of current state designed to answer a query
Read Model vs Cache
A cache is a query intermediary. It holds a previous
response until invalidated, then queries again.
A read model does not query. Applying new events to it
changes the answer it returns.
Example: count of people in room
● both may return 10 when the answer is now 11 or 9 - staleness
● cache counts the people - slow and may require locking the doors
● read model applies the entrance/exit events - like a click counter - no
impact on people nor doors
Conclusions
● Answer the "unanswerable"
● Avoid impacts on other use cases
● Keep facts that may answer future questions
Marc Siegel
@ms_yd
Brian Roberts
@flicken
Questions?
Follow up later
Events vs Audit Tables or Versioned Rows
An audit table or row version column use shared mutable state as the source
of truth, and additionally store some history. Inconsistencies are hard to fix,
edits of the past are challenging, new use cases can be challenging.
An event store is an append-only list of immutable facts. What has occurred is
recorded, and can be replayed to interpret according to a future use case.
Example: count of people in room
● audit table is a logbook in the room - query may be complex, facts
may not be consistent, depends on keeping it up to date
● versioned rows is a logbook carried by each person - same issues as
audit table
● event store is "just the facts", only interpretation changes

More Related Content

Recently uploaded

TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTier1 app
 
Advanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowAdvanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowPeter Caitens
 
how-to-download-files-safely-from-the-internet.pdf
how-to-download-files-safely-from-the-internet.pdfhow-to-download-files-safely-from-the-internet.pdf
how-to-download-files-safely-from-the-internet.pdfMehmet Akar
 
Crafting the Perfect Measurement Sheet with PLM Integration
Crafting the Perfect Measurement Sheet with PLM IntegrationCrafting the Perfect Measurement Sheet with PLM Integration
Crafting the Perfect Measurement Sheet with PLM IntegrationWave PLM
 
Agnieszka Andrzejewska - BIM School Course in Kraków
Agnieszka Andrzejewska - BIM School Course in KrakówAgnieszka Andrzejewska - BIM School Course in Kraków
Agnieszka Andrzejewska - BIM School Course in Krakówbim.edu.pl
 
Workforce Efficiency with Employee Time Tracking Software.pdf
Workforce Efficiency with Employee Time Tracking Software.pdfWorkforce Efficiency with Employee Time Tracking Software.pdf
Workforce Efficiency with Employee Time Tracking Software.pdfDeskTrack
 
StrimziCon 2024 - Transition to Apache Kafka on Kubernetes with Strimzi
StrimziCon 2024 - Transition to Apache Kafka on Kubernetes with StrimziStrimziCon 2024 - Transition to Apache Kafka on Kubernetes with Strimzi
StrimziCon 2024 - Transition to Apache Kafka on Kubernetes with Strimzisteffenkarlsson2
 
Facemoji Keyboard released its 2023 State of Emoji report, outlining the most...
Facemoji Keyboard released its 2023 State of Emoji report, outlining the most...Facemoji Keyboard released its 2023 State of Emoji report, outlining the most...
Facemoji Keyboard released its 2023 State of Emoji report, outlining the most...rajkumar669520
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2
 
Designing for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web ServicesDesigning for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web ServicesKrzysztofKkol1
 
A Guideline to Zendesk to Re:amaze Data Migration
A Guideline to Zendesk to Re:amaze Data MigrationA Guideline to Zendesk to Re:amaze Data Migration
A Guideline to Zendesk to Re:amaze Data MigrationHelp Desk Migration
 
Studiovity film pre-production and screenwriting software
Studiovity film pre-production and screenwriting softwareStudiovity film pre-production and screenwriting software
Studiovity film pre-production and screenwriting softwareinfo611746
 
The Impact of PLM Software on Fashion Production
The Impact of PLM Software on Fashion ProductionThe Impact of PLM Software on Fashion Production
The Impact of PLM Software on Fashion ProductionWave PLM
 
How to install and activate eGrabber JobGrabber
How to install and activate eGrabber JobGrabberHow to install and activate eGrabber JobGrabber
How to install and activate eGrabber JobGrabbereGrabber
 
CompTIA Security+ (Study Notes) for cs.pdf
CompTIA Security+ (Study Notes) for cs.pdfCompTIA Security+ (Study Notes) for cs.pdf
CompTIA Security+ (Study Notes) for cs.pdfFurqanuddin10
 
IT Software Development Resume, Vaibhav jha 2024
IT Software Development Resume, Vaibhav jha 2024IT Software Development Resume, Vaibhav jha 2024
IT Software Development Resume, Vaibhav jha 2024vaibhav130304
 
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product UpdatesGraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product UpdatesNeo4j
 
OpenChain @ LF Japan Executive Briefing - May 2024
OpenChain @ LF Japan Executive Briefing - May 2024OpenChain @ LF Japan Executive Briefing - May 2024
OpenChain @ LF Japan Executive Briefing - May 2024Shane Coughlan
 
Secure Software Ecosystem Teqnation 2024
Secure Software Ecosystem Teqnation 2024Secure Software Ecosystem Teqnation 2024
Secure Software Ecosystem Teqnation 2024Soroosh Khodami
 
INGKA DIGITAL: Linked Metadata by Design
INGKA DIGITAL: Linked Metadata by DesignINGKA DIGITAL: Linked Metadata by Design
INGKA DIGITAL: Linked Metadata by DesignNeo4j
 

Recently uploaded (20)

TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
 
Advanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowAdvanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should Know
 
how-to-download-files-safely-from-the-internet.pdf
how-to-download-files-safely-from-the-internet.pdfhow-to-download-files-safely-from-the-internet.pdf
how-to-download-files-safely-from-the-internet.pdf
 
Crafting the Perfect Measurement Sheet with PLM Integration
Crafting the Perfect Measurement Sheet with PLM IntegrationCrafting the Perfect Measurement Sheet with PLM Integration
Crafting the Perfect Measurement Sheet with PLM Integration
 
Agnieszka Andrzejewska - BIM School Course in Kraków
Agnieszka Andrzejewska - BIM School Course in KrakówAgnieszka Andrzejewska - BIM School Course in Kraków
Agnieszka Andrzejewska - BIM School Course in Kraków
 
Workforce Efficiency with Employee Time Tracking Software.pdf
Workforce Efficiency with Employee Time Tracking Software.pdfWorkforce Efficiency with Employee Time Tracking Software.pdf
Workforce Efficiency with Employee Time Tracking Software.pdf
 
StrimziCon 2024 - Transition to Apache Kafka on Kubernetes with Strimzi
StrimziCon 2024 - Transition to Apache Kafka on Kubernetes with StrimziStrimziCon 2024 - Transition to Apache Kafka on Kubernetes with Strimzi
StrimziCon 2024 - Transition to Apache Kafka on Kubernetes with Strimzi
 
Facemoji Keyboard released its 2023 State of Emoji report, outlining the most...
Facemoji Keyboard released its 2023 State of Emoji report, outlining the most...Facemoji Keyboard released its 2023 State of Emoji report, outlining the most...
Facemoji Keyboard released its 2023 State of Emoji report, outlining the most...
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
Designing for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web ServicesDesigning for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web Services
 
A Guideline to Zendesk to Re:amaze Data Migration
A Guideline to Zendesk to Re:amaze Data MigrationA Guideline to Zendesk to Re:amaze Data Migration
A Guideline to Zendesk to Re:amaze Data Migration
 
Studiovity film pre-production and screenwriting software
Studiovity film pre-production and screenwriting softwareStudiovity film pre-production and screenwriting software
Studiovity film pre-production and screenwriting software
 
The Impact of PLM Software on Fashion Production
The Impact of PLM Software on Fashion ProductionThe Impact of PLM Software on Fashion Production
The Impact of PLM Software on Fashion Production
 
How to install and activate eGrabber JobGrabber
How to install and activate eGrabber JobGrabberHow to install and activate eGrabber JobGrabber
How to install and activate eGrabber JobGrabber
 
CompTIA Security+ (Study Notes) for cs.pdf
CompTIA Security+ (Study Notes) for cs.pdfCompTIA Security+ (Study Notes) for cs.pdf
CompTIA Security+ (Study Notes) for cs.pdf
 
IT Software Development Resume, Vaibhav jha 2024
IT Software Development Resume, Vaibhav jha 2024IT Software Development Resume, Vaibhav jha 2024
IT Software Development Resume, Vaibhav jha 2024
 
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product UpdatesGraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
GraphSummit Stockholm - Neo4j - Knowledge Graphs and Product Updates
 
OpenChain @ LF Japan Executive Briefing - May 2024
OpenChain @ LF Japan Executive Briefing - May 2024OpenChain @ LF Japan Executive Briefing - May 2024
OpenChain @ LF Japan Executive Briefing - May 2024
 
Secure Software Ecosystem Teqnation 2024
Secure Software Ecosystem Teqnation 2024Secure Software Ecosystem Teqnation 2024
Secure Software Ecosystem Teqnation 2024
 
INGKA DIGITAL: Linked Metadata by Design
INGKA DIGITAL: Linked Metadata by DesignINGKA DIGITAL: Linked Metadata by Design
INGKA DIGITAL: Linked Metadata by Design
 

Featured

Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)contently
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024Albert Qian
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsKurio // The Social Media Age(ncy)
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Search Engine Journal
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summarySpeakerHub
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next Tessa Mero
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentLily Ray
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best PracticesVit Horky
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project managementMindGenius
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...RachelPearson36
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Applitools
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at WorkGetSmarter
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...DevGAMM Conference
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationErica Santiago
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellSaba Software
 

Featured (20)

Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy Presentation
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
 

From CRUD to Event Sourcing an Investible Stock Universe

  • 1. From CRUD to Event Sourcing an Investible Stock Universe O'Reilly Software Architecture Conference March 18, 2015 #oreillysacon
  • 2. Marc Siegel Team Lead (@ms_yd) Brian Roberts Senior Developer / Team Lead (@flicken) TIM Group
  • 3. What's in it for you? ● Answer the Unanswerable ○ both now and future
  • 4. Vocabulary 1. Domain-Driven Design (DDD) ○ Event ○ Command ○ Aggregate 2. Event Sourcing (ES) ○ Projection ○ Read Model
  • 5. How does Event Sourcing Work?
  • 7. How does Event Sourcing Work? (Quotes from Greg Young) "State transitions are an important part of our problem space and should be modeled within our domain"
  • 8. How does Event Sourcing Work? (Quotes from Greg Young) "State transitions are an important part of our problem space and should be modeled within our domain" "Event Sourcing says all state is transient and you only store facts."
  • 9. Vocabulary 1. Domain-Driven Design (DDD) ○ Event: something that happened in the past; a fact; a state transition ○ Command ○ Aggregate 2. Event Sourcing (ES) ○ Projection ○ Read Model
  • 10. Lifecycle of a Listing (Simplified) t 1995 2000 2005 2010 2015 IPET (Pets.com) MSFT (Microsoft) Q: How to represent the meaningful state transitions of the domain as events? FB (Facebook)
  • 11. Listing Lifecycle Events IPET (Pets.com) TickerListed ticker: MSFT date: 1986 TickerListed ticker: IPET date: 2000 TickerDelisted ticker: IPET date: 2000 TickerListed ticker: FB date: 2012 Events: FB (Facebook) t 1995 2000 2005 2010 2015 MSFT (Microsoft)
  • 12. Determine Current State t 1995 2000 2005 2010 2015 IPET (Pets.com) MSFT (Microsoft) Q: How do we determine the state as of a given year? Or as of today? FB (Facebook)
  • 13. Determine Current State t 1995 2000 2005 2010 2015 IPET (Pets.com) MSFT (Microsoft) Q: How do we determine the state as of a given year? Or as of today? A: "When we talk about Event Sourcing, current state is a left-fold of previous behaviors" -- Greg Young FB (Facebook)
  • 14. Current State is a Left Fold of Events ● FP: Left Fold aggregates a collection via a function and an initial value ○ Ruby: [1, 2, 3].inject(0, :+) == 6 # symbol fn name ○ Scala: List(1, 2, 3).foldLeft(0)(_ + _) == 6 // anon function ● Provide an initial state s0 and a function f : (S, E) => S ● Current State after event e3 is: ○ = leftFold( [e1 , e2 , e3 ], s0 , f ) ○ = f( f( f( s0 , e1 ), e2 ), e3 )
  • 15. Replay to Earlier State: 1998 t 1995 2000 2005 2010 2015 IPET (Pets.com) MSFT (Microsoft) State in 1998 = [].apply( TickerListed ticker: MSFT date: 1986 ) FB (Facebook)
  • 16. Replay to Earlier State: 1998 t 1995 2000 2005 2010 2015 IPET (Pets.com) MSFT (Microsoft) MSFT Listed State in 1998 = [ ] FB (Facebook)
  • 17. Replay to Earlier State: 2000 IPET (Pets.com) MSFT Listed State in 2000 = [ ] TickerListed ticker: IPET date: 2000 .apply( ) FB (Facebook) MSFT (Microsoft) t 1995 2000 2005 2010 2015
  • 18. Replay to Earlier State: 2000 t 1995 2000 2005 2010 2015 IPET (Pets.com) MSFT (Microsoft) MSFT Listed IPET Listed State in 2000 = [ ] FB (Facebook)
  • 19. Replay to Earlier State: 2001 t 1995 2000 2005 2010 2015 IPET (Pets.com) MSFT (Microsoft) MSFT Listed IPET Listed State in 2001 = [ ] TickerDelisted ticker: IPET date: 2000 .apply( ) FB (Facebook)
  • 20. Replay to Earlier State: 2001 t 1995 2000 2005 2010 2015 IPET (Pets.com) MSFT (Microsoft) MSFT Listed IPET Delisted State in 2001 = [ ] FB (Facebook)
  • 21. Replay to Earlier State: 2012 t 1995 2000 2005 2010 2015 IPET (Pets.com) MSFT (Microsoft) FB (Facebook) MSFT Listed IPET Delisted State in 2012 = [ ] TickerListed ticker: FB date: 2012 .apply( )
  • 22. Replay to Earlier State: 2012 t 1995 2000 2005 2010 2015 IPET (Pets.com) MSFT (Microsoft) MSFT Listed IPET Delisted FB Listed State in 2012 = [ ] FB (Facebook)
  • 23. Review - Only the Events are Stored t 1995 2000 2005 2010 2015 IPET (Pets.com) MSFT (Microsoft) TickerListed ticker: MSFT date: 1986 TickerListed ticker: IPET date: 2000 TickerDelisted ticker: IPET date: 2000 TickerListed ticker: FB date: 2012 Events: FB (Facebook)
  • 24. Potential Benefits ● Answer the unanswerable (via history replay) ● Debugging of historical states deterministically (via history replay) ● Never Lose Information (write-only store) ● Edit the Past (via new events effective at older times) ● Optimize reads (purpose-built read models) ● Enhanced Analytics (analyze all history as it occurred)
  • 25. Potential Drawbacks ● Eventual Consistency ● No built-in querying of domain models (SELECT name WHERE …) ● Risks of using a new architectural pattern ● Lack of agreement on tools and infrastructure ● Increased storage requirements
  • 26. What is different from CRUD?
  • 27.
  • 28. CRUD Micro-Service REST API Data Store Controllers ORM Model ORM Model Background Process Create / Update External Market Data APIs Client #3Client #2Client #1 Update / DeleteRead #1 Read #2 ORM Model
  • 29. Event-sourced Micro-Service Read Model #1 (files in S3) Read Model #2 (specific database) Write Events Projection #1 Projection #2 Command Processor Domain Models (pure) Background Process External Market Data APIs Client #1 Read #1 Event Store Read Events Read Events Client #2 Read #2 Client #3 Update / Delete Commands Create / Update Commands
  • 30. Vocabulary 1. Domain-Driven Design (DDD) ○ Event: something that happened in the past; a fact; a state transition ○ Command: a request for a state transition in the domain ○ Aggregate 2. Event Sourcing (ES) ○ Projection ○ Read Model
  • 31. Aside: Domain Model is Pure? ● FP: A "pure" function doesn't cause any side effects ○ No reads or writes that modify the world ○ No altering a mutable data structure ○ Substitute f(x) for its result without changing meaning of program ● An event-sourced domain can be two pure functions ○ process(currentState, command) => e1, e2, e3 ○ apply (currentState, event) => nextState ● Separates the logic of the model from interactions with any changing state in the world
  • 32. What is different from CRUD? (Quotes from Greg Young) "The model that a client needs for the data in a distributed system is screen-based and different than the domain model."
  • 33. Trade-offs ● ACID vs. Eventual Consistency ○ CRUD w/ ACID database ■ Once a row is written, subsequent reads reflect it ■ But: no help with domain-level consistency! ○ Event Sourced ■ Once event is written, subsequent reads reflect it ■ Projections eventually consistent ● Up-front costs ○ Domain modeling is hard!
  • 34. CRUD Models of Market Data
  • 35.
  • 36. CRUD Models of Market Data Listing - ticker - exchange - trading status - .... * Fundamentals - earnings per share - market cap - ... primary listing Org - name - ... Price - date - value - currency * Universe - name - ... ** other listings merged with spin-off from
  • 37. Answerable? Q: Was AOL in Investible Universe in 1995? Known Equities Active Also: ● Market Cap > X ● Price > Y ● Avg Turnover > Z ● ... Investible
  • 38. 2001 - Merger of AOL/Time Warner Listings .findByTicker("AOL") .setTicker("TWX") t 1995 2000 2005 2010 2015 Time Warner Listing ticker: AOL trading status: Listed .... Listing ticker: TWX trading status: Listed .... Org name: Time Warner Org name: America Online Orgs .findByName("Time Warner") .mergeWith("America Online") Orgs .findByName("America Online") .setName("AOL Time Warner) .addListing("TWX") Listings .findByTicker("TWX") .setTradingStatus("Delisted") America Online AOL Time Warner Time Warner AOL Time Warner
  • 39. 2001 - Merger of AOL/Time Warner Listings .findByTicker("AOL") .setTicker("TWX") t 1995 2000 2005 2010 2015 Listing ticker: AOL trading status: Listed .... Listing ticker: TWX trading status: Delisted ... Org name: Time Warner Org name: AOL Time Warner Orgs .findByName("Time Warner") .mergeWith("America Online") Orgs .findByName("America Online") .setName("AOL Time Warner) .addListing("TWX") Listings .findByTicker("TWX") .setTradingStatus("Delisted") merged with America Online Time Warner AOL Time Warner Time Warner AOL Time Warner
  • 40. 2003 - Name / Ticker Change Listings .findByTicker("AOL") .setTicker("TWX") t 1995 2000 2005 2010 2015 Listing ticker: TWX (old) trading status: Delisted ... Org name: Time Warner Org name: AOL Time Warner Listing ticker: AOL trading status: Listed .... Org .findByName("AOL Time Warner") .setName("Time Warner") Listings .findByTicker("AOL") .setTicker("TWX") merged with Time Warner America Online AOL Time Warner Time Warner AOL Time Warner
  • 41. 2003 - Name / Ticker Change Listings .findByTicker("AOL") .setTicker("TWX") Listing ticker: TWX (old) trading status: Delisted ... Org name: Time Warner Org name: Time Warner Listing ticker: TWX old ticker: AOL trading status: Listed Org .findByName("AOL Time Warner") .setName("Time Warner") Listings .findByTicker("AOL") .setTicker("TWX") merged with t 1995 2000 2005 2010 2015 Time Warner AOL Time Warner Time WarnerAmerica Online AOL Time Warner
  • 42. 2009 - AOL Spinoff Listing ticker: TWX (old) trading status: Delisted ... Org name: Time Warner Org name: Time Warner Listing ticker: TWX old ticker: AOL trading status: Listed merged with Org .findByName("Time Warner") .spinOff("AOL") .addListing("AOL") Listings .newListing("AOL") t 1995 2000 2005 2010 2015 Time Warner AOL Time Warner Time WarnerAmerica Online AOL Time Warner
  • 43. 2009 - AOL Spinoff Listing ticker: TWX (old) trading status: Delisted ... Org name: Time Warner Org name: Time Warner Listing ticker: TWX old ticker: AOL trading status: Listed merged with Org name: AOL Listing ticker: AOL trading status: Listed ... spin-off from Org .findByName("Time Warner") .spinOff("AOL") .addListing("AOL") Listings .newListing("AOL") t 1995 2000 2005 2010 2015 Time Warner AOL Time Warner Time WarnerAmerica Online AOL Time Warner
  • 44. Answerable with CRUD Models? Q: Was AOL in Investible Universe in 1995? A: No and Yes ○ No current AOL org (didn't exist at time) ○ Yes former America Online (now Time Warner) Complexity in query, requires previous states ○ Query against version columns with date ranges? ○ Query against previous versions tables?
  • 45. CRUD Models - How to Update? ● Update price of TWX on 1995-01-03 ○ Original: $56.22 Correct: $52.62 ● Complexity in query ○ Wrong: Listings.findByTicker("TWX") // this is AOL! ○ Right: Complex historical query... ● Complexity in update ○ Org Primary Listing - any change? ○ Org Universe membership - any changes? ○ Support Two-Dimensional Time aka as-of query?
  • 47.
  • 48. Main Problem Unanswerable questions ● Time travel intractable ● Past not always reproducible
  • 49. More Problems ● Correctness ○ Divergent interpretations of data ● Availability ○ How often can data be unavailable to clients? ● Performance ○ How fast must operations complete? ● Determinism ○ Reproducing prior states for reporting, debugging, etc. ● Auditability ○ Who changed what when and why?
  • 50. Problems - Correctness ● Need a new definition of e.g. adjusted price ○ Old: unadjusted * splits * spin-offs ○ New: unadjusted * splits * spin-offs * dividends ● But... ○ Some client systems still need the old definition ○ CRUD data store didn't store the individual factors ● Common Solutions ○ Add past to relational model? Reprocess?
  • 51. Problems - Availability ● How long can data be unavailable? ○ Not long - End-user client ○ Hoursto Days - Reporting ● But... ○ Most stringent of client requirements applies to all ○ Cascading failures: unavailability propagates ● Common solutions ○ bulk-heading, circuit-breakers, more servers ○ more complex than necessary?
  • 52. Problems - Performance ● How fast must operations complete? ○ Writes need to keep up with input ○ Reads have varying requirements ● But.. ○ Due to contention on Shared Mutable State, badly performing Reads can impact everything else ● Common Solutions ○ Caching, sharding, more server resources ○ Trade-off with ACID consistency
  • 53. Problems - Determinism ● Reproducing prior states ○ Reporting Consistently on a Past period ■ Apply adjustments only to end of the period ○ Debugging ■ Reproduce state of data in past ● But.. ○ Not easy with Shared Mutable State! ● Common Solutions ○ Versioned rows, audit tables, database snapshots
  • 54. Problems - Auditability ● Why did data change? ○ Attribution (source of data) ○ Security (who did it) ○ History (what and when was previous value) ● But.. ○ Not easy with Shared Mutable State! ● Common Solutions ○ Versioned rows, audit tables
  • 55. Event Sourced Models of Market Data
  • 56.
  • 57. Event Sourced Listings Listing Aggregate Other Reference Data... ListingListed ListingDelisted ListingDeleted Ticker ISIN Org Name Reference Data ListingInitialized ListingChanged ListingIsinChanged ListingTickerChanged Listing Reference Data Processor Our Listing Id Vendor Listing Id ListingUndeleted ListingSourceOrgIdChanged Exch External Market Data APIs Background Process Vendor Org Id
  • 58. Event Sourced Orgs Org Aggregate ListingListed ListingDelisted ListingDeleted ListingIsinChanged ListingRicChanged Org Lifecycle Data Processor Our Org Id Reference Data Reference Data Org Name Vendor Org Id ListingUndeleted ListingSourceOrgIdChanged Listing Reference Data Processor (state…) OrgListed OrgDelisted OrgDeleted OrgUndeleted OrgListingAdded OrgListingRemoved OrgEnteredUniverse OrgLeftUniverse OrgPrimaryListingChanged Listing 1..n primary
  • 59. Vocabulary 1. Domain-Driven Design (DDD) ○ Event: something that happened in the past; a fact; a state transition ○ Command: a request for a state transition in the domain ○ Aggregate: domain objects in a transactionally-consistent unit 2. Event Sourcing (ES) ○ Projection ○ Read Model
  • 61.
  • 62. Stock Universes - Read Model ● Generate as CSV files in S3, clients retrieve via REST API GET /universes/1995-01-03 Org Id,Known Universe?,Active Universe?,Investible Universe?,Primary Listing Id, … Ticker "49498",true,true,true,"0x00100b000b569402","US8873173038",..."AOL" ● Problems? ○ Correctness: Interpretation only for this use case ○ Availability: No impact on other use cases ○ Performance: No read-side calculations ○ Determinism: Can re-generate from event source data ○ Auditability: Available in event source data ● Consistency is at domain level -- entire history of universes in this case ○ Generate entire history to S3 bucket, API switches buckets atomically
  • 63. Answerable via Event Sourcing? Q: Was AOL in Investible Universe in 1995? GET /universes/1995-01-03 Org Id,Known Universe?,Active Universe?,Investible Universe?,Primary Listing Id, …, Ticker "49498",true,true,true,"0x00100b000b569402","US8873173038",...,"AOL" A: Yes! former America Online (now Time Warner) Trivial query against purpose-built Read Model
  • 64. Org History - Read Model ● Build directly from indexed event stream, clients retrieve via REST API GET /orgs?ticker=TRW [ { eventType: "ListingListed", listing_id: "1", ticker: "AOL", processedAt: "...", effectiveAt: "...", …} { eventType: "OrgListed", … }, { eventType: "ListingAdded", listing_id: "1", …}, … { eventType: "ListingTickerChanged", listing_id: "1", old_ticker: "AOL", ticker: "TWX" …}, ... ] ● Problems? ○ Correctness: Interpretation only for this use case ○ Availability: No impact on other use cases ○ Performance: No read-side calculations ○ Determinism: Reads directly from event source data ○ Auditability: Available in event source data ● Consistency is at domain level -- entire history of single org in this case ○ Generate entire history on-the-fly directly from source (indexed!)
  • 65. Vocabulary 1. Domain-Driven Design (DDD) ○ Event: something that happened in the past; a fact; a state transition ○ Command: a request for a state transition in the domain ○ Aggregate: domain objects in a transactionally-consistent unit 2. Event Sourcing (ES) ○ Projection: to derive current state from the stream of events ○ Read Model: a model of current state designed to answer a query
  • 66. Read Model vs Cache A cache is a query intermediary. It holds a previous response until invalidated, then queries again. A read model does not query. Applying new events to it changes the answer it returns. Example: count of people in room ● both may return 10 when the answer is now 11 or 9 - staleness ● cache counts the people - slow and may require locking the doors ● read model applies the entrance/exit events - like a click counter - no impact on people nor doors
  • 67.
  • 68. Conclusions ● Answer the "unanswerable" ● Avoid impacts on other use cases ● Keep facts that may answer future questions
  • 70.
  • 71. Events vs Audit Tables or Versioned Rows An audit table or row version column use shared mutable state as the source of truth, and additionally store some history. Inconsistencies are hard to fix, edits of the past are challenging, new use cases can be challenging. An event store is an append-only list of immutable facts. What has occurred is recorded, and can be replayed to interpret according to a future use case. Example: count of people in room ● audit table is a logbook in the room - query may be complex, facts may not be consistent, depends on keeping it up to date ● versioned rows is a logbook carried by each person - same issues as audit table ● event store is "just the facts", only interpretation changes