Using Akka Persistence to build
a configuration datastore
Anargyros Kiourkos – Software engineer @ CERN
Agenda
• Problem statement
• CRUD architecture, why not?
• Event sourcing & CQRS
• Actor model, Akka & Akka persistence
• Supervision & Testing
• Akka persistence in production
Problem statement
• Configuration management system for large industrial
control systems
• Thousands of devices, millions of datapoints and properties
• Complex equipment, devices with a lot of configuration data
• Full audit trail
• Historical data
• “Undo” functionality
Some numbers (one of these systems)
80 DCs (data concentrators)
21.000 devices
250.000 data-points
> 100.000.000 properties
CRUD, why not?
At least in this case…
CRUD architecture
User interface
Business layer
DAO/ORM
Database
• Usually 3-layers
• UI or any other client
• Business layer
• Data layer
• Perfectly OK for most situations
• But…
CREATE
• INSERT: Datapoint (id: 1, Hysteresis: 2.0F) is written to the database
UPDATE
• UPDATE: Datapoint (id: 1) changes from Hysteresis: 2.0F to Hysteresis: 1.0F
• The previous value (2.0F) is overwritten: data lost
Traditional solutions
• Auditing solutions (e.g. Hibernate Envers, AOP etc.)
• “Soft” deletes
• Valid solutions
• Not an inherent characteristic of the system, usually add-ons
• Separate architecture considerations
• Sometimes difficult to test
• Often, considerable amount of extra code
CRUD summary
• Data loss on every UPDATE and DELETE (maybe ok)
• Auditing is hard
• Locking contention
• Database is shared mutable state
• Object-relational impedance mismatch
• Compromise on READ-WRITE performance
Event sourcing to the rescue
Sometimes…
Event sourcing – what is it?
• Do not save current state
• Store the events that lead to current state
• Immutable events in append-only storage (event log)
• Restore state by replaying events
• Full audit trail of every change in the system
• No mapping of objects to tables
• Allows “retrofitting” new functionality onto past data
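The "restore state by replaying events" idea can be sketched in a few lines of plain Scala. This is a minimal illustration and not the system described in the talk: the event names follow the slides, while the Datapoint class and the applyEvent and replay functions are assumptions made for the example.

```scala
object ReplaySketch {
  sealed trait Event
  final case class DPCreated(id: Long, hysteresis: Float) extends Event
  final case class HysteresisUpdated(id: Long, hysteresis: Float) extends Event

  // Current state of one datapoint; None until a DPCreated event is seen.
  final case class Datapoint(id: Long, hysteresis: Float)

  // Apply a single event to the state built so far.
  def applyEvent(state: Option[Datapoint], event: Event): Option[Datapoint] =
    event match {
      case DPCreated(id, h)        => Some(Datapoint(id, h))
      case HysteresisUpdated(_, h) => state.map(_.copy(hysteresis = h))
    }

  // Restore state by replaying the append-only log from the beginning.
  def replay(log: Seq[Event]): Option[Datapoint] =
    log.foldLeft(Option.empty[Datapoint])(applyEvent)

  def main(args: Array[String]): Unit = {
    val log = Vector(DPCreated(1L, 2.0f), HysteresisUpdated(1L, 1.0f))
    println(replay(log))         // state after both events
    println(replay(log.take(1))) // state as it was before the update
  }
}
```

Replaying only a prefix of the log gives the state at an earlier point in time, which is where the historical data and audit trail come from.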
Commands and Events
• Command
• Command = Wish
• Do something => UpdateHysteresis
• Needs to be validated => ValidateHysteresis
• Can be rejected => InvalidHysteresis
• Event
• Event = Fact
• It has already happened => HysteresisUpdated
• Cannot be deleted
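A small sketch of the command/event split, assuming a hypothetical validation rule (hysteresis must not be negative). The type names come from the bullets above; the validate function and the rule itself are illustrative only.

```scala
object CommandSketch {
  // Command: a wish that may still be rejected.
  final case class UpdateHysteresis(id: Long, hysteresis: Float)
  // Event: a fact that is recorded once validation has passed.
  final case class HysteresisUpdated(id: Long, hysteresis: Float)
  // Rejection returned to the caller; nothing is stored in this case.
  final case class InvalidHysteresis(reason: String)

  // Hypothetical rule: hysteresis must not be negative.
  def validate(cmd: UpdateHysteresis): Either[InvalidHysteresis, HysteresisUpdated] =
    if (cmd.hysteresis < 0.0f)
      Left(InvalidHysteresis("hysteresis must not be negative"))
    else
      Right(HysteresisUpdated(cmd.id, cmd.hysteresis))

  def main(args: Array[String]): Unit = {
    println(validate(UpdateHysteresis(1L, 1.0f)))  // accepted: becomes an event
    println(validate(UpdateHysteresis(1L, -1.0f))) // rejected: no event is produced
  }
}
```

Only the Right side would ever reach the event log, so the log contains facts and never rejected wishes.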
Event sourcing example - create
• Command: CreateDP (id: 1, Hysteresis: 2.0F)
• Validation
• Event: DPCreated (id: 1, Hysteresis: 2.0F), persisted
• Event log: DPCreated
Event sourcing example - update
• Command: UpdateHysteresis (id: 1, Hysteresis: 1.0F)
• Validation
• Event: HysteresisUpdated (id: 1, Hysteresis: 1.0F), persisted
• Event log: DPCreated, HysteresisUpdated
Event sourcing - considerations
• Read side, eventually consistent
• Event data cannot be updated; use compensating events to “undo”
• Schema evolution (more on this later)
• Serialization overhead
• Event consistency is vital
• Increased storage requirements
• Not a “silver bullet”
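The compensating-event approach to “undo” could look roughly like this. It is a sketch under simple assumptions: a single datapoint, a log held in memory, and hypothetical helper names (current, undoLast).

```scala
object UndoSketch {
  final case class HysteresisUpdated(hysteresis: Float)

  // Replaying the log yields the latest value, or None for an empty log.
  def current(log: Vector[HysteresisUpdated]): Option[Float] =
    log.lastOption.map(_.hysteresis)

  // "Undo" appends a new event that restores the value before the last change.
  // Nothing is removed from the log, so the audit trail stays complete.
  def undoLast(log: Vector[HysteresisUpdated]): Vector[HysteresisUpdated] =
    if (log.size < 2) log // nothing earlier to go back to
    else log :+ HysteresisUpdated(log(log.size - 2).hysteresis)

  def main(args: Array[String]): Unit = {
    val log = Vector(HysteresisUpdated(2.0f), HysteresisUpdated(1.0f))
    val undone = undoLast(log)
    println(undone)          // the original two events plus one compensating event
    println(current(undone)) // the value before the last change
  }
}
```

The log grows on every undo, which is one source of the increased storage requirements listed above.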
Quiz
The largest event-sourced system in the world?
CQRS
Command Query Responsibility Segregation
How do you query an event log?
• Unrealistic to go through millions of records for a single read
• Events optimized for write operations & sequential reading
CQRS
• Segregate data operations:
• Read side => optimized for reading
• Write side => optimized for writing
• Essentially 2 or more separate domain models
• Read and Write sides of the application can be optimized &
scale independently
• No “one-size-fits-all” domain model
• Works well with ES, but not mandatory
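To make the read/write split concrete, here is a minimal sketch of a projection that turns write-side events into a query-friendly read model. The Map-based read model and the project function are assumptions for illustration; a real read side would typically be a separate datastore.

```scala
object ProjectionSketch {
  sealed trait Event
  final case class DPCreated(id: Long, hysteresis: Float) extends Event
  final case class HysteresisUpdated(id: Long, hysteresis: Float) extends Event

  // Read model: current hysteresis per datapoint id, cheap to look up.
  type ReadModel = Map[Long, Float]

  // Projection: fold each event from the write side into the read model.
  def project(model: ReadModel, event: Event): ReadModel = event match {
    case DPCreated(id, h)         => model.updated(id, h)
    case HysteresisUpdated(id, h) => model.updated(id, h)
  }

  def main(args: Array[String]): Unit = {
    val events = Vector(DPCreated(1L, 2.0f), DPCreated(2L, 0.5f), HysteresisUpdated(1L, 1.0f))
    val readModel = events.foldLeft(Map.empty[Long, Float])(project)
    println(readModel.get(1L)) // a single lookup instead of scanning the event log
  }
}
```

Because the read model is derived from events after they are written, a query may briefly see an older value; this is the eventual consistency mentioned in the considerations.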
CQRS without Event Sourcing (one approach)
[Diagram: Client, Write model, Read model, Data access layer, Database]
CQRS with Event Sourcing
[Diagram: Client, Write model, Read model, Journal, Database, Events, Projections]
CQRS - Considerations
• Eventual consistency (depends on the implementation)
• Increased storage requirements
• Can become complex
• Different mindset than CRUD
Actor model, actors and Akka
Actors everywhere…
Actor model and Actors
[Diagram: an actor system with Actor A, Actor B and Actor C, each with its own internal state and mailbox]
• The actor model is a mathematical model of concurrent computation that treats "actors" as the universal primitives of concurrent computation
• An actor is the primitive unit of computation. It receives messages and does something depending on the type of message received
Akka
• A library implementing the actor model on the JVM
• Multi-threaded behavior without atomics or locks
• Less complex than “traditional” concurrency solutions
• Supports both scale-up and scale-out
• Embraces failure
• Scala and Java APIs available
Akka and actors in 6 lines
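import akka.actor.{Actor, ActorLogging}

// Message protocol (not shown on the slide; assumed here so the example compiles)
case class UpdateState(value: Int)
case object StateUpdated

class MyActor extends Actor with ActorLogging {
  var myState = 0 // internal state, only ever touched by this actor
  override def receive: Receive = {
    case x: UpdateState =>
      myState = x.value
      sender() ! StateUpdated // acknowledge to the sender
  }
}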
Akka persistence
• Enables stateful actors to save their internal state with
events
• Events are saved in a journal
• Community plugins for
• Cassandra, MongoDB, Kafka, DynamoDB & JDBC
• Optional snapshots can be triggered on-demand
• Persistence Query allows the migration of data to the read
side of the application
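The effect of snapshots on recovery can be illustrated without Akka. The sketch below is plain Scala with hypothetical names (Snapshot, recover) and is not the Akka Persistence API; it only shows that recovery starts from the latest snapshot and replays the events stored after it.

```scala
object SnapshotSketch {
  final case class HysteresisUpdated(sequenceNr: Long, hysteresis: Float)
  final case class Snapshot(sequenceNr: Long, hysteresis: Float)

  // Returns the recovered value and how many events had to be replayed.
  def recover(snapshot: Option[Snapshot], journal: Seq[HysteresisUpdated]): (Float, Int) = {
    val fromSeqNr = snapshot.map(_.sequenceNr).getOrElse(0L)
    val start = snapshot.map(_.hysteresis).getOrElse(0.0f)
    // Only events newer than the snapshot need to be applied.
    val toReplay = journal.filter(_.sequenceNr > fromSeqNr)
    (toReplay.foldLeft(start)((_, e) => e.hysteresis), toReplay.size)
  }

  def main(args: Array[String]): Unit = {
    val journal = (1L to 5L).map(n => HysteresisUpdated(n, n.toFloat))
    println(recover(None, journal))                     // replays every event
    println(recover(Some(Snapshot(4L, 4.0f)), journal)) // replays only the event after the snapshot
  }
}
```

Both calls recover the same value; the snapshot only reduces the number of events replayed, which is what shortens startup.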
Persistent actor
sealed trait Cmd
sealed trait Event
case class UpdateHysteresis(hysteresis: Float) extends Cmd
case class HysteresisUpdated(hysteresis: Float) extends Event
case class DpState(id: Long = 1L, name: String = "my-name",
hysteresis: Float = 0.0F, events: Int = 0)
Persistent actor
class MyPersistentActor extends PersistentActor {
  override def persistenceId = "my-unique-id"
  override def receiveCommand = {
    case UpdateHysteresis(hys) => persist(HysteresisUpdated(hys)) { evt =>
      updateState(evt)
      sender() ! HysteresisUpdatedAck
      if (lastSequenceNr % snapshotInterval == 0 && lastSequenceNr != 0)
        saveSnapshot(state)
      // any other side effects
    }
  }
Persistent actor - recovery
override def receiveRecover = {
case x: HysteresisUpdated => updateState(x)
case SnapshotOffer(_, snapshot: DpState) => state = snapshot
}
Persistent actor - complete
import akka.persistence.{PersistentActor, SnapshotOffer}

sealed trait Cmd
sealed trait Event
case class UpdateHysteresis(hysteresis: Float) extends Cmd
case class HysteresisUpdated(hysteresis: Float) extends Event
case class DpState(id: Long = 1L, name: String = "my-name", hysteresis: Float = 0.0F, events: Int = 0)

class MyPersistentActor extends PersistentActor {
  override def persistenceId = "my-unique-id"

  var state = DpState()
  val snapshotInterval = 100

  // Apply an event to the current state; used for both live commands and recovery
  def updateState(event: Event): Unit = event match {
    case x: HysteresisUpdated =>
      state = state.copy(hysteresis = x.hysteresis, events = state.events + 1)
  }

  override def receiveCommand: Receive = {
    case UpdateHysteresis(hys) =>
      persist(HysteresisUpdated(hys)) { evt =>
        updateState(evt)
        // Take a snapshot every snapshotInterval events
        if (lastSequenceNr % snapshotInterval == 0 && lastSequenceNr != 0)
          saveSnapshot(state)
        // any other side effects
      }
  }

  override def receiveRecover: Receive = {
    case x: HysteresisUpdated => updateState(x)
    case SnapshotOffer(_, snapshot: DpState) => state = snapshot
  }
}
Persistence Query
• Read side of events
• Limited operations (e.g. eventsByPersistenceId)
• Unless you have very limited query needs, mostly used to populate a “query friendly” datastore (e.g. RDBMS)
• Stream and non-stream queries
• Streams allow subscribing to events from specific persistence IDs
Akka persistence considerations
Schema evolution
• As the application evolves over time, the schema evolves as well
• It should be possible to read “old” events
• Transparently promote events to the latest version
• Most cases can be handled at the serialization level
Schema evolution
• Rename fields
• Easy with IDL based serializers
• Field name is not part of the binary representation
• Add fields
• Handle in the de-serialization phase
• When replaying old events set a default value for the new field
• Delete fields (not common)
• Handle at de-serialization level (ignore)
• Remove events
• The serializer knows which events are no longer needed and skips their deserialization
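The "add fields" and "delete fields" cases can be sketched as a promotion step during deserialization. Everything here is hypothetical and independent of any particular serializer: Stored stands in for a decoded record, and the unit field with its default value is an invented example of a field added after old events were written.

```scala
object UpcastSketch {
  // Latest in-memory shape of the event. "unit" is a hypothetical field added later.
  final case class HysteresisUpdated(hysteresis: Float, unit: String)

  // Stand-in for a decoded record: raw field values keyed by field name.
  final case class Stored(fields: Map[String, String])

  // Default used when an old event was written before "unit" existed.
  val DefaultUnit = "F"

  // Promote a stored record to the latest event version.
  // Unknown (deleted) fields are simply ignored; missing new fields get a default.
  def promote(stored: Stored): Option[HysteresisUpdated] =
    for {
      raw   <- stored.fields.get("hysteresis")
      value <- scala.util.Try(raw.toFloat).toOption
    } yield HysteresisUpdated(value, stored.fields.getOrElse("unit", DefaultUnit))

  def main(args: Array[String]): Unit = {
    println(promote(Stored(Map("hysteresis" -> "1.0"))))                                  // old event: default unit
    println(promote(Stored(Map("hysteresis" -> "2.5", "unit" -> "C", "obsolete" -> "x")))) // extra field ignored
  }
}
```

Keeping this logic in the serializer means the actor only ever sees the latest event version.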
Serialization format
• By default Akka uses Java serialization (not recommended)
• Alternatives
• Google Protobuf
• Apache Thrift
• Apache Avro
• Kryo
• JSON
• Your own (not recommended)
Serialization - configuration
# Nested inside the akka { ... } section of the configuration
actor {
  serializers {
    # Define your serializer
    dpEvent = "myPackage.DpEventSerializer"
  }
  serialization-bindings {
    # Bind it to your events
    "myPackage.HysteresisUpdatedEvent" = dpEvent
  }
}
Serialization - ScalaPB
• Protocol buffer compiler for Scala
• Supports proto2 and proto3
• It generates:
• Case classes
• Parsers
• Serializers
Serialization – protobuf
// proto3 is assumed here: the field below is declared without a label
syntax = "proto3";

import "scalapb/scalapb.proto";
import "google/protobuf/wrappers.proto";

package ch.cern.ensdm;

message HysteresisUpdatedEvent {
  option (scalapb.message).extends = "myPackage.HysteresisUpdatedEvent";
  float hysteresis = 1;
}
Fault tolerance
Actor supervision
• Akka uses separate flows to handle normal & recovery code
• Normal flow handles normal messages
• Recovery flow consists of actors that monitor actors in the normal
flow (supervisors)
• Akka enforces parental supervision
• 3 types of failures:
1. Systematic (e.g. programming error)
2. Transient (e.g. unavailable resources, database etc.)
3. Corrupt internal state
• Various recovery strategies
Backoff supervisor
• Recommended to use a backoff supervisor with persistence
• When a higher-level actor restarts, all its children restart as well
• Depending on the level, this can mean thousands of actors restarting at the same time
• A backoff supervisor can vary the intervals between restarts
Backoff supervisor
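import akka.pattern.{Backoff, BackoffSupervisor}
import scala.concurrent.duration._

// Props for a supervisor that restarts the child after an increasing delay.
// childProps is the Props of the supervised persistent actor, defined elsewhere.
val supervisor = BackoffSupervisor.props(
  Backoff.onStop(
    childProps,
    childName = "myActor",
    minBackoff = 3.seconds,
    maxBackoff = 30.seconds,
    randomFactor = 0.2 // adds up to 20% random variation to each delay
  ))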
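To get a feel for the parameters above, the following sketch computes restart delays under the assumption that the delay doubles after every restart, is capped at maxBackoff, and is then stretched by a random share of randomFactor. It is an illustration of the idea, not Akka's implementation; the function names are made up for the example.

```scala
object BackoffSketch {
  // Delay before restart number `restartCount` (0-based), without randomness:
  // the minimum doubles on every restart until it reaches the maximum.
  def baseDelayMillis(restartCount: Int, minMillis: Long, maxMillis: Long): Long = {
    val doubled = minMillis.toDouble * math.pow(2.0, restartCount.toDouble)
    if (doubled >= maxMillis.toDouble) maxMillis else math.round(doubled)
  }

  // Stretch the delay by up to `randomFactor` (0.2 = 20%); `rnd` is a number in [0, 1).
  def withJitter(baseMillis: Long, randomFactor: Double, rnd: Double): Long =
    math.round(baseMillis * (1.0 + rnd * randomFactor))

  def main(args: Array[String]): Unit = {
    val random = new scala.util.Random()
    (0 to 5).foreach { n =>
      val base = baseDelayMillis(n, 3000L, 30000L)
      println(s"restart $n: base $base ms, jittered ${withJitter(base, 0.2, random.nextDouble())} ms")
    }
  }
}
```

The random part is what keeps thousands of children from hitting the journal at the same instant after a common failure.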
Testing Akka applications
Testing Actors
• Akka TestKit
• Allows asynchronous testing in a controlled environment
• akka-persistence-inmemory plugin
• Stores journal and snapshot messages in-memory
• In general, the testing procedure is:
1. Instantiate the actor to test
2. Send commands to the actor
3. Actor persists events in memory
4. Restart the actor
5. Check actor’s state
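The five steps can be walked through with a dependency-free simulation. This is not Akka TestKit or the akka-persistence-inmemory plugin: InMemoryJournal and Datapoint are hypothetical stand-ins that only mirror the persist, restart and check cycle.

```scala
object InMemoryTestSketch {
  final case class UpdateHysteresis(hysteresis: Float)
  final case class HysteresisUpdated(hysteresis: Float)

  // Stand-in for an in-memory journal: events live only in this buffer.
  final class InMemoryJournal {
    private val events = scala.collection.mutable.ArrayBuffer.empty[HysteresisUpdated]
    def append(e: HysteresisUpdated): Unit = { events += e; () }
    def replay: List[HysteresisUpdated] = events.toList
  }

  // Stand-in for a persistent actor: state is rebuilt from the journal on creation.
  final class Datapoint(journal: InMemoryJournal) {
    private var current: Float = 0.0f
    journal.replay.foreach(e => current = e.hysteresis) // recovery

    def handle(cmd: UpdateHysteresis): Unit = {
      val event = HysteresisUpdated(cmd.hysteresis)
      journal.append(event)      // persist first
      current = event.hysteresis // then update the state
    }
    def hysteresis: Float = current
  }

  def main(args: Array[String]): Unit = {
    val journal = new InMemoryJournal
    val first = new Datapoint(journal)     // 1. instantiate
    first.handle(UpdateHysteresis(1.0f))   // 2. send a command, 3. event lands in memory
    val restarted = new Datapoint(journal) // 4. "restart" by recovering a new instance
    println(restarted.hysteresis)          // 5. check the recovered state
  }
}
```

With the real toolkit the same shape applies, with messages and expectations replacing the direct method calls.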
Akka persistence in production
Configuration System implementation
• Domain driven actor application
• Akka persistent actors represent the current state of
configuration
• Domain driven messages notify actors of configuration
changes
• Read side updated using persistence query
Topology
[Diagram: Configuration Management System, REST API, data concentrators (DC), SCADA, SVN, Cassandra, RDBMS, AMS, other systems]
Technologies
• Akka with Cassandra for the Persistence Journal
• Play framework
• DI using Guice
• WS Library
• Eureka
• Automatic discovery, registration of data concentrators
• Serialization
• Google protobuf
Actor hierarchy
[Diagram: a DC Supervisor at the top, DC Actors below it, Device Actors under each DC Actor, and Tag Actors under each Device Actor]
Our experience - async
Our experience - Scala
Summary
• Asynchronous operations => difficult to test, debug
• Careful consideration of actor supervision strategy
• Event granularity => varies per use case
• Schema evolution => handle at serialization level
• Longer startup times => reduce snapshot interval
• If state does not fit in memory you need to use sharding
• Increased disk and memory usage
Questions?

More Related Content

What's hot

iOS for ERREST - alternative version
iOS for ERREST - alternative versioniOS for ERREST - alternative version
iOS for ERREST - alternative version
WO Community
 
Groovy concurrency
Groovy concurrencyGroovy concurrency
Groovy concurrency
Alex Miller
 
Webinar - How to Build Data Pipelines for Real-Time Applications with SMACK &...
Webinar - How to Build Data Pipelines for Real-Time Applications with SMACK &...Webinar - How to Build Data Pipelines for Real-Time Applications with SMACK &...
Webinar - How to Build Data Pipelines for Real-Time Applications with SMACK &...
DataStax
 
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
thelabdude
 
Drools 6.0 (Red Hat Summit)
Drools 6.0 (Red Hat Summit)Drools 6.0 (Red Hat Summit)
Drools 6.0 (Red Hat Summit)
Mark Proctor
 
Caching In The Cloud
Caching In The CloudCaching In The Cloud
Caching In The Cloud
Alex Miller
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team Apache
Patrick McFadin
 
Cassandra Community Webinar: Apache Cassandra Internals
Cassandra Community Webinar: Apache Cassandra InternalsCassandra Community Webinar: Apache Cassandra Internals
Cassandra Community Webinar: Apache Cassandra Internals
DataStax
 
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
Lightbend
 
Cassandra Metrics
Cassandra MetricsCassandra Metrics
Cassandra Metrics
Chris Lohfink
 
Apache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fireApache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fire
Patrick McFadin
 
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Yaroslav Tkachenko
 
Indexing in Cassandra
Indexing in CassandraIndexing in Cassandra
Indexing in Cassandra
Ed Anuff
 
MongoSF - mongodb @ foursquare
MongoSF - mongodb @ foursquareMongoSF - mongodb @ foursquare
MongoSF - mongodb @ foursquare
jorgeortiz85
 
Full Stack Scala
Full Stack ScalaFull Stack Scala
Full Stack Scala
Ramnivas Laddad
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
knowbigdata
 
Introduction to Apache Kafka- Part 2
Introduction to Apache Kafka- Part 2Introduction to Apache Kafka- Part 2
Introduction to Apache Kafka- Part 2
Knoldus Inc.
 
Networks and types - the future of Akka
Networks and types - the future of AkkaNetworks and types - the future of Akka
Networks and types - the future of Akka
Johan Andrén
 
Next generation actors with Akka
Next generation actors with AkkaNext generation actors with Akka
Next generation actors with Akka
Johan Andrén
 
OpenStack Horizon: Controlling the Cloud using Django
OpenStack Horizon: Controlling the Cloud using DjangoOpenStack Horizon: Controlling the Cloud using Django
OpenStack Horizon: Controlling the Cloud using Django
David Lapsley
 

What's hot (20)

iOS for ERREST - alternative version
iOS for ERREST - alternative versioniOS for ERREST - alternative version
iOS for ERREST - alternative version
 
Groovy concurrency
Groovy concurrencyGroovy concurrency
Groovy concurrency
 
Webinar - How to Build Data Pipelines for Real-Time Applications with SMACK &...
Webinar - How to Build Data Pipelines for Real-Time Applications with SMACK &...Webinar - How to Build Data Pipelines for Real-Time Applications with SMACK &...
Webinar - How to Build Data Pipelines for Real-Time Applications with SMACK &...
 
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
 
Drools 6.0 (Red Hat Summit)
Drools 6.0 (Red Hat Summit)Drools 6.0 (Red Hat Summit)
Drools 6.0 (Red Hat Summit)
 
Caching In The Cloud
Caching In The CloudCaching In The Cloud
Caching In The Cloud
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team Apache
 
Cassandra Community Webinar: Apache Cassandra Internals
Cassandra Community Webinar: Apache Cassandra InternalsCassandra Community Webinar: Apache Cassandra Internals
Cassandra Community Webinar: Apache Cassandra Internals
 
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
Lessons Learned From PayPal: Implementing Back-Pressure With Akka Streams And...
 
Cassandra Metrics
Cassandra MetricsCassandra Metrics
Cassandra Metrics
 
Apache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fireApache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fire
 
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
Building Scalable and Extendable Data Pipeline for Call of Duty Games: Lesson...
 
Indexing in Cassandra
Indexing in CassandraIndexing in Cassandra
Indexing in Cassandra
 
MongoSF - mongodb @ foursquare
MongoSF - mongodb @ foursquareMongoSF - mongodb @ foursquare
MongoSF - mongodb @ foursquare
 
Full Stack Scala
Full Stack ScalaFull Stack Scala
Full Stack Scala
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
Introduction to Apache Kafka- Part 2
Introduction to Apache Kafka- Part 2Introduction to Apache Kafka- Part 2
Introduction to Apache Kafka- Part 2
 
Networks and types - the future of Akka
Networks and types - the future of AkkaNetworks and types - the future of Akka
Networks and types - the future of Akka
 
Next generation actors with Akka
Next generation actors with AkkaNext generation actors with Akka
Next generation actors with Akka
 
OpenStack Horizon: Controlling the Cloud using Django
OpenStack Horizon: Controlling the Cloud using DjangoOpenStack Horizon: Controlling the Cloud using Django
OpenStack Horizon: Controlling the Cloud using Django
 

Similar to Using Akka Persistence to build a configuration datastore

Anatomy of an action
Anatomy of an actionAnatomy of an action
Anatomy of an action
Gordon Chung
 
Developing a Real-time Engine with Akka, Cassandra, and Spray
Developing a Real-time Engine with Akka, Cassandra, and SprayDeveloping a Real-time Engine with Akka, Cassandra, and Spray
Developing a Real-time Engine with Akka, Cassandra, and Spray
Jacob Park
 
Actors or Not: Async Event Architectures
Actors or Not: Async Event ArchitecturesActors or Not: Async Event Architectures
Actors or Not: Async Event Architectures
Yaroslav Tkachenko
 
Kappa Architecture on Apache Kafka and Querona: datamass.io
Kappa Architecture on Apache Kafka and Querona: datamass.ioKappa Architecture on Apache Kafka and Querona: datamass.io
Kappa Architecture on Apache Kafka and Querona: datamass.io
Piotr Czarnas
 
Streaming Data with scalaz-stream
Streaming Data with scalaz-streamStreaming Data with scalaz-stream
Streaming Data with scalaz-stream
GaryCoady
 
Event Sourcing - what could go wrong - Jfokus 2022
Event Sourcing - what could go wrong - Jfokus 2022Event Sourcing - what could go wrong - Jfokus 2022
Event Sourcing - what could go wrong - Jfokus 2022
Andrzej Ludwikowski
 
Reactive programming with examples
Reactive programming with examplesReactive programming with examples
Reactive programming with examples
Peter Lawrey
 
Deep Dive on AWS IoT
Deep Dive on AWS IoTDeep Dive on AWS IoT
Deep Dive on AWS IoT
Amazon Web Services
 
Event sourcing - what could possibly go wrong ? Devoxx PL 2021
Event sourcing  - what could possibly go wrong ? Devoxx PL 2021Event sourcing  - what could possibly go wrong ? Devoxx PL 2021
Event sourcing - what could possibly go wrong ? Devoxx PL 2021
Andrzej Ludwikowski
 
GR8Conf 2011: Grails Webflow
GR8Conf 2011: Grails WebflowGR8Conf 2011: Grails Webflow
GR8Conf 2011: Grails Webflow
GR8Conf
 
Avoiding the Pit of Despair - Event Sourcing with Akka and Cassandra
Avoiding the Pit of Despair - Event Sourcing with Akka and CassandraAvoiding the Pit of Despair - Event Sourcing with Akka and Cassandra
Avoiding the Pit of Despair - Event Sourcing with Akka and Cassandra
Luke Tillman
 
Akka london scala_user_group
Akka london scala_user_groupAkka london scala_user_group
Akka london scala_user_group
Skills Matter
 
Oracle Drivers configuration for High Availability
Oracle Drivers configuration for High AvailabilityOracle Drivers configuration for High Availability
Oracle Drivers configuration for High Availability
Ludovico Caldara
 
Architecting for Microservices Part 2
Architecting for Microservices Part 2Architecting for Microservices Part 2
Architecting for Microservices Part 2
Elana Krasner
 
AWS IoT Deep Dive
AWS IoT Deep DiveAWS IoT Deep Dive
AWS IoT Deep Dive
Kristana Kane
 
DjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling DisqusDjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling Disqus
zeeg
 
Building Stateful Microservices With Akka
Building Stateful Microservices With AkkaBuilding Stateful Microservices With Akka
Building Stateful Microservices With Akka
Yaroslav Tkachenko
 
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et KibanaJournée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
Publicis Sapient Engineering
 
Building Eventing Systems for Microservice Architecture
Building Eventing Systems for Microservice Architecture  Building Eventing Systems for Microservice Architecture
Building Eventing Systems for Microservice Architecture
Yaroslav Tkachenko
 
Stream processing - Apache flink
Stream processing - Apache flinkStream processing - Apache flink
Stream processing - Apache flink
Renato Guimaraes
 

Similar to Using Akka Persistence to build a configuration datastore (20)

Anatomy of an action
Anatomy of an actionAnatomy of an action
Anatomy of an action
 
Developing a Real-time Engine with Akka, Cassandra, and Spray
Developing a Real-time Engine with Akka, Cassandra, and SprayDeveloping a Real-time Engine with Akka, Cassandra, and Spray
Developing a Real-time Engine with Akka, Cassandra, and Spray
 
Actors or Not: Async Event Architectures
Actors or Not: Async Event ArchitecturesActors or Not: Async Event Architectures
Actors or Not: Async Event Architectures
 
Kappa Architecture on Apache Kafka and Querona: datamass.io
Kappa Architecture on Apache Kafka and Querona: datamass.ioKappa Architecture on Apache Kafka and Querona: datamass.io
Kappa Architecture on Apache Kafka and Querona: datamass.io
 
Streaming Data with scalaz-stream
Streaming Data with scalaz-streamStreaming Data with scalaz-stream
Streaming Data with scalaz-stream
 
Event Sourcing - what could go wrong - Jfokus 2022
Event Sourcing - what could go wrong - Jfokus 2022Event Sourcing - what could go wrong - Jfokus 2022
Event Sourcing - what could go wrong - Jfokus 2022
 
Reactive programming with examples
Reactive programming with examplesReactive programming with examples
Reactive programming with examples
 
Deep Dive on AWS IoT
Deep Dive on AWS IoTDeep Dive on AWS IoT
Deep Dive on AWS IoT
 
Event sourcing - what could possibly go wrong ? Devoxx PL 2021
Event sourcing  - what could possibly go wrong ? Devoxx PL 2021Event sourcing  - what could possibly go wrong ? Devoxx PL 2021
Event sourcing - what could possibly go wrong ? Devoxx PL 2021
 
GR8Conf 2011: Grails Webflow
GR8Conf 2011: Grails WebflowGR8Conf 2011: Grails Webflow
GR8Conf 2011: Grails Webflow
 
Avoiding the Pit of Despair - Event Sourcing with Akka and Cassandra
Avoiding the Pit of Despair - Event Sourcing with Akka and CassandraAvoiding the Pit of Despair - Event Sourcing with Akka and Cassandra
Avoiding the Pit of Despair - Event Sourcing with Akka and Cassandra
 
Akka london scala_user_group
Akka london scala_user_groupAkka london scala_user_group
Akka london scala_user_group
 
Oracle Drivers configuration for High Availability
Oracle Drivers configuration for High AvailabilityOracle Drivers configuration for High Availability
Oracle Drivers configuration for High Availability
 
Architecting for Microservices Part 2
Architecting for Microservices Part 2Architecting for Microservices Part 2
Architecting for Microservices Part 2
 
AWS IoT Deep Dive
AWS IoT Deep DiveAWS IoT Deep Dive
AWS IoT Deep Dive
 
DjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling DisqusDjangoCon 2010 Scaling Disqus
DjangoCon 2010 Scaling Disqus
 
Building Stateful Microservices With Akka
Building Stateful Microservices With AkkaBuilding Stateful Microservices With Akka
Building Stateful Microservices With Akka
 
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et KibanaJournée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
 
Building Eventing Systems for Microservice Architecture
Building Eventing Systems for Microservice Architecture  Building Eventing Systems for Microservice Architecture
Building Eventing Systems for Microservice Architecture
 
Stream processing - Apache flink
Stream processing - Apache flinkStream processing - Apache flink
Stream processing - Apache flink
 

Recently uploaded

Beginner's Guide to Observability@Devoxx PL 2024
Beginner's  Guide to Observability@Devoxx PL 2024Beginner's  Guide to Observability@Devoxx PL 2024
Beginner's Guide to Observability@Devoxx PL 2024
michniczscribd
 
Software Test Automation - A Comprehensive Guide on Automated Testing.pdf
Software Test Automation - A Comprehensive Guide on Automated Testing.pdfSoftware Test Automation - A Comprehensive Guide on Automated Testing.pdf
Software Test Automation - A Comprehensive Guide on Automated Testing.pdf
kalichargn70th171
 
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
Unlock the Secrets to Effortless Video Creation with Invideo: Your Ultimate G...
The Third Creative Media
 
Building the Ideal CI-CD Pipeline_ Achieving Visual Perfection
Building the Ideal CI-CD Pipeline_ Achieving Visual PerfectionBuilding the Ideal CI-CD Pipeline_ Achieving Visual Perfection
Building the Ideal CI-CD Pipeline_ Achieving Visual Perfection
Applitools
 
Folding Cheat Sheet #5 - fifth in a series
Folding Cheat Sheet #5 - fifth in a seriesFolding Cheat Sheet #5 - fifth in a series
Folding Cheat Sheet #5 - fifth in a series
Philip Schwarz
 
The Power of Visual Regression Testing_ Why It Is Critical for Enterprise App...
The Power of Visual Regression Testing_ Why It Is Critical for Enterprise App...The Power of Visual Regression Testing_ Why It Is Critical for Enterprise App...
The Power of Visual Regression Testing_ Why It Is Critical for Enterprise App...
kalichargn70th171
 
How GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdfHow GenAI Can Improve Supplier Performance Management.pdf
How GenAI Can Improve Supplier Performance Management.pdf
Zycus
 
Migration From CH 1.0 to CH 2.0 and Mule 4.6 & Java 17 Upgrade.pptx
Migration From CH 1.0 to CH 2.0 and  Mule 4.6 & Java 17 Upgrade.pptxMigration From CH 1.0 to CH 2.0 and  Mule 4.6 & Java 17 Upgrade.pptx
Migration From CH 1.0 to CH 2.0 and Mule 4.6 & Java 17 Upgrade.pptx
ervikas4
 
Refactoring legacy systems using events commands and bubble contexts
Refactoring legacy systems using events commands and bubble contextsRefactoring legacy systems using events commands and bubble contexts
Refactoring legacy systems using events commands and bubble contexts
Michał Kurzeja
 
Streamlining End-to-End Testing Automation
Streamlining End-to-End Testing AutomationStreamlining End-to-End Testing Automation
Streamlining End-to-End Testing Automation
Anand Bagmar
 
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdfThe Comprehensive Guide to Validating Audio-Visual Performances.pdf
The Comprehensive Guide to Validating Audio-Visual Performances.pdf
kalichargn70th171
 
42 Ways to Generate Real Estate Leads - Sellxpert
42 Ways to Generate Real Estate Leads - Sellxpert42 Ways to Generate Real Estate Leads - Sellxpert
42 Ways to Generate Real Estate Leads - Sellxpert
vaishalijagtap12
 
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
一比一原版(sdsu毕业证书)圣地亚哥州立大学毕业证如何办理
kgyxske
 
Best Practices & Tips for a Successful Odoo ERP Implementation
Best Practices & Tips for a Successful Odoo ERP ImplementationBest Practices & Tips for a Successful Odoo ERP Implementation
Best Practices & Tips for a Successful Odoo ERP Implementation
Envertis Software Solutions
 
🏎️Tech Transformation: DevOps Insights from the Experts 👩‍💻
🏎️Tech Transformation: DevOps Insights from the Experts 👩‍💻🏎️Tech Transformation: DevOps Insights from the Experts 👩‍💻
🏎️Tech Transformation: DevOps Insights from the Experts 👩‍💻
campbellclarkson
 
Folding Cheat Sheet #6 - sixth in a series
Folding Cheat Sheet #6 - sixth in a seriesFolding Cheat Sheet #6 - sixth in a series
Folding Cheat Sheet #6 - sixth in a series
Philip Schwarz
 
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom KittEnhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Peter Caitens
 
Secure-by-Design Using Hardware and Software Protection for FDA Compliance
Secure-by-Design Using Hardware and Software Protection for FDA ComplianceSecure-by-Design Using Hardware and Software Protection for FDA Compliance
Secure-by-Design Using Hardware and Software Protection for FDA Compliance
ICS
 
ACE - Team 24 Wrapup event at ahmedabad.
ACE - Team 24 Wrapup event at ahmedabad.ACE - Team 24 Wrapup event at ahmedabad.
ACE - Team 24 Wrapup event at ahmedabad.
Maitrey Patel
 

Using Akka Persistence to build a configuration datastore

  • 1. Using Akka Persistence to build a configuration datastore Anargyros Kiourkos – Software engineer @ CERN
  • 2. Agenda • Problem statement • CRUD architecture, why not? • Event sourcing & CQRS • Actor model, Akka & Akka persistence • Supervision & Testing • Akka persistence in production
  • 3. Problem statement • Configuration management system for large industrial control systems • Thousands of devices, millions of datapoints and properties • Complex equipment, devices with a lot of configuration data • Full audit trail • Historical data • “Undo” functionality
  • 4. Some numbers (one of these systems) • 80 DCs (data concentrators) • 21,000 devices • 250,000 datapoints • > 100,000,000 properties
  • 5. CRUD, why not? At least in this case…
  • 6. CRUD architecture (diagram: User interface → Business layer → DAO/ORM → Database) • Usually 3 layers • UI or any other client • Business layer • Data layer • Perfectly OK for most situations • But…
  • 9. Traditional solutions • Auditing solutions (e.g. Hibernate Envers, AOP etc.) • “Soft” deletes • Valid solutions • Not an inherent characteristic of the system, usually add-ons • Separate architecture considerations • Sometimes difficult to test • Often, considerable amount of extra code
  • 10. CRUD summary • Data loss on every UPDATE and DELETE (maybe ok) • Auditing is hard • Locking contention • Database is shared mutable state • Object-relational impedance mismatch • Compromise on READ-WRITE performance
  • 11. Event sourcing to the rescue Sometimes…
  • 12. Event sourcing – what is it? • Do not save current state • Store the events that lead to the current state • Immutable events in append-only storage (event log) • Restore state by replaying events • Full audit trail of every change in the system • No mapping of objects to tables • Allows “retrofitting” new functionality onto past data
  • 13. Commands and Events • Command • Command = Wish • Do something => UpdateHysteresis • Needs to be validated => ValidateHysteresis • Can be rejected => InvalidHysteresis • Event • Event = Fact • It has already happened => HysteresisUpdated • Cannot be deleted
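The command/event split above can be sketched in plain Scala. This is a minimal illustration rather than the deck's implementation: the names `UpdateHysteresis`, `HysteresisUpdated` and `InvalidHysteresis` come from the slide, while the `id` field, the `CommandValidation` wrapper and the validation rule (finite, non-negative hysteresis) are assumptions made for the example.

```scala
object CommandValidation {
  // A command is a wish: it may still be rejected.
  sealed trait Command
  final case class UpdateHysteresis(id: Long, hysteresis: Float) extends Command

  // An event is a fact: it has already happened and is never deleted.
  sealed trait Event
  final case class HysteresisUpdated(id: Long, hysteresis: Float) extends Event

  // Rejection returned to the caller; nothing is persisted in this case.
  final case class InvalidHysteresis(id: Long, reason: String)

  // Hypothetical rule: hysteresis must be a finite, non-negative number.
  def validate(cmd: Command): Either[InvalidHysteresis, Event] = cmd match {
    case UpdateHysteresis(id, h) if h.isNaN || h.isInfinite || h < 0.0f =>
      Left(InvalidHysteresis(id, s"hysteresis must be finite and >= 0, got $h"))
    case UpdateHysteresis(id, h) =>
      Right(HysteresisUpdated(id, h))
  }
}
```

Only the `Right` branch would be handed to the journal; a `Left` is reported back to the sender and leaves no trace in the event log.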
  • 14. Event sourcing example - create (diagram) • Command: CreateDP, id: 1, Hysteresis: 2.0F • Validation • Event: DPCreated, id: 1, Hysteresis: 2.0F • persist • Journal: DPCreated
  • 15. Event sourcing example - update (diagram) • Command: UpdateHysteresis, id: 1, Hysteresis: 1.0F • Validation • Event: HysteresisUpdated, id: 1, Hysteresis: 1.0F • persist • Journal: DPCreated, HysteresisUpdated
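With the journal holding `DPCreated` followed by `HysteresisUpdated`, the current state is obtained by replaying the events in order. A minimal sketch of that replay as a left fold, assuming a simplified `Datapoint` state type and an `id` field on the events (neither is taken from the deck):

```scala
object ReplayExample {
  sealed trait Event
  final case class DPCreated(id: Long, hysteresis: Float) extends Event
  final case class HysteresisUpdated(id: Long, hysteresis: Float) extends Event

  // Simplified state for illustration only.
  final case class Datapoint(id: Long, hysteresis: Float)

  // Apply one event to the state, which does not exist before DPCreated.
  def applyEvent(state: Option[Datapoint], event: Event): Option[Datapoint] = event match {
    case DPCreated(id, h)        => Some(Datapoint(id, h))
    case HysteresisUpdated(_, h) => state.map(_.copy(hysteresis = h))
  }

  // Current state is a left fold over the journal, oldest event first.
  def replay(journal: Seq[Event]): Option[Datapoint] =
    journal.foldLeft(Option.empty[Datapoint])(applyEvent)

  val journal: Vector[Event] = Vector(DPCreated(1L, 2.0f), HysteresisUpdated(1L, 1.0f))
}
```

Replaying only a prefix of the journal, for example `ReplayExample.replay(ReplayExample.journal.take(1))`, yields the state as it was before the update, which is what makes historical queries possible.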
  • 16. Event sourcing - considerations • Read side, eventually consistent • Event data cannot be updated, compensating events to “undo” • Schema evolution (more on this later) • Serialization overhead • Event consistency is vital • Increased storage requirements • Not a “silver bullet”
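Because stored events cannot be changed, "undo" has to be expressed as a new event. One possible sketch, assuming a journal that holds only `HysteresisUpdated` events for a single datapoint; the `undoLast` helper is hypothetical and not part of Akka or the deck:

```scala
object UndoExample {
  final case class HysteresisUpdated(id: Long, hysteresis: Float)

  // Appends a compensating event that restores the value in effect before
  // the latest update. The existing journal entries are never edited.
  def undoLast(journal: Vector[HysteresisUpdated]): Vector[HysteresisUpdated] =
    journal.takeRight(2) match {
      case Seq(previous, _) => journal :+ previous
      case _                => journal // fewer than two events: nothing to restore
    }
}
```

The journal grows by one entry on every undo, so the audit trail still shows both the original change and its reversal.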
  • 17. Quiz • The largest event-sourced system in the world?
  • 19. How do you query an event log? • Unrealistic to go through millions of records for a single read • Events optimized for write operations & sequential reading
  • 20. CQRS • Segregate data operations: • Read side => optimized for reading • Write side => optimized for writing • Essentially 2 or more separate domain models • Read and Write sides of the application can be optimized & scale independently • No “one-size-fits-all” domain model • Works well with ES, but not mandatory
  • 21. CQRS without Event Sourcing (one approach) (diagram labels: Client, Write model, Read model, Data access layer, Database)
  • 22. CQRS with Event Sourcing (diagram labels: Client, Write model, Journal, Events, Projections, Read model, Database)
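A projection turns the event stream from the journal into a structure that is cheap to query. The sketch below assumes the simplest possible read model, an in-memory map from datapoint id to its latest hysteresis; in a real system this role would be played by a query-friendly datastore, and all names here are illustrative.

```scala
object ProjectionExample {
  sealed trait Event { def id: Long }
  final case class DPCreated(id: Long, hysteresis: Float) extends Event
  final case class HysteresisUpdated(id: Long, hysteresis: Float) extends Event

  // Read model: latest hysteresis per datapoint id, optimized for lookups.
  type ReadModel = Map[Long, Float]

  // Fold a single event into the read model.
  def project(model: ReadModel, event: Event): ReadModel = event match {
    case DPCreated(id, h) => model.updated(id, h)
    case HysteresisUpdated(id, h) =>
      // Ignore updates for datapoints the read side has not seen yet.
      if (model.contains(id)) model.updated(id, h) else model
  }

  // Build the read model from scratch by consuming events in journal order.
  def build(events: Seq[Event]): ReadModel =
    events.foldLeft(Map.empty[Long, Float])(project)
}
```

Since the read model is derived data, it can be dropped and rebuilt from the journal at any time, for instance when a new kind of query requires a different shape.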
  • 23. CQRS - Considerations • Eventual consistency (depends on the implementation) • Increased storage requirements • Can become complex • Different mindset than CRUD
  • 24. Actor model, actors and Akka Actors everywhere…
  • 25. Actor model and Actors (diagram: an Actor System containing Actor A, Actor B and Actor C, each with its own internal state and mailbox) • The actor model is a mathematical model of concurrent computation that treats "actors" as the universal primitives of concurrent computation. • An actor is the primitive unit of computation. It receives messages and does something depending on the type of message received.
  • 26. Akka • A library implementing the actor model on the JVM • Multi-threaded behavior without atomics or locks • Less complex than “traditional” concurrency solutions • Supports both scale-up and scale-out • Embraces failure • Scala and Java APIs available
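  • 27. Akka and actors in 6 lines
      import akka.actor.{Actor, ActorLogging}

      // Message types are not shown on the slide; these shapes are assumed.
      case class UpdateState(value: Int)
      case object StateUpdated

      class MyActor extends Actor with ActorLogging {
        var myState = 0 // internal state, only ever touched by this actor

        override def receive: Receive = {
          case x: UpdateState =>
            myState = x.value
            sender() ! StateUpdated
        }
      }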
  • 28. Akka persistence • Enables stateful actors to save their internal state with events • Events are saved in a journal • Community plugins for • Cassandra, MongoDB, Kafka, DynamoDB & JDBC • Optional snapshots can be triggered on-demand • Persistence Query allows the migration of data to the read side of the application
  • 29. Persistent actor
      sealed trait Cmd
      sealed trait Event
      case class UpdateHysteresis(hysteresis: Float) extends Cmd
      case class HysteresisUpdated(hysteresis: Float) extends Event
      case class DpState(id: Long = 1L, name: String = "my-name", hysteresis: Float = 0.0F, events: Int = 0)
  • 30. Persistent actor
      class MyPersistentActor extends PersistentActor {
        override def persistenceId = "my-unique-id"

        override def receiveCommand = {
          case UpdateHysteresis(hys) =>
            persist(HysteresisUpdated(hys)) { evt =>
              updateState(evt)
              sender() ! HysteresisUpdatedAck
              if (lastSequenceNr % snapshotInterval == 0 && lastSequenceNr != 0)
                saveSnapshot(state)
              // any other side effects
            }
        }
        // excerpt: the class body continues on the following slides
  • 31. Persistent actor - recovery
      override def receiveRecover = {
        case x: HysteresisUpdated                => updateState(x)
        case SnapshotOffer(_, snapshot: DpState) => state = snapshot
      }
  • 32. Persistent actor - complete
      import akka.persistence.{PersistentActor, SnapshotOffer}

      sealed trait Cmd
      sealed trait Event
      case class UpdateHysteresis(hysteresis: Float) extends Cmd
      case class HysteresisUpdated(hysteresis: Float) extends Event
      case class DpState(id: Long = 1L, name: String = "my-name", hysteresis: Float = 0.0F, events: Int = 0)

      class MyPersistentActor extends PersistentActor {
        override def persistenceId = "my-unique-id"

        var state = DpState()
        val snapshotInterval = 100

        // Single place where events change state; used for commands and recovery.
        def updateState(event: Event): Unit = event match {
          case x: HysteresisUpdated =>
            state = state.copy(hysteresis = x.hysteresis, events = state.events + 1)
        }

        override def receiveCommand: Receive = {
          case UpdateHysteresis(hys) =>
            // The callback runs only after the event has been written to the journal.
            persist(HysteresisUpdated(hys)) { evt =>
              updateState(evt)
              if (lastSequenceNr % snapshotInterval == 0 && lastSequenceNr != 0)
                saveSnapshot(state)
              // any other side effects
            }
        }

        override def receiveRecover: Receive = {
          case x: HysteresisUpdated                => updateState(x)
          case SnapshotOffer(_, snapshot: DpState) => state = snapshot
        }
      }
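Snapshots shorten recovery: the actor starts from the latest snapshot and replays only the events stored after it. The plain-Scala sketch below mimics that behaviour without Akka so the arithmetic is easy to follow; the `Snapshot` type, the `sequenceNr` field on the event and the `recover` helper are stand-ins invented for the example, not Akka API.

```scala
object SnapshotRecoveryExample {
  final case class HysteresisUpdated(sequenceNr: Long, hysteresis: Float)
  final case class DpState(hysteresis: Float = 0.0f, events: Int = 0)
  // State as it was after the event with the given sequence number.
  final case class Snapshot(sequenceNr: Long, state: DpState)

  def updateState(state: DpState, evt: HysteresisUpdated): DpState =
    state.copy(hysteresis = evt.hysteresis, events = state.events + 1)

  // Start from the snapshot if there is one, then replay only newer events.
  def recover(snapshot: Option[Snapshot], journal: Seq[HysteresisUpdated]): DpState = {
    val start = snapshot.map(_.state).getOrElse(DpState())
    val from  = snapshot.map(_.sequenceNr).getOrElse(0L)
    journal.filter(_.sequenceNr > from).foldLeft(start)(updateState)
  }
}
```

Recovering with and without the snapshot must give the same state; the snapshot only reduces how many events have to be replayed.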
  • 33. Persistence Query • Read side of events • Limited operations (e.g. eventsByPersistenceId) • Unless you have very limited query needs, mostly used to populate a “query friendly” datastore (e.g. RDBMS) • Stream and non-stream queries • Stream queries allow subscribing to events from specific persistence IDs
  • 35. Schema evolution • As the application evolves over time, the schema evolves as well • It should be possible to read “old” events • Transparently promote events to the latest version • Most cases can be handled at the serialization level
  • 36. Schema evolution • Rename fields • Easy with IDL based serializers • Field name is not part of the binary representation • Add fields • Handle in the de-serialization phase • When replaying old events set a default value for the new field • Delete fields (not common) • Handle at de-serialization level (ignore) • Remove events • The serializer is aware of events that are no longer needed and skips their de-serialization
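The "add fields" case can be illustrated with a small promotion step at de-serialization time. This sketch is an assumption-heavy stand-in: the stored form (a version number plus a string map) replaces a real binary format such as protobuf, and the `updatedBy` field is a hypothetical addition made in version 2.

```scala
object SchemaEvolutionExample {
  // Hypothetical stored form: a version tag plus raw field values.
  final case class StoredEvent(version: Int, fields: Map[String, String])

  // Latest in-memory shape; `updatedBy` did not exist in version 1.
  final case class HysteresisUpdated(hysteresis: Float, updatedBy: String)

  val DefaultUpdatedBy = "unknown"

  // Old events are promoted to the latest version while being read.
  def deserialize(stored: StoredEvent): HysteresisUpdated = stored.version match {
    case 1 =>
      HysteresisUpdated(stored.fields("hysteresis").toFloat, DefaultUpdatedBy)
    case 2 =>
      HysteresisUpdated(stored.fields("hysteresis").toFloat, stored.fields("updatedBy"))
    case v =>
      throw new IllegalArgumentException(s"unsupported event version: $v")
  }
}
```

The persistent actor only ever sees the latest event shape, so its recovery logic does not need to know that older versions exist.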
  • 37. Serialization format • By default Akka uses Java serialization (not recommended) • Alternatives • Google Protobuf • Apache Thrift • Apache Avro • Kryo • JSON • Your own (not recommended)
  • 38. Serialization - configuration
      # nested inside the top-level akka { ... } block
      actor {
        # Define your serializer
        serializers {
          dpEvent = "myPackage.DpEventSerializer"
        }
        # Bind it to your events
        serialization-bindings {
          "myPackage.HysteresisUpdatedEvent" = dpEvent
        }
      }
  • 39. Serialization - ScalaPB • Protocol buffer compiler for Scala • Supports proto2 and proto3 • It generates: • Case classes • Parsers • Serializers
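  • 40. Serialization – protobuf
      // Fields without a label require proto3 syntax (declaration not shown on the slide).
      syntax = "proto3";

      import "scalapb/scalapb.proto";
      import "google/protobuf/wrappers.proto";

      package ch.cern.ensdm;

      message HysteresisUpdatedEvent {
        // Make the generated case class extend the domain event trait.
        option (scalapb.message).extends = "myPackage.HysteresisUpdatedEvent";
        float hysteresis = 1;
      }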
  • 42. Actor supervision • Akka uses separate flows to handle normal & recovery code • Normal flow handles normal messages • Recovery flow consists of actors that monitor actors in the normal flow (supervisors) • Akka enforces parental supervision • 3 types of failures: 1. Systematic (e.g. programming error) 2. Transient (e.g. unavailable resources, database etc.) 3. Corrupt internal state • Various recovery strategies
  • 43. Backoff supervisor • Recommended to use a backoff supervisor with persistence • When a higher-level actor restarts, all its children restart as well • Depending on the level, this can mean thousands of actors restarting at the same time • A backoff supervisor varies the delay before each restart
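  • 44. Backoff supervisor
      import akka.pattern.{Backoff, BackoffSupervisor}
      import scala.concurrent.duration._

      // childProps: Props of the persistent actor to supervise (defined elsewhere)
      val supervisor = BackoffSupervisor.props(
        Backoff.onStop(
          childProps,
          childName = "myActor",
          minBackoff = 3.seconds,   // delay before the first restart
          maxBackoff = 30.seconds,  // upper bound for the delay
          randomFactor = 0.2        // up to 20% random jitter
        ))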
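To see what `minBackoff`, `maxBackoff` and `randomFactor` mean in practice, the following sketch computes an exponential backoff delay with jitter. It is an illustration of the general technique, not a copy of Akka's internal code; the `delay` function and its explicit `jitter` argument (a number in [0, 1) that would normally be random) are assumptions made so the result is reproducible.

```scala
import scala.concurrent.duration._

object BackoffExample {
  // Delay before the next restart after `restartCount` previous restarts.
  def delay(restartCount: Int,
            minBackoff: FiniteDuration,
            maxBackoff: FiniteDuration,
            randomFactor: Double,
            jitter: Double): FiniteDuration = {
    // Double the delay on every restart, starting from minBackoff.
    val exponential = minBackoff.toMillis * math.pow(2.0, restartCount.toDouble)
    // Never exceed maxBackoff.
    val capped = math.min(exponential, maxBackoff.toMillis.toDouble)
    // Stretch by up to randomFactor so actors do not all restart together.
    val withJitter = capped * (1.0 + jitter * randomFactor)
    withJitter.toLong.millis
  }
}
```

With the values from the slide and no jitter, the delays grow as 3 s, 6 s, 12 s, 24 s and then stay at the 30 s cap; the random factor spreads thousands of restarting actors over a window instead of a single instant.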
  • 46. Testing Actors • Akka TestKit • Allows asynchronous testing in a controlled environment • akka-persistence-inmemory plugin • Stores journal and snapshot messages in memory • In general, the testing procedure is: 1. Instantiate the actor to test 2. Send commands to the actor 3. Actor persists events in memory 4. Restart the actor 5. Check the actor’s state
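The five-step procedure can be walked through with plain-Scala stand-ins, which keeps the example runnable without Akka on the classpath. `InMemoryJournal` and `Datapoint` below are hypothetical substitutes for the in-memory journal plugin and the persistent actor; a real test would use Akka TestKit and send messages instead of calling methods.

```scala
import scala.collection.mutable.ArrayBuffer

object RestartTestExample {
  final case class HysteresisUpdated(hysteresis: Float)

  // Stand-in for an in-memory journal: it outlives any single "actor" instance.
  final class InMemoryJournal {
    val events: ArrayBuffer[HysteresisUpdated] = ArrayBuffer.empty
  }

  // Stand-in for a persistent actor: it replays the journal when constructed.
  final class Datapoint(journal: InMemoryJournal) {
    private var hysteresis: Float = 0.0f
    journal.events.foreach(e => hysteresis = e.hysteresis) // recovery

    def update(value: Float): Unit = {
      val evt = HysteresisUpdated(value)
      journal.events += evt        // persist the event first
      hysteresis = evt.hysteresis  // then apply it to the state
    }

    def current: Float = hysteresis
  }
}
```

A test then follows the listed steps: create a `Datapoint` on a fresh journal, call `update` a few times, construct a second `Datapoint` on the same journal to simulate the restart, and check that `current` on the new instance matches the last update.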
  • 47. Akka persistence in production
  • 48. Configuration System implementation • Domain driven actor application • Akka persistent actors represent the current state of configuration • Domain driven messages notify actors of configuration changes • Read side updated using persistence query
  • 49. Topology (diagram labels: Configuration Management System, AMS, RDBMS, DC (×5), Other systems, REST API, SVN, SCADA, Cassandra)
  • 50. Technologies • Akka with Cassandra for the Persistence Journal • Play framework • DI using Guice • WS Library • Eureka • Automatic discovery, registration of data concentrators • Serialization • Google protobuf
  • 54. Summary • Asynchronous operations => difficult to test, debug • Careful consideration of actor supervision strategy • Event granularity => varies per use case • Schema evolution => handle at serialization level • Longer startup times => reduce snapshot interval • If state does not fit in memory you need to use sharding • Increased disk and memory usage