## Introducing a reactive Scala-Akka based system in a Java centric company
1. Introducing a reactive Scala-Akka based system in a Java centric company
Basware Belgium NV
Jeroen Verellen (@jeroen_v_)
Milan Aleksić (@milanaleksic)
3. Basware Metrics system and dashboard
A journey through Akka and spray covering:
actor development
testing
spray routing
acceptance testing
build
micro benchmarking
and more...
5. Business case
We want a real-time dashboard that shows the number of documents coming in and going out (per channel).
It should also list the number per document type. All of this should be visible in a per-hour view.
7. Requirements
Basically, a lightweight replacement for a data warehouse / reporting tool
Highly concurrent, non-blocking: no influence on other systems that generate the metrics
The aggregated metrics should be stored so that the system can recover its state after a restart. The store should be simple.
8. Requirements
The system should run on the Java 8 runtime, like all our other (newer) components
Simple API that allows us to show metrics in Dashing, since that is the dashboard technology of choice
12. Pre-Aggregate
Calculate a number of statistics while the metrics are coming in.
The system does not store raw data but calculates, e.g., how many times a service was called in the last hour.
We didn't really use MongoDB in our case; pre-aggregation was just a design pattern we liked.
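A minimal sketch of the pre-aggregation idea, in plain Scala. The names `HourlyCounters`, `HourKey`, `record` and `total` are hypothetical and not taken from the talk; the point is only that each incoming measurement is folded into an hourly total immediately and the raw measurement is then discarded.

```scala
import java.time.LocalDateTime
import java.time.temporal.ChronoUnit
import scala.collection.mutable

// Hypothetical pre-aggregating counter: raw events are never stored,
// only a running total per (metric name, hour).
object PreAggregation {
  final case class HourKey(metric: String, hour: LocalDateTime)

  class HourlyCounters {
    private val totals = mutable.Map.empty[HourKey, Long].withDefaultValue(0L)

    // Fold an incoming measurement into its hourly total right away.
    def record(metric: String, at: LocalDateTime, amount: Long = 1L): Unit = {
      val key = HourKey(metric, at.truncatedTo(ChronoUnit.HOURS))
      totals(key) = totals(key) + amount
    }

    // Read the total for the hour that contains the given timestamp.
    def total(metric: String, hour: LocalDateTime): Long =
      totals(HourKey(metric, hour.truncatedTo(ChronoUnit.HOURS)))
  }
}
```

Answering "how many times in that hour" then becomes a single map lookup instead of a scan over raw data.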
13. Event sourcing
Capture all changes to an application state as a sequence of events
(Fowler).
Instead of storing the latest application state, the system stores the events that change the state. Upon a query for the application state, the state is rebuilt from the events.
Advantages:
temporal query support
event replay in case of bugs
code changes / complete rebuild of state
reverse / undo events
Difficulties:
mind shift
interaction with other non-event sourced applications
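The replay and temporal-query advantages can be sketched in a few lines of plain Scala. This is an illustration under assumptions: the event types `Incremented` and `Reverted` and the functions `replay` and `stateAt` are invented for this example and do not come from the Basware code base.

```scala
import java.time.Instant

object EventSourcingSketch {
  // Hypothetical events; each one records when it happened.
  sealed trait Event { def at: Instant }
  final case class Incremented(metric: String, by: Long, at: Instant) extends Event
  final case class Reverted(metric: String, by: Long, at: Instant) extends Event

  type State = Map[String, Long]

  // Applying one event to the state is a pure function.
  def applyEvent(state: State, event: Event): State = event match {
    case Incremented(m, by, _) => state.updated(m, state.getOrElse(m, 0L) + by)
    case Reverted(m, by, _)    => state.updated(m, state.getOrElse(m, 0L) - by)
  }

  // Current state = fold over the whole event log.
  def replay(events: Seq[Event]): State =
    events.foldLeft(Map.empty[String, Long])(applyEvent)

  // Temporal query: the state as it was at a given instant.
  def stateAt(events: Seq[Event], instant: Instant): State =
    replay(events.filter(e => !e.at.isAfter(instant)))
}
```

Because the state is derived, fixing a bug in `applyEvent` and replaying the log yields a corrected state without touching the stored events.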
14. CQRS
At its heart is the notion that you can use a
different model to update information than the
model you use to read information (Martin Fowler)
Command Query Responsibility Segregation
System interactions arrive in the form of commands
some commands can be rejected (e.g. validation failure)
successful commands are stored in the persistence layer
Another part of the system can receive non-rejectable events
replayed in order on the query object side
to avoid cost of replay of all events, we use snapshots
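The command / event / snapshot split described above could look roughly like this. It is a sketch in plain Scala, not the talk's implementation: `AddMetricCmd`, `MetricAdded`, `Snapshot` and the validation rules are assumptions made for the example.

```scala
object CqrsSketch {
  // Command side: a request that may still be rejected.
  final case class AddMetricCmd(metric: String, amount: Long)
  // Event: a fact that already happened and cannot be rejected.
  final case class MetricAdded(metric: String, amount: Long)

  // Validate a command; only accepted commands produce an event to persist.
  def handle(cmd: AddMetricCmd): Either[String, MetricAdded] =
    if (cmd.metric.trim.isEmpty) Left("metric name must not be empty")
    else if (cmd.amount <= 0) Left("amount must be positive")
    else Right(MetricAdded(cmd.metric, cmd.amount))

  // Query side: a read model built only from events, applied in order.
  final case class Snapshot(upToEvent: Int, totals: Map[String, Long])

  def project(totals: Map[String, Long], e: MetricAdded): Map[String, Long] =
    totals.updated(e.metric, totals.getOrElse(e.metric, 0L) + e.amount)

  def snapshot(journal: Vector[MetricAdded]): Snapshot =
    Snapshot(journal.size, journal.foldLeft(Map.empty[String, Long])(project))

  // Rebuild the read model from a snapshot plus only the events after it.
  def restore(snap: Snapshot, journal: Vector[MetricAdded]): Map[String, Long] =
    journal.drop(snap.upToEvent).foldLeft(snap.totals)(project)
}
```

`restore` replays only the tail of the journal, which is the cost saving that snapshots buy.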
15. CQRS
It composes well with Event Sourcing and the Actor Model
It is not an architecture, it's a pattern!
17. Many reasons why we like Scala, just to list some:
modern functional / OOP mix
stable, but moving faster than Java
traits > interfaces
chaining and composing Futures
case classes
pattern matching
All in all, more expressive in less code
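A short, self-contained illustration of several of the listed features. The types `Described`, `Counter` and `Gauge` are made up for this example; only standard-library Scala is used.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object ScalaFeaturesSketch {
  // A trait can carry a concrete method next to an abstract one.
  trait Described {
    def name: String
    def describe: String = s"metric '$name'"
  }

  // Case classes: immutable data with equals, hashCode and copy for free.
  sealed trait Metric extends Described
  final case class Counter(name: String, count: Long) extends Metric
  final case class Gauge(name: String, value: Double) extends Metric

  // Pattern matching over the sealed hierarchy, with a guard.
  def render(m: Metric): String = m match {
    case Counter(n, c) if c == 0 => s"$n: idle"
    case Counter(n, c)           => s"$n: $c documents"
    case Gauge(n, v)             => s"$n: $v"
  }

  // Chaining and composing Futures with a for-comprehension.
  def totalOf(a: Future[Long], b: Future[Long]): Future[Long] =
    for { x <- a; y <- b } yield x + y
}
```

The equivalent Java 8 code would need hand-written `equals`/`hashCode`, a visitor or `instanceof` chain, and explicit `thenCombine` calls.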
18. Akka is a toolkit that promises scalability via Actors
each actor can have internal state
which can be changed only via messages
which are executed in order
supervision strategies, event buses, remote actors, cluster...
Vertical and horizontal scalability using single paradigm
Cons:
paradigm switch, takes time to learn
a general concurrency pattern, but doesn't fit every use case
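The first three points (private state, changed only via messages, processed in order) can be shown without Akka itself. The toy below is explicitly not Akka's API: `CounterCell`, `tell` and `ask` are hypothetical names, and the "mailbox" is just a single-threaded executor from the JDK. It has none of Akka's supervision, remoting or clustering.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import scala.concurrent.{Future, Promise}

object MailboxSketch {
  sealed trait Msg
  final case class Increment(by: Long) extends Msg
  final case class GetTotal(reply: Promise[Long]) extends Msg

  // NOT Akka: a toy "actor" whose state is touched only by one mailbox thread.
  class CounterCell {
    private val mailbox = Executors.newSingleThreadExecutor()
    private var total = 0L // only read or written on the mailbox thread

    private def receive(msg: Msg): Unit = msg match {
      case Increment(by)   => total += by
      case GetTotal(reply) => reply.success(total)
    }

    // Fire-and-forget send; messages are processed one at a time, in order.
    def tell(msg: Msg): Unit =
      mailbox.execute(new Runnable { def run(): Unit = receive(msg) })

    // Request-reply built on top of tell.
    def ask(): Future[Long] = {
      val p = Promise[Long]()
      tell(GetTotal(p))
      p.future
    }

    // Process what is already queued, then stop the mailbox thread.
    def stop(): Unit = {
      mailbox.shutdown()
      mailbox.awaitTermination(5, TimeUnit.SECONDS)
      ()
    }
  }
}
```

Many threads can call `tell` concurrently without locks in user code, because only the mailbox thread ever reads or writes `total`.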
19. Spray: some of the good sides:
builds on Akka, Akka IO
both client-side and server-side APIs
case classes for requests, status codes
declarative routing DSL
The Spray library is going to be republished as akka-http
Most of the things we have done could (and should) be migrated to akka-http as this new library becomes GA
Important thing it currently can't do: WebSockets
DSL can cause compile issues in IDEA
JSON support relies heavily on implicits, making it hard to debug
20. Pure Java client
We (obviously) needed a way to push ("report")
the metrics data to the server
What we came up with is simple:
in-memory bounded queue of sending tasks
The client side needs to be lightweight and non-intrusive
we would also batch increments before sending them
we delegated implementation of this to Joeri, our colleague
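The queue-plus-batching idea can be sketched as follows. The real client was written in pure Java; this sketch uses Scala only to stay consistent with the other examples, and relies solely on `java.util.concurrent`. The names `ReportingQueue`, `report` and `nextBatch` are hypothetical, and dropping increments when the queue is full is an assumption about what "non-intrusive" means here.

```scala
import java.util.concurrent.ArrayBlockingQueue
import scala.collection.mutable

object MetricsClientSketch {
  final case class Increment(metric: String, amount: Long)

  class ReportingQueue(capacity: Int) {
    private val queue = new ArrayBlockingQueue[Increment](capacity)

    // Never blocks the caller: when the queue is full the increment is dropped
    // and false is returned (assumption: losing a metric beats slowing the host).
    def report(metric: String, amount: Long = 1L): Boolean =
      queue.offer(Increment(metric, amount))

    // Drain what is queued and merge increments per metric into one batch,
    // so that a single request can carry many reported increments.
    def nextBatch(): Map[String, Long] = {
      val drained = new java.util.ArrayList[Increment]()
      queue.drainTo(drained)
      val merged = mutable.Map.empty[String, Long]
      val it = drained.iterator()
      while (it.hasNext) {
        val inc = it.next()
        merged(inc.metric) = merged.getOrElse(inc.metric, 0L) + inc.amount
      }
      merged.toMap
    }
  }
}
```

A background thread would call `nextBatch()` periodically and post the result to the metrics server; that part is omitted here.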
22. Typesafe Config
Configuration library for JVM languages.
No dependencies
Java properties, JSON, and a human-friendly JSON superset
Support for nesting
24. Contract first: define JSON in/out of the API
Use simple .sbt file
commit dc597f5a959d844c1c98b459ce5db5192b3dfc9a
Date: Mon Oct 20 16:47:04 2014 +0200
project structure + schemas
commit 81956d778e1bafc0e0decd2a74dd7e8d0e9ce5cf
Date: Mon Oct 20 17:15:36 2014 +0200
allow to post multiple metrics
25. New libraries
%% notation of dependencies
Typesafe config
Spray
spray-can
spray-routing
spray-json
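For readers new to sbt, the `%%` notation could appear in a build definition like the fragment below. The Scala version and all library version numbers are illustrative assumptions, not values taken from the project.

```scala
// build.sbt fragment; all versions are illustrative assumptions
scalaVersion := "2.11.4"

libraryDependencies ++= Seq(
  // %  : plain artifact name, used for a Java library without a Scala suffix
  "com.typesafe" % "config" % "1.2.1",
  // %% : sbt appends the Scala binary version, e.g. spray-can_2.11
  "io.spray" %% "spray-can" % "1.3.2",
  "io.spray" %% "spray-routing" % "1.3.2",
  "io.spray" %% "spray-json" % "1.3.1"
)
```

With `%%`, switching `scalaVersion` makes sbt resolve the artifacts built for the new Scala binary version.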
commit 87d3f08716a29922a26df82ec1f23e552bd18ed1
Date: Wed Oct 22 11:19:32 2014 +0200
use base unit test class DRY
commit 1896d9d7229bbe5fd7a038c1c6661fe3c5ae8740
Date: Wed Oct 22 11:04:39 2014 +0200
first implementation of a route for handling metrics posts
26. Unit and acceptance testing
TDD from here on
Use API example in test (test the documentation)
ScalaTest, spray-client, Akka TestKit
Re-use spray-json on client side
Server and client run in same Actor System
commit ddb2d5a8fd47b9042d906ca75c26eab9f0483d88
Date: Wed Oct 22 17:51:32 2014 +0200
add first acceptance test
commit 0ada7ce978f28ecd58d077e3f10785e0ad8e559c
Date: Thu Oct 23 13:29:20 2014 +0200
add test for invalid metrics
commit 657264839cdb948df508b0fefb955ef2a89422a2
Date: Thu Oct 23 12:33:21 2014 +0200
split of the AT tests, using example from API def in test
27. SBT setup remodeling
Script -> Scala
Stolen from Spray
clearer separation of modules, dependencies, build settings
commit b204e81c6c42837792e6292f4d6a494a5efa1cfa
Date: Thu Oct 23 11:28:33 2014 +0200
improve sbt setup, stole from spray setup
(Progress diagram: JSON, Add Metric, Spray REST route, JSON support, unit and acceptance testing)
28. Dashing
Re-use company standard for dashboards
Easier adoption in the company
Ruby gem (Sinatra, Batman.js, CoffeeScript, SCSS... full hipster)
Currently running on our Dashing server
It's Ruby but simple enough
29. Actor tree overview
(Diagram, levels from root to leaf; values are the examples shown on the slide)
CounterMetricsActor (root)
    CounterMetricsActorName, one per metric: "outgoingAS2", "incomingHTTP", "messagetypeinvoice", ...
        CounterMetricsIntervalActor YEAR (2014, 2015, ...)
            CounterMetricsIntervalActor MONTH (10, 11, 12, ...)
                CounterMetricsIntervalActor DAY (10, 11, ...)
                    CounterMetricsIntervalActor HOUR (01, 02, 03, ...), each with BUCKETS=[00..59]
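To make the tree concrete, the sketch below maps one timestamp onto the levels of such a tree. `TreePath`, `pathFor` and the `year-2015` style labels are hypothetical; the talk does not show how actors are actually named.

```scala
import java.time.LocalDateTime

object ActorTreePathSketch {
  // One step down the tree per time unit; the minute is a bucket index
  // inside the hour level, not a further actor level.
  final case class TreePath(metric: String, year: Int, month: Int, day: Int, hour: Int, bucket: Int) {
    def actorLevels: List[String] =
      List(metric, s"year-$year", s"month-$month", s"day-$day", s"hour-$hour")
  }

  def pathFor(metric: String, at: LocalDateTime): TreePath =
    TreePath(metric, at.getYear, at.getMonthValue, at.getDayOfMonth, at.getHour, at.getMinute)
}
```

A metric reported at 2015-11-11 01:42 therefore travels down five levels and lands in bucket 42 of the hour-level actor.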
30. Shaping the Actor System
Documentation, testing
"Add metrics" / "Report metrics" calls introduced
commit 182b3394db9c628a2ab26a62c6eeb351a360657c
Date: Thu Oct 23 20:39:35 2014 +0200
add post method for getting metric reports
31. Shaping the Actor System
We put a thin facade between the API and the actors
Case classes / model on InternalAPI
DTO objects on ExternalAPI
we used Cmd and Query as suffixes for commands and queries
From this commit on, Milan gets more involved with the server-side Scala code
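A rough sketch of such a facade. The names `AddMetricsCmd`, `ReportMetricsQuery`, `ReportMetricsQueryResult` and `InternalApi` appear in the talk; their fields, the `MetricDto` class, `toCommand` and the `InMemoryApi` stand-in are assumptions made for illustration.

```scala
object ApiFacadeSketch {
  // External API: a DTO shaped like the JSON contract (fields are assumptions).
  final case class MetricDto(name: String, count: Long)

  // Internal API: commands and queries, suffixed Cmd / Query.
  final case class AddMetricsCmd(metrics: List[(String, Long)])
  final case class ReportMetricsQuery(names: List[String])
  final case class ReportMetricsQueryResult(totals: Map[String, Long])

  trait InternalApi {
    def addMetrics(cmd: AddMetricsCmd): Unit
    def report(query: ReportMetricsQuery): ReportMetricsQueryResult
  }

  // The thin facade: translate DTOs at the edge, keep the core free of them.
  def toCommand(dtos: List[MetricDto]): AddMetricsCmd =
    AddMetricsCmd(dtos.map(d => d.name -> d.count))

  // In-memory stand-in for the actor-backed implementation, handy in tests.
  class InMemoryApi extends InternalApi {
    private var totals = Map.empty[String, Long]

    def addMetrics(cmd: AddMetricsCmd): Unit =
      cmd.metrics.foreach { case (n, c) =>
        totals = totals.updated(n, totals.getOrElse(n, 0L) + c)
      }

    def report(query: ReportMetricsQuery): ReportMetricsQueryResult =
      ReportMetricsQueryResult(query.names.map(n => n -> totals.getOrElse(n, 0L)).toMap)
  }
}
```

Because the routes depend only on `InternalApi`, the entry point can be tested with a mock or a stand-in like `InMemoryApi`, with no actor system running.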
commit 6326978260986f90dd8cb0796ef105cd945aded2
Date: Mon Oct 27 12:43:32 2014 +0100
Introducing internal API class. Replacing Request/Response case classes
with command/view case classes. Using ScalaMock to test the api entry point
commit 1601c2de3a0d8b69fc683930d7f1e3235ba09604
Date: Mon Oct 27 14:03:30 2014 +0100
CRbased improvements
commit 3fc2363817c6b013a26feaef783835fc89dd48ba
Date: Mon Oct 27 15:40:48 2014 +0100
making first Command & Query case classes in the place of incoming DTO objects
CR-1615 CR-1618
32. Shaping the Actor System, child instantiation & caching
First actor tree
Keep your own reference cache
Create children
Watch children
Act on "Terminated"
commit a82584263da9d0dae8a54bc1dd510a95bbc45790
Date: Mon Oct 27 17:36:39 2014 +0100
start with actor tree
commit ac3167bf9b9d491235e9a8c97ee89b24e20d92f3
Date: Tue Oct 28 11:18:22 2014 +0100
handle unknown messages, check cache is empty
commit c776d5fcf814372350f3d9d9121e2858342efe58
Date: Tue Oct 28 13:11:01 2014 +0100
add child factory with default impl
commit 43cbe21a418cda4d01d767cd2aeccbfc4b263107
Date: Tue Oct 28 17:07:48 2014 +0100
add support for query reports
33. Akka Persistence Design Overview
Components, from the outside in:
MetricsExternalApi (addMetrics, getReports)
Exposes a REST/JSON interface towards metrics clients.
Each HTTP request can contain multiple metric queries. Each metric query makes a single "TimeScopeQuery" tree structure that gets partially processed by adequate children actors through the GetCounterMetricReport message. The results of all queries are processed asynchronously and gathered into a single HTTP response.
AkkaInteralApi implements InternalApi
CounterMetricsActor
There is only one. It caches actor children per metric name.
This actor sends a scheduled message to its children to SNAPSHOT their current state (an optimization in CQRS systems to make "reviving" faster).
CounterMetricsActorName (metric "bwincomingONP")
CounterMetricsIntervalActor YEAR = 2014 -> MONTH = 11 -> DAY = 11 -> HOUR = 01
Persistent actors, keep state between restarts: store events, delete old events; create snapshot, delete old snapshots.
Both Actor and Interval actors have Interval actor children (cached by their value, e.g. hour or minute number). These children actors can die if they are not queried for a long enough period. A parent actor can die only if it has no more children. Actors can be revived to their previous state via akka-persistence.
CounterMetricsIntervalActor HOUR = 01, BUCKETS=[00..59]
This actor is special because it keeps minutes' information inside "buckets" (a hashmap), making the minute our minimal possible precision.
Message flow (numbers as in the diagram):
Write: (1) AddMetricsCmd / <no reply> -> (2) RecordCounterMetric -> (3) to (6) RecordCounterMetric, one level deeper each
Read: (1) ReportMetricsQuery -> (2) GetCounterMetricReport -> (3) to (6) GetCounterMetricReport, one level deeper each -> (7) to (10) CounterMetricReport, back up one level each -> (11) CounterMetricReport -> (12) ReportMetricsQueryResult
34. Journal Plugin (in memory)
Keeps track of the last few events; the rest is thrown away
Snapshot Plugin (file system)
Keeps the last snapshot; older snapshots are thrown away
Persistence framework.
35. Introducing Akka Persistence
Choosing plugins
Journal in memory, Snapshots on disk
The actor becomes a PersistentActor:
receiveRecover: handles SnapshotOffer and/or journal entries
receiveCommand: does the work and calls persist
commit c67826ade4b8b32dd7b67cd5c05e06529b0cfedb
Date: Tue Oct 28 10:40:05 2014 +0100
adding akka persistence dependency
36. Introducing Akka Persistence
Initial commit where we tried to split actors into name-based and time-based ones:
the root actor in the tree decides how to delegate based on the name (of the metric)
the second level (and deeper) decide based on time
There was still a lot of work to be done
commit 5dbff87119593cba7db6eef32b595135317bc17f
Date: Tue Oct 28 16:00:56 2014 +0100
time scope actors introduced
37. Run localhost
Main class
Easier startup from IDE
Local testing front end
Client simulator
Easier local testing
Used for load testing later on
commit c907474d7846865ddc277550b3a6cdaaf36c4ee8
Date: Wed Oct 29 09:02:52 2014 +0100
have our own main: easier for standalone or IDE usage
commit 2954db34ae44b52d9b4087e7ba74118871ea1f70
Date: Wed Oct 29 10:03:10 2014 +0100
utilizing localhost from a running project, correcting URLs
commit 463499a493f575b3817313640014a1c45431da09
Date: Wed Oct 29 10:23:36 2014 +0100
add client simulator
38. Extend report/query functionality
Still waging war with the journal - too much logging
Stabilizing the number of "with"s we are using with the common trait BaseMetricsActor
Pre-calculation of the "query tree" when query comes in
Delegation to children only when needed
BucketScope vs RangedScope vs FullScope
Still need to find a more Scala-idiomatic way to do it
commit 3d92414b3c972b4b91c608287929e76a63e84a38
Date: Fri Oct 31 11:52:04 2014 +0100
query recording runs across year/month/date actors
... and many others
39. Packaging
First idea: Akka micro kernel
Settled for: uber-jar
commit 1c0c8cebd6cf00cd6c748bd9d1c5fddb4a0f7ab3
Date: Mon Nov 3 13:21:24 2014 +0100
allow assembly and dependencytree plugins
40. Akka Persistence Part II
System scheduler: trigger snapshot creation
"Make snapshot" message sent from root actor
Each actor in tree is able to fanout to its children
Each actor sends a PoisonPill to itself after making a snapshot
this decision will have performance consequences
Keep your own children's references
interesting bugfix by Jeroen
commit 5ad1d4e36fcc46919c125950ea7399256635e989
Date: Mon Nov 3 17:00:04 2014 +0000
snapshots should work from this point on
commit 7d588f29fa6f222bd1fa54a82fde3e932106c22f
Date: Tue Nov 4 12:52:08 2014 +0100
Fix fanout of MakeSnapshot
use the cache to get the children since in testing the children are not registered
41. Akka Persistence and Acceptance testing
Random snapshot directory for Acceptance testing
Remove snapshot directory after test
Execute cleanup from SBT
commit 8b577a80dbb4c0a23281bde25673d1927a563365
Date: Tue Nov 4 11:55:26 2014 +0000
some randomization introduced into the system and snapshot directory
removal moved to a SBT task
42. Fixing memory issues
Related to performance issue
actor got killed on every snapshot
revival of actor is IO intensive
once a minute peaks in VisualVM
Make sure actors die after 30 minutes
Revive and restore state when needed
commit a8750fa7a3ac3557a4028f580b216a9383d32610
Date: Wed Nov 5 12:00:26 2014 +0100
Make sure actors die if not used for 30 minutes: constrain memory usage
43. Fixing storage issues
Snapshot was made even for actors with empty state
Check if state is dirty before saving a snapshot
On SaveSnapshotSuccess
clean old snapshots based on metadata
delete journal entries
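The dirty check and the cleanup step can be modelled in plain Scala as below. This is a stand-in, not the Akka Persistence API: `Store`, `HourCounter`, `makeSnapshot` and `onSaveSuccess` are hypothetical names, and the store is two in-memory buffers.

```scala
import scala.collection.mutable

object SnapshotCleanupSketch {
  final case class Snapshot(sequenceNr: Long, buckets: Map[Int, Int])

  // Stand-in for a snapshot store plus journal; NOT the Akka Persistence API.
  class Store {
    val snapshots = mutable.ArrayBuffer.empty[Snapshot]
    val journal = mutable.ArrayBuffer.empty[(Long, Int)] // (sequenceNr, minute)
  }

  class HourCounter(store: Store) {
    private var buckets = Map.empty[Int, Int]
    private var seqNr = 0L
    private var dirty = false

    // Journal the event, update the state and mark it as changed.
    def record(minute: Int): Unit = {
      seqNr += 1
      store.journal += ((seqNr, minute))
      buckets = buckets.updated(minute, buckets.getOrElse(minute, 0) + 1)
      dirty = true
    }

    // Returns true only when a snapshot was actually written.
    def makeSnapshot(): Boolean =
      if (!dirty) false
      else {
        val snap = Snapshot(seqNr, buckets)
        store.snapshots += snap
        onSaveSuccess(snap)
        dirty = false
        true
      }

    // After a successful save, everything the snapshot covers is redundant.
    private def onSaveSuccess(saved: Snapshot): Unit = {
      val older = store.snapshots.filter(_.sequenceNr < saved.sequenceNr)
      store.snapshots --= older
      val covered = store.journal.filter(_._1 <= saved.sequenceNr)
      store.journal --= covered
    }
  }
}
```

An actor that received nothing since its last snapshot writes nothing, and at most one snapshot per actor remains on disk.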
commit 2016179d8b43698444511ebdcbf3e81fadf8f140
Date: Wed Nov 5 18:00:53 2014 +0100
first go at cleanup of snapshots and journal entries
commit 2b8f101c04d0cb87b585c63e94cdb779d15f15ce
Date: Thu Nov 6 08:57:25 2014 +0100
improved snapshot cleanup
commit 7fdae48fbc5ca7f2be5a4ad349521b065b0fca81
Date: Thu Nov 6 10:34:21 2014 +0100
added test for snapshot cleanup
44. Actor supervision
Poison pill / suicide turned into massacre ;-)
Stop exceptions from bubbling up
Stop actor in case of failure
Revive and restore state in case needed
commit dea72b1fa0db5e05220496224c5e3c62db477c3c
Date: Fri Nov 7 09:36:46 2014 +0100
added supervisor strategy in actors, changed inmemory journal plugin,
tweaking of journal message deletion
Need to add tests for all of this!!
commit a12aea367e847825aba64603ae1e872a7183aa65
Date: Wed Nov 12 12:33:45 2014 +0100
added test to make sure snapshotting only happens when state is dirty
commit 7c3175ece5be1560e3774d55ec478818b77ab3ff
Date: Wed Nov 12 15:01:26 2014 +0100
added test: try to simulate error just after the hour
45. JVM tuning
Use simple client simulator: 100 metrics / second
Monitor metrics server
Run over lunch break / over night
Large young generation for Scala
GC options for shorter pauses and better response times
production is on Solaris
-Xms128M
-Xmx512M
-XX:MaxMetaspaceSize=128m
-XX:NewSize=450M
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-verbose:gc
-server
47. Possible improvements on front-end
Replace the Dashing.io framework with D3 + Spray.io
the former is a powerful visualization library
the latter we already have for serving API
the idea: serve the static JS files which would call REST API
We didn't really think about MVVC framework,
why not pure WebComponents + ES6?
48. Possible improvements on front-end
Use WebSockets for real time information
Dashing.io already uses SSE, but the server side has occasional hiccups
49. Possible improvements on back-end
Introduce a bit more serious snapshot plugin (see the Snapshot plugins list at http://akka.io/community/)
MongoDB, PostgreSQL...
This was hurting our eyes from the start, but the funny thing is... it works even as is
50. Possible improvements on back-end
Unify actor implementation
we should be using buckets everywhere
Current state
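import akka.actor.ActorRef
import scala.collection.mutable

// Common shape of every cache: a mutable map from key to child ActorRef
sealed trait ActorCache[T] {
  val nameRefMap: mutable.Map[T, ActorRef]
}
trait MetricActorCache extends ActorCache[Metric] {...} // root one
trait TimeScopeActorCache extends ActorCache[TimeScope] {...} // keeping per hour / minute
class CounterMetricsIntervalActor(val metric: Metric, val scope: TimeScope) { // lowest level
  // minute -> count; needs an initial value, a bare declaration does not compile in a class
  private var buckets: mutable.Map[Int, Int] = mutable.Map.empty
}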
51. Possible improvements on back-end
Randomization of children sleeping pill
at this time our GC spikes on round hours
this would allow the pressure on the snapshot store to breathe a bit
it would also give us extra stability since sometimes incoming messages get lost
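One possible way to randomize the timeout is to add uniform jitter around a base value. The function `jittered` and its parameters are hypothetical; the 30-minute base and 5-minute spread in the usage note are only example values.

```scala
import scala.concurrent.duration._
import scala.util.Random

object IdleTimeoutJitterSketch {
  // Spread timeouts uniformly over [base - spread, base + spread], so that
  // children created together do not all expire in the same instant.
  def jittered(base: FiniteDuration, spread: FiniteDuration, rnd: Random): FiniteDuration = {
    val offsetMs = (rnd.nextDouble() * 2 - 1) * spread.toMillis
    (base.toMillis + offsetMs.round).millis
  }
}
```

Calling `jittered(30.minutes, 5.minutes, new Random())` once per child would give each child its own timeout between 25 and 35 minutes, which spreads snapshot and stop activity away from the round hour.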
52. If we had even more time...
Clustering
how scalable would it be?
what would be the throughput?
how would we handle node failure?
Functional improvements
cohort analysis
dynamic time selection
Akka Persistence
use persistent view instead of persistent actor
we only care about the query
original commands are thrown away