Johan Edstrom
SOA Executive and Apache developer.
Apache Member
Apache Camel PMC
Apache ServiceMix committer
Original CXF Blueprint Author
Cassandra client library developer
Author
jedstrom@savoirtech.com
joed@apache.org
Building Asynchronous Applications
Using common frameworks
Savoir Technologies - This is where we started
Where are we now?
lWe have worked heavily with
• Governments
• Insurance
• Utilities
• Network companies
• Education companies
• Medical processing companies
What is this all about?
lScaling
• It is hard
• How it is done - depends
• Is this a blueprint?
lTips n’ Tricks
• Things we’ve learned over time
lExperience
• Do’s and Don't
Before we start
lWhat are we looking to change
JVM
DB
Actor
Business Logic
We will look a little at these tools and libraries
lApache Camel
lApache Karaf, Savoirtech Aetos, ServiceMix
lApache ActiveMQ
lApache Cassandra / Savoirtech Hecate
lApache CXF
lAnd somewhat on AKKA
• We really are just peeking at AKKA to validate some ideas
Apache Camel
lApache Camel is an open source Java framework with
• Concrete implementations of all the widely used EIP patterns
• Connectivity to a great variety of transports and API
• Easy to use Domain Specific Languages (DSL)
Apache Camel
lCamel is a Domain Specific Language (DSL) focused on
implementing Enterprise Integration Patterns (EIPs)
• Examples: multicast, splitter, content-based router, routing slip, “dynamic
routers”, aggregator
lEIPs are implemented by Camel “Processors”
• Users can develop their own processors
• A processor is a fundamental building block
• Bean language and bindings exists so that not a single piece of Apache
Camel Imports will be necessary when integrating your existing code
lCamel can connect to a wide variety of integration
technologies
• Examples: JMS, HTTP, FTP, SOAP, File - There are ~ 180 components
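To make one of the patterns named above concrete, here is a minimal sketch of a content-based router in plain Java. It deliberately does not use the Camel API; the class name, the `when`/`route` methods and the destination names are all hypothetical and exist only to show the idea of choosing a destination from the message content.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Plain-Java illustration of the content-based router pattern (not Camel code).
class ContentBasedRouterSketch {
    // One routing rule: a condition on the message body and the step that handles it.
    private static final class Route {
        final Predicate<String> condition;
        final Function<String, String> target;

        Route(Predicate<String> condition, Function<String, String> target) {
            this.condition = condition;
            this.target = target;
        }
    }

    private final List<Route> routes = new ArrayList<>();
    private final Function<String, String> otherwise;

    ContentBasedRouterSketch(Function<String, String> otherwise) {
        this.otherwise = otherwise;
    }

    // Registers a rule; rules are evaluated in the order they were added.
    ContentBasedRouterSketch when(Predicate<String> condition, Function<String, String> target) {
        routes.add(new Route(condition, target));
        return this;
    }

    // Sends the body to the first matching target, or to the fallback.
    String route(String body) {
        for (Route r : routes) {
            if (r.condition.test(body)) {
                return r.target.apply(body);
            }
        }
        return otherwise.apply(body);
    }

    // Hypothetical destinations, used only for the demonstration.
    static String demo(String body) {
        return new ContentBasedRouterSketch(b -> "dead-letter:" + b)
                .when(b -> b.startsWith("<order"), b -> "orders:" + b)
                .when(b -> b.startsWith("<invoice"), b -> "billing:" + b)
                .route(body);
    }

    public static void main(String[] args) {
        System.out.println(demo("<order id='1'/>"));
        System.out.println(demo("hello"));
    }
}
```

In Camel itself the same decision is expressed declaratively in the route DSL, which is the point of the slide: the pattern comes ready-made rather than hand-rolled like this.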
Apache Camel
Do I need all of that?
lNope, many solutions will need just a few things
• jaxb, camel, jms, soap, rest and perhaps jdbc
• Cut the container down to fit your needs
• We don’t need to load all of the 100+ Apache Camel components
• Pick and choose!
lShould I run that messaging solution inside the “ESB”
• Entirely up to you, let us look a little deeper at that in a sec.
lCan I test these solutions or am I stuck with
System.out.println and a remote debugger?
Apache Karaf
lMini OSGi Container
• Foundation of Apache ServiceMix, Aetos, Talend ESB, Cisco Prime, ODL
platforms and quite a few other offerings
• For scaling you certainly don’t need Karaf - all of the concepts are
theoretically possible in pretty much any language and platform if you do it
correctly.
§ That said, Karaf enforces modular code (OSGi), controlled deployment
and offers up many nice things like a remote console for “free”.
JMS JAX-WS JAX-RS Camel Spring Aries
OSGi
Console Logging Provision Admin Spring-DM Aries
Decoupling with OSGi
lEnforces knowledge of imports and exports
lAllows us to version
lProgramming model with micro-services allows 

for re-use on an Api level vs code level
lPromotes contracts
Apache ActiveMQ
lFast, powerful and flexible messaging system
lEasily embeddable
lCan create complex topologies
lTons of connection possibilities
lCan be scaled up / down / right / left
• Note - Currently merging with HornetMQ
Apache Cassandra
lEntered Apache as Incubator Project in 2009
• Went through incubation to build community of committers
lBecame Top Level Apache Project in February 2010
lHas proven to be a very flexible and widely used distributed
big data solution.
lCQL3 changes data-modeling “slightly”
lCassandra was named after the Greek goddess
• Cassandra could accurately predict things that would come
• She spurred the Oracle of Delphi (thus a possible pun)
What are some language tools we can use?
lFutures, Promises, Continuations (Async JaxRs, RIFE),
JMS, Executors, Actors, Consumers, Producers, Runnables

• Let’s not go too deep here; frameworks in Java land

are there to help you not have to write super duper

low level code, like Mina / Netty for networking.

Guava for concurrency and collection handling. 

CXF for JaxWs / JaxRs abstraction to just name a

few.
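As a small illustration of the first items in that list, the sketch below combines an executor with a future using only the JDK. The `slowLookup` method is a hypothetical stand-in for a database or remote call, and the pool size of 4 is an arbitrary assumption.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class FutureSketch {
    // Hypothetical stand-in for a slow lookup (a database or remote call).
    static int slowLookup(int id) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return id * 2;
    }

    // Runs the lookup on the pool and transforms the result without blocking the caller.
    static CompletableFuture<String> lookupAsync(int id, ExecutorService pool) {
        return CompletableFuture
                .supplyAsync(() -> slowLookup(id), pool)
                .thenApply(value -> "value=" + value);
    }

    // Convenience wrapper for the demo: starts the work, waits for it, shuts the pool down.
    static String run(int id) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            return lookupAsync(id, pool).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(21)); // prints "value=42"
    }
}
```

The caller of `lookupAsync` gets a future back immediately; only the demo wrapper `run` blocks, and it does so just to print the result.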

Traditional “full stack” application
lSynchronous in design
• Browser -> Servlet container is response time sensitive and expensive
• Servlet code -> Service code can be time consuming
• Service code -> Connection factories can easily block 

• Probably uses a Java / JSP framework, some JS
§ Developers tend to be forced to know Java, JS, a bit of RDBMS, fiddle with
servlet containers, rely quite a bit on QA for testing and shows weird and
spurious errors during load testing

JVM
DB
Actor
Business Logic
Persistence - Are you a NoSQL candidate?
• Do you need to do complex queries on your data?
  • Many JOINs and foreign keys with many relationships
  • Can be done in Cassandra with Hadoop or other MapReduce (Spark)
  • Need to weigh the development effort vs. just writing a SQL query
• Do you need very strict ACID transactionality?
  • Banking/Financial transactions could be difficult
  • ACID/Transactions can be built in Cassandra with complex application code using tools such as ZooKeeper (Distributed locks exist)
  • Need to weigh the development effort vs. using an RDBMS which supports transactionality out-of-the-box
• Do you have very complex indexing requirements?
  • Are you indexing multiple fields and having to create many Column Families to access your data many different ways?
Let us say we are.
lThe work here would be in data modeling
• The benefits we’d reap are eventual/controllable consistency
• Extremely fast and Asynchronous writes
• Automatic Data distribution and partitioning
• No single point of failure
• You have to unlearn some things
*Images courtesy of DataStax
What does that look like in Java?
lTo create our keyspace.
lTo use it with CQL3 - DataStax driver

lUsing it with Hecate - Hecate maps POJO’s to prepared and
async statements.
Compared to a traditional RDBMS?
lIn one project we mapped in 17 registry services
• ~20 million users in the system more than 40 mil transactions / day
§ Handled all load-test scenarios on a 5 node Cassandra ring.
• We are talking about average response times < 100ms

end to end.
• We also don’t need to worry much about
§ Second and 1st level caches
§ Cache synchronization or distribution
§ Locking
§ Select for update
§ Autoscaling
§ Building out
§ Building out geographically
JVM
Actor
Business Logic
To sum up Cassandra
lIf you are looking at doing this
lselect a,b,c from table_X where y > 15 allow filtering;
• You need to rethink your data model, you are looking at something that with
sufficient amounts of data can take down pretty much any cluster if y is not
part of a partition key, index (Indexes are bad for other reasons as well).
• Solve these types of problems by writing your data the way you want to
retrieve it.
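The idea of writing data the way you want to retrieve it can be sketched with an in-memory model. This is not Cassandra or driver code: a map of sorted maps stands in for a table with a partition key and a clustering column, and the names (`QueryTableSketch`, `sensor-1`, the row values) are hypothetical. The point is that a range condition on y becomes a cheap slice inside one partition instead of a scan over everything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

// In-memory stand-in for a query table: partition key -> rows sorted by y.
class QueryTableSketch {
    private final Map<String, NavigableMap<Integer, String>> byPartition = new ConcurrentHashMap<>();

    // Every write lands in the partition it will later be read from, sorted by y.
    void write(String partition, int y, String row) {
        byPartition.computeIfAbsent(partition, k -> new ConcurrentSkipListMap<>()).put(y, row);
    }

    // Similar in spirit to: where partition = ? and y > ?
    // Only one partition is touched and the sorted order makes the slice cheap.
    List<String> readGreaterThan(String partition, int y) {
        NavigableMap<Integer, String> rows = byPartition.get(partition);
        if (rows == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(rows.tailMap(y, false).values());
    }

    static List<String> demo() {
        QueryTableSketch table = new QueryTableSketch();
        table.write("sensor-1", 10, "a");
        table.write("sensor-1", 20, "b");
        table.write("sensor-1", 30, "c");
        table.write("sensor-2", 99, "z");
        return table.readGreaterThan("sensor-1", 15);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[b, c]"
    }
}
```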
Front facing stuff
[Diagram: JVM, Actor, Business Logic]
Apache CXF
lLibrary that is passing the TCK for the Web Profile
lBuilds on the base JaxWS / JaxRs in the JDK
lAlso does esoteric stuff like Corba / RMI
lCan be used to do JMS
lYou can build pub sub systems (WSN Notification)
Now let’s look at the browser-facing part
• JAX-RS 2.0 provides for async
[Figures: Sync, Async]
And the execution of the response
lUsing an executor service, with timeouts
lSame thing but without lambdas
Let’s put some load on this
• We use Gatling (you could use JMeter or any other tool)
• 500 user load
• 500,000 requests
• We measure the total time, time / request and the mean.
With a delay of 200-500ms - Async
[Chart: Async response times]
With a delay of 10-100ms - Sync
[Chart: Sync response times]
Observations
lWith a fast response that is linear
• Async creates overhead, response times are worse
• With just a minimal blocking introduced, sync starts choking pretty fast
• It is hard to simulate on one machine
lWhere this these techniques are utilized
§ A Large EC2 system is currently hitting around 163 r/s going against
Cassandra with prepared (sync) statements and utilizing Async JaxRs
§ It is slated to be a replacement for a sync system that uses JSP and
Cassandra (or Mongo), it does about the same load over 3 physical
machines, 16GB JVM’s and yes…. 



There were better developers involved on the second system
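Why sync chokes once blocking is introduced can be approximated with Little's law: if every request holds one thread for as long as it blocks, throughput cannot exceed the number of threads divided by the blocking time. The pool size of 200 threads below is an assumption for illustration, not a figure from the load test; the 50 ms and 350 ms delays are simply picked from within the 10-100ms and 200-500ms ranges used above.

```java
class ThreadBoundSketch {
    // Upper bound on requests per second when each request occupies
    // one thread for blockMillis milliseconds.
    static double maxRequestsPerSecond(int threads, double blockMillis) {
        return threads * 1000.0 / blockMillis;
    }

    public static void main(String[] args) {
        int threads = 200; // assumed servlet thread pool size
        System.out.printf("50 ms blocking:  %.0f r/s%n", maxRequestsPerSecond(threads, 50));
        System.out.printf("350 ms blocking: %.0f r/s%n", maxRequestsPerSecond(threads, 350));
    }
}
```

Under these assumptions the ceiling drops from 4000 r/s to roughly 571 r/s. An async endpoint frees the request thread while the work is pending, so this particular ceiling no longer applies to it.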
Onto the last part!
• Let’s make that business thing async too!
• We’ll make it an almost “BPEL’ey” process.
• WS call from our main web service, response coming via a queue
• Now we could solve this over JMS request/reply, but we want to cluster too.
[Diagram: JVM, Actor, Business Logic]
What is it we want to solve?
lWe want some storage
lWe treat that as completely transient
lWe want to share data between nodes
lWe don’t want to re-invent the wheel
Hazelcast
We really need a memory grid!
• We can use Hazelcast as it has neat side effects
Init()
loadConfig()
A Hazelcast Config
• You can set up Hazelcast
  • To be unicast
  • Multicast
  • EC2 aware
  • Implement persistence if you want
  • Use existing adaptors for things like Hibernate
Why use Hazelcast?
Could we do this differently?
• JMS request/reply
  • But then you need to watch for
  • Connectivity, queue speed, timeout
  • If you don’t use Camel it is
    § Significantly more complex code
    § More error prone
• And a grid gets interesting further down the road
• This grid could be used for “slip” patterns, to park transactional stuff, or as a lock / mutex
And the callback
lWe rely on HazelCast to inform us
And once we correlate
lWe can now resume our Webservice out of band!
What happened there?
lThe entry listener is the nice part
All done.
[Diagram: JVM, Actor, Business Logic, Business Logic, Business Logic, MEMORY GRID, QUEUEING]
Summary
lWe had a monolithic app
lAnd we ended up with
• Replaced persistence with a more non blocking solution
• Replaced all of the synchronous web side
• Introduced queueing
• Introduced a memory grid
• We added asynchronous behavior to something BPEL’ey so that we can run
the process across multiple callers or multiple systems in parallel
Thank you!
