POC of a distributed architecture for financial calculations

Presentation given at the Open-XKE of Xebia France.
This is the result of a POC on building a distributed architecture for financial calculations.
It covers Scala, functional programming, Streams, the Iteratee pattern, Akka Actors and Akka Cluster.

  1. Distributed system for financial calculations, by Xavier Bucchiotty
  2. Me: @xbucchiotty, https://github.com/xbucchiotty, http://blog.xebia.fr/author/xbucchiotty
  3. Build a testable, composable and scalable cash-flow system
  4. Step 1: Stream API. Step 2: Iteratees. Step 3: Akka actor. Step 4: Akka cluster.
  5. Use case: financial debt management
  6. CAUTION
  7. initial = 1000 €, duration = 5 years, fixed interest rate = 5%
      Date        Amort  Interests  Outstanding
      2013-01-01  200 €  50 €       800 €
      2014-01-01  200 €  40 €       600 €
      2015-01-01  200 €  30 €       400 €
      2016-01-01  200 €  20 €       200 €
      2017-01-01  200 €  10 €       0 €
  8. Same schedule as slide 7, with the rule: date = last date + (1 year)
  9. Same schedule as slide 7, with the rule: amort = initial / duration
  10. Same schedule as slide 7, with the rule: outstanding = last outstanding - amort
  11. Same schedule as slide 7, with the rule: interests = last outstanding * rate
  12. val f = (last: Row) => new Row {
        def date = last.date + (1 year)
        def amortization = last.amortization
        def outstanding = last.outstanding - amortization
        def interests = last.outstanding * fixedRate
      }
  13. Step 1: Stream API
  14. Date        Amort  Interests  Outstanding
      2013-01-01  200 €  50 €       800 €
      2014-01-01  200 €  40 €       600 €
      2015-01-01  200 €  30 €       400 €
      2016-01-01  200 €  20 €       200 €
      2017-01-01  200 €  10 €       0 €
  15. Row          Date        Amort  Interests  Outstanding
      first        2013-01-01  200 €  50 €       800 €
      f(first)     2014-01-01  200 €  40 €       600 €
      f(f(first))  2015-01-01  200 €  30 €       400 €
                   2016-01-01  200 €  20 €       200 €
                   2017-01-01  200 €  10 €       0 €
  16. case class Loan( ... ) {
        def first: Row
        def f: (Row => Row)
        def rows = Stream.iterate(first)(f).take(duration)
      }
  17. case class Portfolio(loans: Seq[Loan]) {
        def rows = loans.toStream.flatMap(_.rows)
      }
  18. Total paid for Loan 1, Loan 2 and Loan 3, which share the same schedule:
      Date        Amort  Interests  Total paid
      2013-01-01  200 €  50 €       250 €
      2014-01-01  200 €  40 €       240 €
      2015-01-01  200 €  30 €       230 €
      2016-01-01  200 €  20 €       220 €
      2017-01-01  200 €  10 €       210 €
      Total (3 loans)               3450 €
  19. // Produce rows
      val totalPaid = portfolio.rows
        // Transform rows to amount
        .map(row => row.interests + row.amortization)
        // Consume amount
        .foldLeft(0 EUR)(_ + _)
  20. // Produce rows
      val totalPaid = portfolio.rows
        // Transform rows to amount
        .map(row => row.interests + row.amortization)
        // Consume amount
        .foldLeft(0 EUR)(_ + _)
      type RowProducer = Iterable[Row]
      type RowTransformer[T] = (Row=>T)
      type AmountConsumer[T] = (Iterable[Amount]=>T)
  21. RowProducer (Iterable[Row])
      // Loan
      Stream.iterate(first)(f) take duration
      // Portfolio
      loans => loans flatMap (loan => loan.rows)
      + on demand computation
      - sequential computation
  22. RowTransformer (Row => T)
      object RowTransformer {
        val totalPaid = (row: Row) => row.interests + row.amortization
      }
      + function composition
      - type limited to «map»
  23. AmountConsumer (Iterable[Amount] => T)
      object AmountConsumer {
        def sum = (rows: Iterable[Amount]) =>
          rows.foldLeft(Amount(0, EUR))(_ + _)
      }
      + function composition
      - synchronism
  24. Step 1 (Stream API): 5000 loans, 50 rows: ~ 560 ms
  25. Pros: on demand computation, function composition
      Cons: sequential computation, synchronism, transformation limited to «map»
  26. Step 2: Iteratees
  27. Integrating Play iteratees
      libraryDependencies ++= Seq(
        "com.typesafe.play" %% "play-iteratees" % "2.2.0-RC2"
      )
  28. Diagram: the producer (Enumerator) sends Input to the consumer (Iteratee); the Iteratee returns a Status
  29. Same diagram, annotated: Iteratees are immutable, asynchronous by design, type safe
  30. Enumerator: enumerate and interleave
  31. Data producer
      case class Loan(initial: Amount, duration: Int, rowIt: RowIt) {
        def rows(implicit ctx: ExecutionContext) =
          Enumerator.enumerate(
            Stream.iterate(first)(f).take(duration)
          )
      }
  32. Producers can be combined
      case class Portfolio(loans: Seq[Loan]) {
        def rows(implicit ctx: ExecutionContext) =
          Enumerator.interleave(loans.map(_.rows))
      }
  33. Total paid for three loans that share the same schedule:
      Date        Amort  Interests  Total paid
      2013-01-01  200 €  50 €       250 €
      2014-01-01  200 €  40 €       240 €
      2015-01-01  200 €  30 €       230 €
      2016-01-01  200 €  20 €       220 €
      2017-01-01  200 €  10 €       210 €
      Total (3 loans)               3450 €
  34. Iteratee: a consumer as a state machine
  35. Iteratees consume Input
  36. object Input {
        case class El[+E](e: E)
        case object Empty
        case object EOF
      }
  37. and propagate a state
  38. object Step {
        case class Done[+A, E](a: A, remaining: Input[E])
        case class Cont[E, +A](k: Input[E] => Iteratee[E, A])
        case class Error[E](msg: String, input: Input[E])
      }
  39. Diagram: the Enumerator sends Input El(...) to an Iteratee (count = 0), which computes a new Iteratee (count = 1). Status: Continue
  40. Diagram: the Enumerator sends Input EOF to an Iteratee (count = 1), which computes an Iteratee (count = 1). Status: Done
  41. Diagram: the Enumerator sends Input El(...) to an Iteratee (count = 1), which computes an Iteratee (error = "Runtime Error"). Status: Error
  42. val last: RowConsumer[Option[Row]] = {
        def step(last: Option[Row]): K[Row, Option[Row]] = {
          case Input.Empty => Cont(step(last))
          case Input.EOF => Done(last, Input.EOF)
          case Input.El(e) => Cont(step(Some(e)))
        }
        Cont(step(Option.empty[Row]))
      }
  43. object AmountConsumer {
        val sum: AmountConsumer[Amount] =
          (rows: Iterable[Amount]) =>
            rows.foldLeft(Amount(0, EUR))(_ + _)
      }
  44. object AmountConsumer {
        val sum: AmountConsumer[Amount] =
          Iteratee.fold[Amount, Amount](Amount(0, EUR))(_ + _)
      }
  45. import RowTransformer.totalPaid
      import AmountConsumer.sum
      val totalPaidComputation: Future[Amount] =
        portfolio.rows.run(sum)
  46. import RowTransformer.totalPaid
      import AmountConsumer.sum
      val totalPaidComputation: Future[Amount] =
        portfolio.rows |>>> sum
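  47. Enumeratee: map and filter
  48. Diagram: the producer (Enumerator) sends Input to the consumer (Iteratee); the Iteratee returns a Status
  49. Diagram: the producer (Enumerator) sends Input[A] to the transformation (Enumeratee), which sends Input[B] to the consumer (Iteratee); the Iteratee returns a Status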
  50. Data transformation
      object RowTransformer {
        val totalPaid = Enumeratee.map[Row](row =>
          row.interests + row.amortization
        )
      }
  51. Data filtering
      def until(date: DateMidnight) = Enumeratee.filter[Row](
        row => !row.date.isAfter(date)
      )
  52. // Stream API
      type RowProducer = Iterable[Row]
      type RowTransformer[T] = (Row=>T)
      type AmountConsumer[T] = (Iterable[Amount]=>T)
      // Iteratees
      type RowProducer = Enumerator[Row]
      type RowTransformer[T] = Enumeratee[Row, T]
      type AmountConsumer[T] = Iteratee[Amount, T]
  53. Futures are composable: map, flatMap, filter, onComplete, onSuccess, onFailure, recover
  54. // Produce rows
      val totalPaidComputation: Future[Amount] =
        portfolio.rows &> totalPaid |>>> sum
      // Blocking the thread to wait for the result
      val totalPaid = Await.result(
        totalPaidComputation, atMost = defaultTimeout)
      totalPaid should equal(3480 EUR)
  55. We still have function composition, and the code is now ready for asynchronism
  56. RowProducer
      // Loan
      Enumerator.enumerate(
        Stream.iterate(first)(f).take(duration)
      )
      // Portfolio
      Enumerator.interleave(loans.map(_.rows))
      + on demand computation
      + parallel computation
  57. RowTransformer
      val totalPaid = Enumeratee.map[Row](row =>
        row.interests + row.amortization
      )
      + function composition
      + map, filter, ...
  58. AmountConsumer
      def sum = Iteratee.fold[Amount, Amount](Amount(0, EUR))(_ + _)
      + function composition
      + asynchronism
  59. Step                 5000 loans, 50 rows
      Step 1 (Stream API)  ~ 560 ms
      Step 2 (Iteratees)   ~ 3500 ms ?
  60. Simple test vs. complex test. The complex test adds a pause:
      Thread.sleep(((Math.random() * 1000) % 2).toLong)
  61. Step                 5000 loans, 50 rows  with pause
      Step 1 (Stream API)  ~ 560 ms             ~ 144900 ms
      Step 2 (Iteratees)   ~ 3500 ms            ~ 157285 ms ?
  62. The cost of using this implementation of iteratees is greater than the gain from interleaving for such small operations
  63. Bulk interleaving
  64. // Portfolio
      val split = loans.map(_.stream)
        .grouped(loans.size / 4)
  65. Step                 5000 loans, 50 rows  with pause
      Step 1 (Stream API)  ~ 560 ms             ~ 144900 ms
      Step 2 (Iteratees)   ~ 4571 ms            ~ 39042 ms
  66. Pros: on demand computation, function composition
      Cons: sequential computation, synchronism, transformation limited to «map»
  67. Pros: on demand computation, function composition
      Cons: sequential computation, synchronism
  68. Pros: on demand computation, function composition, parallel computation, asynchronism
      Cons: no error management, no elasticity, no resilience
  69. Step 3: Akka actor
  70. Integrating Akka
      libraryDependencies ++= Seq(
        "com.typesafe.akka" %% "akka-actor" % "2.2.0"
      )
  71. Actors are objects. They communicate with each other asynchronously, through messages.
  72. class Backend extends Actor {
        def receive = {
          case Compute(loan) =>
            sender.tell(
              msg = loan.stream.toList,
              sender = self)
        }
      }
      case class Compute(loan: Loan)
  73. case class Loan( ... ) {
        def rows(implicit calculator: ActorRef, ctx: ExecutionContext) = {
          val responseFuture = ask(calculator, Compute(this))
          val rowsFuture = responseFuture.mapTo[List[Row]]
          rowsFuture.map(Enumerator.enumerate(_))
        }
      }
  74. val system = ActorSystem.create("ScalaIOSystem")
      val calculator = system.actorOf(
        Props[Backend].withRouter(
          RoundRobinRouter(nrOfInstances = 10)
        ), "calculator")
  75. Supervision
      val simpleStrategy = OneForOneStrategy() {
        case _: AskTimeoutException => Resume
        case _: RuntimeException => Escalate
      }
      system.actorOf(Props[Backend]
        ...
        .withSupervisorStrategy(simpleStrategy)),
        "calculator")
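  76. Diagram: a Compute message and a Router with Routee 1, Routee 2 and Routee 3
  77. Diagram: AskTimeoutException on Routee 1, answered with Resume; Router with Routee 1, Routee 2 and Routee 3
  78. Diagram: an Actor System containing the Router with Routee 1, Routee 2 and Routee 3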
  79. RowProducer
      // Loan
      ask(calculator, Compute(this))
        .mapTo[List[Row]]
        .map(Enumerator.enumerate(_))
      // Portfolio
      Enumerator.interleave(loans.map(_.rows))
      + parallel computation
      - on demand computation
  80. RowTransformer
      val totalPaid = Enumeratee.map[Row](row =>
        row.interests + row.amortization
      )
      + nothing changed
  81. AmountConsumer
      def sum = Iteratee.fold[Amount, Amount](Amount(0, EUR))(_ + _)
      + nothing changed
  82. Step                 5000 loans, 50 rows  with pause
      Step 1 (Stream API)  ~ 560 ms             ~ 144900 ms
      Step 2 (Iteratees)   ~ 4571 ms            ~ 39042 ms
      Step 3 (Akka actor)  ~ 4271 ms            ~ 40882 ms
  83. Pros: on demand computation, function composition, parallel computation, asynchronism
      Cons: no error management, no elasticity, no resilience
  84. Pros: function composition, parallel computation, asynchronism
      Cons: no elasticity, no resilience
      In transition: no error management, on demand computation
  85. Pros: error management, function composition, parallel computation, asynchronism
      Cons: no on demand computation, no elasticity, no resilience
  86. Step 4: Akka cluster
  87. Integrating Akka Cluster
      libraryDependencies ++= Seq(
        "com.typesafe.akka" %% "akka-cluster" % "2.2.0"
      )
  88. Cluster Router: ClusterRouterConfig
      - Can create actors on different nodes of the cluster
      - Role
      - Local actors or not
      - Control the number of actors per node, per system
  89. Cluster Router: AdaptiveLoadBalancingRouter
      Collects metrics (CPU, heap, load) via JMX or Hyperic Sigar and balances the load accordingly
  90. // Step 3: local router
      val calculator = system.actorOf(
        Props[Backend].withRouter(
          RoundRobinRouter(nrOfInstances = 10)
        ), "calculator")
      // Step 4: cluster router
      val calculator = system.actorOf(
        Props[Backend].withRouter(
          ClusterRouterConfig(
            local = localRouter,
            settings = clusterSettings)
        ), "calculator")
  91. Diagram (Elasticity): one Router with its routees spread across three Actor Systems
  92. application.conf
      cluster {
        seed-nodes = [
          "akka.tcp://ScalaIOSystem@127.0.0.1:2551",
          "akka.tcp://ScalaIOSystem@127.0.0.1:2552"
        ]
        auto-down = on
      }
  93. Diagram (Resilience): one Router with its routees spread across three Actor Systems
  94. RowProducer
      // Loan
      ask(calculator, Compute(this))
        .mapTo[List[Row]]
        .map(Enumerator.enumerate(_))
      // Portfolio
      Enumerator.interleave(loans.map(_.rows))
      + nothing changed
  95. RowTransformer
      val totalPaid = Enumeratee.map[Row](row =>
        row.interests + row.amortization
      )
      + nothing changed
  96. AmountConsumer
      def sum = Iteratee.fold[Amount, Amount](Amount(0, EUR))(_ + _)
      + nothing changed
  97. Pros: error management, function composition, parallel computation, asynchronism
      Cons: no on demand computation, no elasticity, no resilience
  98. Pros: error management, function composition, parallel computation, asynchronism
      Cons: no on demand computation
      In transition: no elasticity, no resilience
  99. Pros: error management, function composition, parallel computation, asynchronism, elasticity, resilience
      Cons: no on demand computation, network serialization
  100. 1 node / 2 actors
       Step                   5000 loans, 50 rows  with pause
       Step 1 (Stream API)    ~ 560 ms             ~ 144900 ms
       Step 2 (Iteratees)     ~ 4571 ms            ~ 39042 ms
       Step 3 (Akka actor)    ~ 4271 ms            ~ 40882 ms
       Step 4 (Akka cluster)  ~ 6213 ms            ~ 77957 ms
  101. 2 nodes / 4 actors
       Step                   5000 loans, 50 rows  with pause
       Step 1 (Stream API)    ~ 560 ms             ~ 144900 ms
       Step 2 (Iteratees)     ~ 4571 ms            ~ 39042 ms
       Step 3 (Akka actor)    ~ 4271 ms            ~ 40882 ms
       Step 4 (Akka cluster)  ~ 5547 ms            ~ 39695 ms
  102. Conclusion
  103. Step 1 (Stream API): powerful library, low memory, performance when single threaded
       Step 2 (Iteratees): elegant API, enable asynchronism and parallelism
       Step 3 (Akka actor): error management, control on parallel execution via configuration
       Step 4 (Akka cluster): elasticity, resilience, monitoring
  104. It's all about trade-offs
  105. But do you really need distribution?
  106. Hot subject
       - Recent blog post from «Mandubian» on Scalaz stream machines and iteratees [1]
       - Recent presentation from «Heather Miller» on spores (distributable closures) [2]
       - Recent release of Scala 2.10.3, with performance optimization of Promise
       - Release candidate of the play-iteratee module, with performance optimization
       - Lots of stuff on the roadmap of Akka cluster 2.3.0
  107. Hot subject
       [1]: http://mandubian.com/2013/08/21/playztream/
       [2]: https://speakerdeck.com/heathermiller/on-pickles-and-sporesimproving-support-for-distributed-programming-in-scala
  108. Thank you for watching!
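
The Step 1 pipeline (slides 12 to 19) can be sketched as a small self-contained program. This is a simplified illustration, not the author's code: the `CashFlowSketch` wrapper, the `year: Int` field and the use of `BigDecimal` in place of the deck's `Amount` and date types are assumptions made so that the example runs without extra libraries. `Stream` is used to match the slides; on Scala 2.13 and later it is deprecated in favour of `LazyList`.

```scala
object CashFlowSketch {
  // Assumption: amounts are plain BigDecimal euros and dates are reduced to a year.
  final case class Row(year: Int,
                       amortization: BigDecimal,
                       interests: BigDecimal,
                       outstanding: BigDecimal)

  final case class Loan(initial: BigDecimal, duration: Int, rate: BigDecimal, startYear: Int) {
    private val amortization: BigDecimal = initial / duration

    // First row: interest is charged on the full initial amount.
    def first: Row = Row(startYear, amortization, initial * rate, initial - amortization)

    // Each row is derived from the previous one only (slide 12).
    val f: Row => Row = last => Row(
      year = last.year + 1,
      amortization = last.amortization,
      interests = last.outstanding * rate,
      outstanding = last.outstanding - last.amortization)

    // Lazily produced schedule (slide 16).
    def rows: Stream[Row] = Stream.iterate(first)(f).take(duration)
  }

  final case class Portfolio(loans: Seq[Loan]) {
    // Rows of all loans, one loan after the other (slide 17).
    def rows: Stream[Row] = loans.toStream.flatMap(_.rows)
  }

  val loan: Loan = Loan(BigDecimal(1000), 5, BigDecimal("0.05"), 2013)

  // Produce rows, transform them to amounts, consume the amounts (slide 19).
  val totalPaid: BigDecimal =
    Portfolio(Seq(loan, loan, loan)).rows
      .map(row => row.interests + row.amortization)
      .foldLeft(BigDecimal(0))(_ + _)
}
```

With three copies of the 1000 € loan, `CashFlowSketch.totalPaid` evaluates to 3450, the total shown on slide 18.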
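
The "consumer as a state machine" idea from slides 34 to 44 can be illustrated without the Play library. The sketch below is a deliberately reduced model and not Play's API: `IterateeSketch`, its `fold` and its `run` driver are hypothetical names, the continuation returns the next `Step` directly, everything is synchronous, and the `Error` state is left out.

```scala
object IterateeSketch {
  // What a producer can send to a consumer (slide 36).
  sealed trait Input[+E]
  object Input {
    final case class El[+E](e: E) extends Input[E]
    case object Empty extends Input[Nothing]
    case object EOF extends Input[Nothing]
  }

  // State returned by the consumer after each input (simplified from slide 38).
  sealed trait Step[E, +A]
  final case class Done[E, +A](a: A) extends Step[E, A]
  final case class Cont[E, +A](k: Input[E] => Step[E, A]) extends Step[E, A]

  // A folding consumer: every element yields a new immutable state.
  def fold[E, A](zero: A)(op: (A, E) => A): Step[E, A] = {
    def step(acc: A): Input[E] => Step[E, A] = {
      case Input.El(e) => Cont[E, A](step(op(acc, e)))
      case Input.Empty => Cont[E, A](step(acc))
      case Input.EOF   => Done[E, A](acc)
    }
    Cont[E, A](step(zero))
  }

  // Hypothetical synchronous driver: feeds every element, then EOF.
  def run[E, A](elements: Iterator[E], consumer: Step[E, A]): A = {
    @scala.annotation.tailrec
    def loop(state: Step[E, A]): A = state match {
      case Done(a) => a
      case Cont(k) if elements.hasNext => loop(k(Input.El(elements.next())))
      case Cont(k) => k(Input.EOF) match {
        case Done(a) => a
        case Cont(_) => sys.error("consumer did not finish on EOF")
      }
    }
    loop(consumer)
  }

  // Counterpart of AmountConsumer.sum, with BigDecimal standing in for Amount.
  val sum: Step[BigDecimal, BigDecimal] =
    fold[BigDecimal, BigDecimal](BigDecimal(0))(_ + _)

  val amounts: List[BigDecimal] = List(250, 240, 230, 220, 210).map(n => BigDecimal(n))
  val total: BigDecimal = run(amounts.iterator, sum)
}
```

Because each state is a value, the same `sum` consumer can be run again on another producer without being reset, which is the property the deck relies on when it says iteratees are immutable.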
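
Slides 62 to 65 explain that interleaving row by row costs more than it gains, and that grouping the loans into a few bulks fixes this. A rough sketch of that grouping idea follows. It swaps the deck's `Enumerator.interleave` for plain `scala.concurrent.Future` so that it needs no dependency, and `BulkInterleavingSketch`, `paidAmounts` and the integer loan ids are hypothetical stand-ins for the real `Loan` rows.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object BulkInterleavingSketch {
  // Assumption: every loan pays the five amounts of the slide 18 schedule.
  def paidAmounts(loanId: Int): Seq[Int] = Seq(250, 240, 230, 220, 210)

  def totalPaid(loanIds: Seq[Int], chunks: Int): Int = {
    // Guard against a chunk size of 0, which grouped rejects.
    val chunkSize = math.max(1, loanIds.size / chunks)

    // One asynchronous task per bulk of loans instead of one per loan.
    // When the size is not a multiple of `chunks`, grouped yields one extra, smaller bulk.
    val partialSums: List[Future[Int]] =
      loanIds.grouped(chunkSize).toList.map { chunk =>
        Future(chunk.flatMap(paidAmounts).sum)
      }

    // Blocking only at the very end, as on slide 54.
    Await.result(Future.sequence(partialSums), 10.seconds).sum
  }
}
```

Calling `BulkInterleavingSketch.totalPaid(1 to 5000, chunks = 4)` schedules four tasks of 1250 loans each and returns 5750000, that is 5000 loans at 1150 each.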
