HBase RowKey design for Akka Persistence

Short lightning talk about the HBase plugin for Akka Persistence and how its row key design was specifically tuned for increasing numeric sequential identifiers, so that the cluster can be utilised properly.

https://github.com/ktoso/akka-persistence-hbase

Published in: Technology

  1. 1. RowKey design in HBase
 Akka Persistence Plugin Konrad 'ktoso' Malawski GeeCON 2014 @ Kraków, PL Konrad `@ktosopl` Malawski
  2. 2. Konrad `@ktosopl` Malawski hAkker @
  3. 3. Konrad `@ktosopl` Malawski typesafe.com geecon.org Java.pl / KrakowScala.pl sckrk.com / meetup.com/Paper-Cup @ London GDGKrakow.pl meetup.com/Lambda-Lounge-Krakow hAkker @
  4. 4. PersistentActor Akka  Persistence  ScalaDays  2014
  5. 5. PersistentActor
  6. 6. Compared to “Good ol’ CRUD Model”
     CRUD: state is a “Mutable Record”
     Event sourcing: state = apply(es), a “Series of Events”
  7. 7. super quick domain modelling!
     Commands - what others “tell” us; not persisted
         sealed trait Command
         case class GiveMe(geeCoins: Int) extends Command
         case class TakeMy(geeCoins: Int) extends Command
     State - reflection of a series of events
         case class Wallet(geeCoins: Int) {
           def updated(diff: Int) = Wallet(geeCoins + diff)
         }
     Events - reflect effects, past tense; persisted
         sealed trait Event
         case class BalanceChangedBy(geeCoins: Int) extends Event
  8. 8. PersistentActor (var state = S0, persistenceId = “a”); Command; Journal
  9. 9. PersistentActor (var state = S0, persistenceId = “a”); Journal; Generate Events
  10. 10. PersistentActor (var state = S0, persistenceId = “a”); Journal; Generate Events: E1
  11. 11. PersistentActor (var state = S0, persistenceId = “a”); Journal: E1; ACK “persisted”
  12. 12. PersistentActor (var state = S0, persistenceId = “a”); Journal: E1; “Apply” event E1
  13. 13. PersistentActor (var state = S0, persistenceId = “a”); Journal: E1; E1 Okey!
  15. 15. PersistentActor (var state = S0, persistenceId = “a”); Journal: E1; E1 Ok, he got my $.
  16. 16. PersistentActor
         class BitCoinWallet extends PersistentActor {
           var state = Wallet(geeCoins = 0)
           def updateState(e: Event): Wallet = e match {
             case BalanceChangedBy(coins) => state.updated(coins)
           }
           // API:
           def receiveCommand = ??? // TODO
           def receiveRecover = ??? // TODO
         }
  17. 17. persist(e) { e => }
  18. 18. PersistentActor
         def receiveCommand = {
           case TakeMy(coins) =>
             persist(BalanceChangedBy(coins)) { changed => // async callback
               state = updateState(changed)
             }
         }
  19. 19. persist(){} - Ordering guarantees: var state = S0, persistenceId = “a”; Journal: E1; C1 C2 C3
  20. 20. persist(){} - Ordering guarantees: var state = S0, persistenceId = “a”; Journal: E1; C1 C2 C3. Commands get “stashed” until C1’s events have been processed.
  21. 21. persist(){} - Ordering guarantees: var state = S0, persistenceId = “a”; Journal; C1 C2 C3; E1 E2, E2 E1: events get applied in-order
  22. 22. persist(){} - Ordering guarantees: var state = S0, persistenceId = “a”; Journal; C2 C3; E1 E2, E2 E1: and the cycle repeats
  23. 23. Recovery
  24. 24. Eventsourced, recovery (re-using updateState, as seen in receiveCommand)
         /** MUST NOT SIDE-EFFECT! */
         def receiveRecover = {
           case replayedEvent: Event =>
             state = updateState(replayedEvent)
         }
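The idea that state is just the result of applying events in order (state = apply(es)) can be illustrated without Akka at all. The following is a minimal plain-Scala sketch, not the plugin's or Akka's code: `RecoverySketch`, the two-argument `updateState` and `recover` are hypothetical names introduced only for illustration, reusing the `Wallet` and `BalanceChangedBy` shapes from the domain model slide.

```scala
// Plain-Scala sketch (no Akka dependency): recovery as a left fold over events.
// All names in this object are illustrative assumptions, not Akka APIs.
object RecoverySketch {
  sealed trait Event
  final case class BalanceChangedBy(geeCoins: Int) extends Event

  final case class Wallet(geeCoins: Int) {
    def updated(diff: Int): Wallet = Wallet(geeCoins + diff)
  }

  // The same pure function serves live events and replayed events.
  def updateState(state: Wallet, event: Event): Wallet = event match {
    case BalanceChangedBy(coins) => state.updated(coins)
  }

  // Replay the journal in order, starting from the empty wallet.
  def recover(journal: Seq[Event]): Wallet =
    journal.foldLeft(Wallet(0))(updateState)
}
```

Because `updateState` has no side effects, replaying the same journal always yields the same wallet, which is why the recovery handler must not side-effect.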
  25. 25. Snapshots
  26. 26. Snapshots (in SnapshotStore)
  27. 27. Eventsourced, snapshots
         def receiveCommand = {
           case command: Command =>
             saveSnapshot(state) // async!
         }
         /** MUST NOT SIDE-EFFECT! */
         def receiveRecover = {
           case SnapshotOffer(meta, snapshot: Wallet) =>
             this.state = snapshot
           case replayedEvent: Event =>
             state = updateState(replayedEvent)
         }
     snapshot!? how?
  28. 28. Snapshots: …sum of states…; Journal: E1 E2 E3 E4 E5 E6 E7 E8
  29. 29. Snapshots: state until [E8]: S8; Journal: E1 E2 E3 E4 E5 E6 E7 E8; snapshot!; Snapshot Store
  30. 30. Snapshots: state until [E8]: S8; Journal: E1 E2 E3 E4 E5 E6 E7 E8; snapshot!; crash!; Snapshot Store: S8
  31. 31. Snapshots; Journal: E1 E2 E3 E4 E5 E6 E7 E8; crash!; Snapshot Store: S8
  32. 32. Snapshots: “bring me up-to-date!”; Journal: E1 E2 E3 E4 E5 E6 E7 E8; restart! replay!; Snapshot Store: S8
  33. 33. Snapshots: “bring me up-to-date!”; restart! replay! S8; Journal: E1 E2 E3 E4 E5 E6 E7 E8; Snapshot Store: S8
  34. 34. Snapshots: state until [E8]; restart! replay! S8; Journal: E1 E2 E3 E4 E5 E6 E7 E8; Snapshot Store: S8
  35. 35. Snapshots: state until [E8]: S8; Journal: E1 E2 E3 E4 E5 E6 E7 E8 (We could delete these!); Snapshot Store: S8
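The snapshot sequence above boils down to: start from the latest snapshot and replay only the events written after it. Here is a minimal plain-Scala sketch of that idea, assuming a running total as the state. `SnapshotSketch`, `Event`, `Snapshot` and `recover` are hypothetical names for illustration and are not the Akka Persistence API.

```scala
// Plain-Scala sketch: recover from an optional snapshot plus the journal tail.
// Types and names are illustrative assumptions, not Akka APIs.
object SnapshotSketch {
  final case class Event(sequenceNr: Long, delta: Int)
  final case class Snapshot(sequenceNr: Long, total: Int)

  def recover(snapshot: Option[Snapshot], journal: Seq[Event]): Int = {
    // Without a snapshot we start from the empty state at sequenceNr 0.
    val start = snapshot.getOrElse(Snapshot(0L, 0))
    journal
      .filter(_.sequenceNr > start.sequenceNr) // events the snapshot already covers are skipped
      .sortBy(_.sequenceNr)                    // events must be applied in order
      .foldLeft(start.total)((total, event) => total + event.delta)
  }
}
```

With a snapshot taken at sequence number 8, only events 9 and later are read, which is also why the events up to the snapshot could be deleted from the journal.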
  36. 36. Snapshots, save
         trait MySummer extends Processor {
           var nums: List[Int]
           var total: Int
           def receive = {
             case "snap" => saveSnapshot(total) // Async!
             case SaveSnapshotSuccess(metadata) => // ...
             case SaveSnapshotFailure(metadata, reason) => // ...
           }
         }
  37. 37. Snapshot Recovery
         class Counter extends Processor {
           var total = 0
           def receive = {
             case SnapshotOffer(metadata, snap: Int) =>
               total = snap
             case Persistent(payload, sequenceNr) => // ...
           }
         }
  38. 38. Persistence Plugins
  39. 39. Akka Persistence TCK
         class JournalTCKSpec extends JournalSpec {
           lazy val config = ConfigFactory.parseString("").
             withFallback(ConfigFactory.load())
         }

         class SnapshotStoreTCKSpec extends SnapshotStoreSpec {
           lazy val config = ConfigFactory.parseString("").
             withFallback(ConfigFactory.load())
         }
  40. 40. Journal Plugin API
         def asyncWriteMessages(
           messages: immutable.Seq[PersistentRepr]
         ): Future[Unit]

         def asyncDeleteMessagesTo(
           persistenceId: String,
           toSequenceNr: Long,
           permanent: Boolean
         ): Future[Unit]

         @deprecated("writeConfirmations will be removed.", since = "2.3.4")
         def asyncWriteConfirmations(
           confirmations: immutable.Seq[PersistentConfirmation]
         ): Future[Unit]

         @deprecated("asyncDeleteMessages will be removed.", since = "2.3.4")
         def asyncDeleteMessages(
           messageIds: immutable.Seq[PersistentId],
           permanent: Boolean
         ): Future[Unit]
  41. 41. Optimising writes
  42. 42. HBase table data layout a-0001,a-0002,a-0003 b-0001,b-0002,b-0003
  43. 43. HBase table data layout a-0001,a-0002,a-0003 b-0001,b-0002,b-0003 HBase regions
  44. 44. HBase table data layout a-0001,a-0002,a-0003 b-0001,b-0002,b-0003 HBase regions (lexicographical order)
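To make the lexicographical layout concrete, here is a small plain-Scala sketch under stated assumptions: regions are modelled as a sorted map from start key to a region name, and the two regions and their split point at "b" are hypothetical. It does not use the HBase client. It shows that with keys of the form id-sequenceNr, consecutive writes for one id sort next to each other and land in the same region.

```scala
import scala.collection.immutable.TreeMap

// Plain-Scala sketch of a lexicographically ordered key space split into regions.
// Region names and split points are made up for illustration.
object RegionLayoutSketch {
  // Naive row key: "<persistenceId>-<zero-padded sequenceNr>".
  def rowKey(persistenceId: String, sequenceNr: Long): String =
    f"$persistenceId-$sequenceNr%04d"

  // Hypothetical regions, identified by their (inclusive) start key.
  val regions: TreeMap[String, String] = TreeMap(
    ""  -> "region-1",
    "b" -> "region-2"
  )

  // A key belongs to the region with the greatest start key <= key.
  def regionFor(key: String): String =
    regions.filter { case (start, _) => start.compareTo(key) <= 0 }.last._2
}
```

Every write for id "a" resolves to the same region here, which is the hot region problem described on the following slides.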
  45. 45. Optimising writes a-0001,a-0002,a-0003 b-0001,b-0002,b-0003 id = “a” id = “b”
  47. 47. Optimising writes a-0001,a-0002,a-0003 b-0001,b-0002,b-0003 id = “a” id = “b” HOT region
  49. 49. Optimising writes a-0001,a-0002,a-0003 b-0001,b-0002,b-0003 id = “a” id = “b” impossible to spread load!
  50. 50. Optimising writes hot-spotting
  51. 51. Optimising writes hot-spotting utilise entire cluster
  52. 52. Optimising writes, id = “a”: “partition” seeds [HMaster]
     001-a-0001, 001-b-0001, 001-a-0051
     002-a-0002, 002-b-0002, 002-a-0052
     003-a-0003, 003-b-0003, 003-a-0053
     049-a-0049, 049-b-0049, 049-a-0099 …
  54. 54. Optimising writes, id = “a” [HMaster]: Write load spread to cluster!
     001-a-0001, 001-a-0051
     002-a-0002, 002-a-0052
     003-a-0003, 003-a-0053
     049-a-0049, 049-a-0099 …
  55. 55. Optimising writes, id = “a” [HMaster]: Writes to different regions!
     001-a-0001, 001-a-0051
     002-a-0002, 002-a-0052
     003-a-0003, 003-a-0053
     049-a-0049, 049-a-0099 …
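The key layout on these slides can be reproduced by prefixing each row key with a partition seed derived from the sequence number. The sketch below is an assumption-laden illustration, not the plugin's actual implementation: the partition count of 50 is only inferred from the slide's examples (51 maps to 001, 99 maps to 049), a plain modulo is assumed as the partition function, and `SaltedRowKeySketch` is a hypothetical name.

```scala
// Plain-Scala sketch of a "partition seed" prefix for sequential keys.
// PartitionCount and the modulo rule are assumptions inferred from the slides.
object SaltedRowKeySketch {
  val PartitionCount = 50

  // Consecutive sequence numbers map to consecutive partitions.
  def partition(sequenceNr: Long): Long = sequenceNr % PartitionCount

  // "<partition>-<persistenceId>-<sequenceNr>", zero-padded so that
  // lexicographical order matches numeric order within a partition.
  def rowKey(persistenceId: String, sequenceNr: Long): String =
    f"${partition(sequenceNr)}%03d-$persistenceId-$sequenceNr%04d"
}
```

Under this assumed rule, two consecutive events of the same actor never share a prefix, so a burst of writes for id “a” is spread over many key ranges instead of a single hot one. Note that with a plain modulo, sequence number 50 would land in partition 000; the real plugin may number its partitions differently.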
  56. 56. Optimising writes
  57. 57. Optimising reads a-0001,a-0002,a-0003 b-0001,b-0002,b-0003 id = “a” replay!
  58. 58. Optimising reads a-0001,a-0002,a-0003 b-0001,b-0002,b-0003 id = “a” replay! Ordered, batch read, super efficient!!!
  59. 59. Optimising writes, id = “a”: replay!
     001-a-0001, 001-a-0051
     002-a-0002, 002-a-0052
     003-a-0003, 003-a-0053
     049-a-0049, 049-a-0099 …
  65. 65. Optimising writes, id = “a”: replay! this is madness!
     001-a-0001, 001-a-0051
     002-a-0002, 002-a-0052
     003-a-0003, 003-a-0053
     049-a-0049, 049-a-0099 …
  66. 66. Optimising reads
  67. 67. Optimising recovery, id = “a”: replay!
     001-a-0001, 001-a-0051
     002-a-0002, 002-a-0052
     003-a-0003, 003-a-0053
     049-a-0049, 049-a-0099 …
  68. 68. Optimising recovery, id = “a”: replay! async! + re-sequence
     001-a-0001, 001-a-0051
     002-a-0002, 002-a-0052
     003-a-0003, 003-a-0053
     049-a-0049, 049-a-0099 …
  69. 69. Optimising recovery, id = “a”: replay! async! small batches + re-sequence
     001-a-0001, 001-a-0051
     002-a-0002, 002-a-0052
     003-a-0003, 003-a-0053
     049-a-0049, 049-a-0099 …
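Since one actor's events are now scattered across partitions, replay has to read the partitions asynchronously and put the results back into sequence-number order. The following plain-Scala sketch illustrates only the re-sequencing step under simplifying assumptions: an in-memory `Map` stands in for the HBase table, each partition is read in one go rather than in small batches, and all names (`ResequenceSketch`, `Stored`, `scanPartition`, `replay`, `replayBlocking`) are hypothetical.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Plain-Scala sketch: scan partitions concurrently, then re-sequence.
// The Map is a stand-in for the table; no HBase client is involved.
object ResequenceSketch {
  final case class Stored(sequenceNr: Long, payload: String)

  // Stand-in for one asynchronous scan over a single partition's key range.
  def scanPartition(table: Map[Int, Seq[Stored]], partition: Int): Future[Seq[Stored]] =
    Future(table.getOrElse(partition, Seq.empty))

  // Issue all scans, wait for all of them, then restore global order.
  def replay(table: Map[Int, Seq[Stored]], partitions: Seq[Int]): Future[Seq[Stored]] =
    Future
      .sequence(partitions.map(p => scanPartition(table, p)))
      .map(_.flatten.sortBy(_.sequenceNr))

  // Convenience wrapper for trying the sketch out synchronously.
  def replayBlocking(table: Map[Int, Seq[Stored]], partitions: Seq[Int]): Seq[Stored] =
    Await.result(replay(table, partitions), 5.seconds)
}
```

A real journal would more likely stream small batches and emit events as soon as the next expected sequence number is available, rather than collecting everything before sorting.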
  70. 70. Optimising recovery, id = “a”: replay! (to seqNr = 2)
     001-a-0001, 001-a-0051
     002-a-0002, 002-a-0052
     003-a-0003, 003-a-0053
     049-a-0049, 049-a-0099 …
  72. 72. Optimising recovery, id = “a”: replay! (to seqNr = 2); for short recovery = no need to check all servers!
     001-a-0001, 001-a-0051
     002-a-0002, 002-a-0052
     003-a-0003, 003-a-0053
     049-a-0049, 049-a-0099 …
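The point about short recoveries follows from the key scheme: if the partition is a function of the sequence number, a bounded replay range tells us exactly which partitions can contain matching rows. A minimal plain-Scala sketch, again assuming 50 partitions and a plain modulo rule (both inferred from the slides, not taken from the plugin); `PartitionPruningSketch` and `partitionsFor` are hypothetical names.

```scala
// Plain-Scala sketch: which partitions must be scanned for a replay range?
// PartitionCount and the modulo rule are assumptions, as in the key sketch.
object PartitionPruningSketch {
  val PartitionCount = 50

  def partitionsFor(fromSequenceNr: Long, toSequenceNr: Long): Seq[Long] =
    if (toSequenceNr < fromSequenceNr) Seq.empty
    else if (toSequenceNr - fromSequenceNr + 1 >= PartitionCount)
      0L until PartitionCount.toLong // range covers every partition
    else
      (fromSequenceNr to toSequenceNr).map(_ % PartitionCount).distinct.sorted
}
```

Replaying up to seqNr = 2 therefore touches only two partitions, while a long replay falls back to scanning all of them.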
  73. 73. Akka Persistence Plugins
     • Journals / Snapshot Stores (http://akka.io/community/)
     • Cassandra
     • HBase
     • Kafka
     • DynamoDB
     • MongoDB
     • shared LevelDB journal for testing
  74. 74. Links
     • http://akka.io
     • https://groups.google.com/forum/#!forum/akka-user
     • https://github.com/ktoso/akka-persistence-hbase
     • http://www.slideshare.net/alexbaranau/intro-to-hbase-internals-schema-design-for-hbase-users
     • http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
     • https://github.com/OpenTSDB/asynchbase
  75. 75. ©Typesafe 2014 – All Rights Reserved
