HBase RowKey design for Akka Persistence
Short lightning talk about the HBase plugin for Akka Persistence and how its row key design was specifically tuned for increasing numeric sequential identifiers, so that the cluster can be utilised properly.

https://github.com/ktoso/akka-persistence-hbase

Usage Rights

CC Attribution-NonCommercial-ShareAlike License

Presentation Transcript

  • RowKey design in HBase: Akka Persistence Plugin. Konrad 'ktoso' Malawski (`@ktosopl`), GeeCON 2014 @ Kraków, PL
  • Konrad `@ktosopl` Malawski: hAkker @ typesafe.com; geecon.org; Java.pl / KrakowScala.pl; sckrk.com / meetup.com/Paper-Cup @ London; GDGKrakow.pl; meetup.com/Lambda-Lounge-Krakow
  • PersistentActor Akka  Persistence  ScalaDays  2014
  • Compared to the “Good ol’ CRUD Model”: CRUD state is a “Mutable Record”; event-sourced state = apply(es), a “Series of Events”
  • super quick domain modelling!
    Commands - what others “tell” us; not persisted:
      sealed trait Command
      case class GiveMe(geeCoins: Int) extends Command
      case class TakeMy(geeCoins: Int) extends Command
    State - reflection of a series of events:
      case class Wallet(geeCoins: Int) {
        def updated(diff: Int) = Wallet(geeCoins + diff)
      }
    Events - reflect effects, past tense; persisted:
      sealed trait Event
      case class BalanceChangedBy(geeCoins: Int) extends Event
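The idea that state is "a reflection of a series of events" can be shown without Akka at all. The following is a minimal plain-Scala sketch; `WalletModel`, `applyEvent` and `replay` are hypothetical names introduced here for illustration and are not part of the talk or of the Akka API.

```scala
object WalletModel {
  sealed trait Event
  final case class BalanceChangedBy(geeCoins: Int) extends Event

  final case class Wallet(geeCoins: Int) {
    def updated(diff: Int): Wallet = Wallet(geeCoins + diff)
  }

  // Applies a single event to the current state and returns the new state.
  def applyEvent(state: Wallet, event: Event): Wallet = event match {
    case BalanceChangedBy(diff) => state.updated(diff)
  }

  // state = apply(es): fold the whole series of events over an initial state.
  def replay(initial: Wallet, events: Seq[Event]): Wallet =
    events.foldLeft(initial)(applyEvent)
}
```

Replaying `Seq(BalanceChangedBy(10), BalanceChangedBy(-3))` over `Wallet(0)` yields `Wallet(7)`; the current balance is never stored directly, only derived.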
  • PersistentActor: var state = S0, persistenceId = “a”; receives a Command. Journal: empty
  • PersistentActor: var state = S0, persistenceId = “a”; Generate Events: E1. Journal: empty
  • PersistentActor: var state = S0, persistenceId = “a”. Journal: E1; ACK “persisted”
  • PersistentActor: var state = S0, persistenceId = “a”; “Apply” event E1. Journal: E1
  • PersistentActor: var state = S0, persistenceId = “a”; E1 applied. Journal: E1. Okey!
  • PersistentActor: var state = S0, persistenceId = “a”; E1 applied. Journal: E1. Ok, he got my $.
  • PersistentActor
      class BitCoinWallet extends PersistentActor {
        var state = Wallet(geeCoins = 0)
        def updateState(e: Event): Wallet = e match {
          case BalanceChangedBy(coins) => state.updated(coins)
        }
        // API:
        def receiveCommand = ??? // TODO
        def receiveRecover = ??? // TODO
      }
  • persist(e) { e => }
  • PersistentActor
      def receiveCommand = {
        case TakeMy(coins) =>
          persist(BalanceChangedBy(coins)) { changed => // async callback
            state = updateState(changed)
          }
      }
  • persist(){} - Ordering guarantees: var state = S0, persistenceId = “a”; incoming commands C1 C2 C3. Journal: E1
  • persist(){} - Ordering guarantees: commands get “stashed” until C1’s events have been acted upon.
  • persist(){} - Ordering guarantees: Journal: E1 E2; events E1, E2 get applied in-order
  • persist(){} - Ordering guarantees: C2 is taken next while C3 waits, and the cycle repeats
  • Recovery
  • Eventsourced, recovery
      /** MUST NOT SIDE-EFFECT! */
      def receiveRecover = {
        case replayedEvent: Event =>
          state = updateState(replayedEvent)
      }
    re-using updateState, as seen in receiveCommand
  • Snapshots
  • Snapshots (in SnapshotStore)
  • Eventsourced, snapshots
      def receiveCommand = {
        case command: Command =>
          saveSnapshot(state) // async!
      }
      /** MUST NOT SIDE-EFFECT! */
      def receiveRecover = {
        case SnapshotOffer(meta, snapshot: Wallet) =>
          this.state = snapshot
        case replayedEvent: Event =>
          state = updateState(replayedEvent)
      }
    snapshot!? how?
  • Snapshots: Journal: E1 E2 E3 E4 E5 E6 E7 E8 (…sum of states…)
  • Snapshots: snapshot! S8 = state until [E8], saved to the Snapshot Store
  • Snapshots: crash! Journal: E1 E2 E3 E4 E5 E6 E7 E8; Snapshot Store: S8
  • Snapshots: restart! replay! “bring me up-to-date!”
  • Snapshots: the Snapshot Store offers S8 = state until [E8]
  • Snapshots: Journal E1 E2 E3 E4 E5 E6 E7 E8 is covered by S8. We could delete these!
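The recovery flow in the snapshot diagrams (start from the latest snapshot, then replay only the newer journal entries) can be sketched in plain Scala. This is an illustration under stated assumptions, not Akka's implementation: `Stored`, `Snapshot` and `recover` are hypothetical names, and the state is simplified to a running `Int` total.

```scala
object SnapshotRecovery {
  // Hypothetical journal entry: a sequence number plus a balance change.
  final case class Stored(sequenceNr: Long, diff: Int)
  // Hypothetical snapshot: the total as of (and including) sequenceNr.
  final case class Snapshot(sequenceNr: Long, total: Int)

  // Start from the snapshot if there is one, otherwise from an empty state,
  // then apply only the events that are newer than the snapshot, in order.
  def recover(snapshot: Option[Snapshot], journal: Seq[Stored]): Int = {
    val start = snapshot.getOrElse(Snapshot(0L, 0))
    journal
      .filter(_.sequenceNr > start.sequenceNr)
      .sortBy(_.sequenceNr)
      .foldLeft(start.total)((total, event) => total + event.diff)
  }
}
```

With a snapshot taken at sequence number 8, only events 9 and later are replayed, which is why the older journal entries become candidates for deletion.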
  • Snapshots, save (Async!)
      trait MySummer extends Processor {
        var nums: List[Int]
        var total: Int
        def receive = {
          case "snap" => saveSnapshot(total)
          case SaveSnapshotSuccess(metadata) => // ...
          case SaveSnapshotFailure(metadata, reason) => // ...
        }
      }
  • Snapshot Recovery
      class Counter extends Processor {
        var total = 0
        def receive = {
          case SnapshotOffer(metadata, snap: Int) =>
            total = snap
          case Persistent(payload, sequenceNr) => // ...
        }
      }
  • Persistence Plugins
  • Akka Persistence TCK
      class JournalTCKSpec extends JournalSpec {
        lazy val config = ConfigFactory.parseString("").
          withFallback(ConfigFactory.load())
      }
      class SnapshotStoreTCKSpec extends SnapshotStoreSpec {
        lazy val config = ConfigFactory.parseString("").
          withFallback(ConfigFactory.load())
      }
  • Journal Plugin API
      def asyncWriteMessages(
        messages: immutable.Seq[PersistentRepr]
      ): Future[Unit]
      def asyncDeleteMessagesTo(
        persistenceId: String,
        toSequenceNr: Long,
        permanent: Boolean
      ): Future[Unit]
      @deprecated("writeConfirmations will be removed.", since = "2.3.4")
      def asyncWriteConfirmations(
        confirmations: immutable.Seq[PersistentConfirmation]
      ): Future[Unit]
      @deprecated("asyncDeleteMessages will be removed.", since = "2.3.4")
      def asyncDeleteMessages(
        messageIds: immutable.Seq[PersistentId],
        permanent: Boolean
      ): Future[Unit]
  • Optimising writes
  • HBase table data layout: rows a-0001,a-0002,a-0003 and b-0001,b-0002,b-0003 are stored in HBase regions (lexicographical order)
  • Optimising writes: id = “a” writes a-0001,a-0002,a-0003; id = “b” writes b-0001,b-0002,b-0003
  • Optimising writes: the region receiving the writes becomes a HOT region
  • Optimising writes: impossible to spread load!
  • Optimising writes: hot-spotting
  • Optimising writes: instead of hot-spotting, utilise entire cluster
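Why sequential keys hot-spot can be seen with a small plain-Scala model of lexicographically ordered regions. This is only a sketch: `NaiveKeys`, `rowKey` and `regionOf` are hypothetical names, and the single split point `"b"` is an assumed region boundary chosen for illustration, not taken from a real cluster.

```scala
object NaiveKeys {
  // "<persistenceId>-<zero-padded sequenceNr>", as on the slides (a-0001, a-0002, ...).
  // Zero padding keeps lexicographic order equal to numeric order.
  def rowKey(persistenceId: String, sequenceNr: Long): String =
    f"$persistenceId-$sequenceNr%04d"

  // A key belongs to the region whose index equals the number of
  // split points that sort at or before the key.
  def regionOf(key: String, splitPoints: Seq[String]): Int =
    splitPoints.count(_.compareTo(key) <= 0)
}
```

With split points `Seq("b")`, every key for id “a” maps to region 0 no matter how many events are written, so a single region takes all of that writer's load.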
  • Optimising writes: “partition” seeds [HMaster]; rows for id = “a” and id = “b”:
     001-a-0001, 001-b-0001, 001-a-0051
     002-a-0002, 002-b-0002, 002-a-0052
     003-a-0003, 003-b-0003, 003-a-0053
     049-a-0049, 049-b-0049, 049-a-0099 …
  • Optimising writes: Write load spread to cluster! [HMaster]; rows for id = “a”:
     001-a-0001, 001-a-0051
     002-a-0002, 002-a-0052
     003-a-0003, 003-a-0053
     049-a-0049, 049-a-0099 …
  • Optimising writes: Writes to different regions! [HMaster]
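The keys on these slides are consistent with a prefix computed as the sequence number modulo a partition count (0051 lands under 001, 0099 under 049). The sketch below reproduces that layout in plain Scala. The partition count of 50 is an assumption inferred from the example keys, and `PartitionedKeys`, `partitionOf` and `rowKey` are hypothetical names; the real plugin's key format and padding may differ.

```scala
object PartitionedKeys {
  // Assumption: 50 partitions, inferred from the keys shown on the slides.
  val PartitionCount = 50

  // The "partition seed": consecutive sequence numbers rotate through all partitions.
  def partitionOf(sequenceNr: Long): Long = sequenceNr % PartitionCount

  // "<partition>-<persistenceId>-<sequenceNr>", zero-padded as on the slides.
  def rowKey(persistenceId: String, sequenceNr: Long): String =
    f"${partitionOf(sequenceNr)}%03d-$persistenceId-$sequenceNr%04d"
}
```

For example, `rowKey("a", 1L)` gives `001-a-0001` and `rowKey("a", 51L)` gives `001-a-0051`. Because the prefix leads the key, consecutive writes from one persistent actor sort into different key ranges and can therefore be served by different regions.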
  • Optimising writes
  • Optimising reads: with rows a-0001,a-0002,a-0003 and b-0001,b-0002,b-0003, replay! for id = “a” is an ordered, batch read, super efficient!!!
  • Optimising writes: replay! for id = “a” now walks partition after partition (001-a-0001, 001-a-0051; 002-a-0002, 002-a-0052; 003-a-0003, 003-a-0053; 049-a-0049, 049-a-0099…): this is madness!
  • Optimising reads
  • Optimising recovery: replay! for id = “a” over 001-a-0001, 001-a-0051; 002-a-0002, 002-a-0052; 003-a-0003, 003-a-0053; 049-a-0049, 049-a-0099…
  • Optimising recovery: async! replay! + re-sequence
  • Optimising recovery: async! small batches, replay! + re-sequence
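The "async replay + re-sequence" idea can be sketched with plain Scala futures: scan every partition concurrently, then restore sequence-number order before handing events back. This is an illustration only. `PartitionedReplay`, `Row`, `scanPartition` and `replay` are hypothetical names, an in-memory `Seq` stands in for the HBase table, and the small-batch reading mentioned on the slide is left out for brevity.

```scala
import scala.concurrent.{ExecutionContext, Future}

object PartitionedReplay {
  // Hypothetical decoded row key: partition prefix, persistenceId, sequence number.
  final case class Row(partition: Int, persistenceId: String, sequenceNr: Long)

  // Stand-in for an asynchronous scan of a single partition's key range.
  def scanPartition(table: Seq[Row], partition: Int, persistenceId: String)
                   (implicit ec: ExecutionContext): Future[Seq[Row]] =
    Future(table.filter(r => r.partition == partition && r.persistenceId == persistenceId))

  // Scan all partitions concurrently, then re-sequence the combined result.
  def replay(table: Seq[Row], persistenceId: String, partitionCount: Int)
            (implicit ec: ExecutionContext): Future[Seq[Long]] =
    Future
      .sequence((0 until partitionCount).map(p => scanPartition(table, p, persistenceId)))
      .map(_.flatten.map(_.sequenceNr).sorted)
}
```

The sort at the end is the "re-sequence" step: each partition returns its rows in order, but the partitions complete independently, so the merged stream has to be ordered by sequence number again.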
  • Optimising recovery: replay! (to seqNr = 2) for id = “a” over 001-a-0001, 001-a-0051; 002-a-0002, 002-a-0052; 003-a-0003, 003-a-0053; 049-a-0049, 049-a-0099…
  • Optimising recovery: replay! (to seqNr = 2): for short recovery = no need to check all servers!
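For a short replay, the set of partitions that can contain the requested sequence numbers is computable up front. A minimal sketch, assuming the modulo partitioning inferred from the slide's keys; `ShortRecovery` and `partitionsToScan` are hypothetical names, not the plugin's API.

```scala
object ShortRecovery {
  // Returns the partitions that can hold sequence numbers in [fromSeqNr, toSeqNr].
  def partitionsToScan(fromSeqNr: Long, toSeqNr: Long, partitionCount: Int): Set[Int] =
    if (toSeqNr < fromSeqNr) Set.empty
    // A range at least as long as the partition count touches every partition.
    else if (toSeqNr - fromSeqNr + 1 >= partitionCount) (0 until partitionCount).toSet
    else (fromSeqNr to toSeqNr).map(n => (n % partitionCount).toInt).toSet
}
```

Replaying up to sequence number 2 with 50 partitions needs only partitions 1 and 2, so the other 48 key ranges are never scanned.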
  • Akka Persistence Plugins: Journals / Snapshot Stores (http://akka.io/community/): Cassandra, HBase, Kafka, DynamoDB, MongoDB, shared LevelDB journal for testing
  • Links
     http://akka.io
     https://groups.google.com/forum/#!forum/akka-user
     https://github.com/ktoso/akka-persistence-hbase
     http://www.slideshare.net/alexbaranau/intro-to-hbase-internals-schema-design-for-hbase-users
     http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
     https://github.com/OpenTSDB/asynchbase
  • ©Typesafe 2014 – All Rights Reserved