Cassandra Community Webinar: Apache Cassandra Internals

Apache Cassandra solves many interesting problems to provide a scalable, distributed, fault-tolerant database. Cluster-wide operations track node membership, direct requests and implement consistency guarantees. At the node level, the log-structured storage engine provides high-performance reads and writes. All of this is implemented in a Java code base that has matured greatly over the past few years.

In this webinar Aaron Morton will step through read and write requests, automatic processes and manual maintenance tasks. He will also discuss the general approach to solving the problem and drill down to the code responsible for implementation.

Speaker: Aaron Morton, Apache Cassandra Committer
Aaron Morton is a Freelance Developer based in New Zealand, and a Committer on the Apache Cassandra project. In 2010 he gave up the RDBMS world for the scale and reliability of Cassandra. He now spends his time advancing the Cassandra project and helping others get the best out of it.

Published in: Technology

Transcript

  • 1. CASSANDRA COMMUNITY WEBINARS AUGUST 2013 CASSANDRA INTERNALS Aaron Morton @aaronmorton Co-Founder & Principal Consultant www.thelastpickle.com Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
  • 2. About The Last Pickle: Work with clients to deliver and improve Apache Cassandra based solutions. Apache Cassandra Committer, DataStax MVP, Hector Maintainer, 6+ years combined Cassandra experience. Based in New Zealand & Austin, TX.
  • 3. Architecture, Code
  • 4. Cassandra Architecture (diagram): APIs; Cluster Aware; Cluster Unaware; Clients; Disk
  • 5. Cassandra Cluster Architecture (diagram): Clients; Node 1 and Node 2, each with APIs, Cluster Aware, Cluster Unaware, Disk
  • 6. Dynamo Cluster Architecture (diagram): Clients; Node 1 and Node 2, each with APIs, Dynamo, Database, Disk
  • 7. Architecture: API; Dynamo; Database
  • 8. API Transports: Thrift; Native Binary
  • 9. Thrift Transport: // Custom TServer implementations; o.a.c.thrift.CustomTThreadPoolServer; o.a.c.thrift.CustomTNonBlockingServer; o.a.c.thrift.CustomTHsHaServer
  • 10. API Transports: Thrift; Native Binary
  • 11. Native Binary Transport: Beta in Cassandra 1.2; uses Netty; enabled with start_native_transport (disabled by default)
  • 12. o.a.c.transport.Server.run(): // Set up the Netty server; new ExecutionHandler(); new NioServerSocketChannelFactory(); ServerBootstrap.setPipelineFactory()
  • 13. o.a.c.transport.Message.Dispatcher.messageReceived(): // Process message from client; ServerConnection.validateNewMessage(); Request.execute(); ServerConnection.applyStateTransition(); Channel.write()
  • 14. Messages: defined in the Native Binary Protocol; $SRC/doc/native_protocol.spec
  • 15. API Services: JMX; Thrift; CQL 3
  • 16. JMX Management Beans: spread around the code base; interfaces named *MBean
  • 17. JMX Management Beans: registered with names such as org.apache.cassandra.db:type=StorageProxy
  • 18. API Services: JMX; Thrift; CQL 3
  • 19. o.a.c.thrift.CassandraServer: // Implements Thrift Interface; // Access control; // Input validation; // Mapping to/from Thrift and internal types
  • 20. Thrift Interface: Thrift IDL; $SRC/interface/cassandra.thrift
  • 21. o.a.c.thrift.CassandraServer.get_slice(): // get columns for one row; Tracing.begin(); ClientState cState = state(); cState.hasColumnFamilyAccess(); multigetSliceInternal()
  • 22. CassandraServer.multigetSliceInternal(): // get columns for many rows; ThriftValidation.validate*(); // Create ReadCommands; getSlice()
  • 23. CassandraServer.getSlice(): // Process ReadCommands; // return Thrift types; readColumnFamily(); thriftifyColumnFamily()
  • 24. CassandraServer.readColumnFamily(): // Process ReadCommands; // Return ColumnFamilies; StorageProxy.read()
  • 25. API Services: JMX; Thrift; CQL 3
  • 26. o.a.c.cql3.QueryProcessor: // Prepares and executes CQL3 statements; // Used by Thrift & Native transports; // Access control; // Input validation; // Returns transport.ResultMessage
  • 27. CQL3 Grammar: ANTLR Grammar; $SRC/o.a.c.cql3/Cql.g
  • 28. o.a.c.cql3.statements.ParsedStatement: // Subclasses generated by ANTLR; // Tracks bound term count; // Prepare CQLStatement; prepare()
  • 29. o.a.c.cql3.statements.CQLStatement: checkAccess(ClientState state); validate(ClientState state); execute(ConsistencyLevel cl, QueryState state, List<ByteBuffer> variables)
  • 30. statements.SelectStatement.RawStatement: // Implements ParsedStatement; // Input validation; prepare()
  • 31. statements.SelectStatement.execute(): // Create ReadCommands; StorageProxy.read()
  • 32. Architecture: API; Dynamo; Database
  • 33. Dynamo Layer: o.a.c.service; o.a.c.net; o.a.c.dht; o.a.c.gms; o.a.c.locator; o.a.c.stream
  • 34. o.a.c.service.StorageProxy: // Cluster wide storage operations; // Select endpoints & check CL available; // Send messages to Stages; // Wait for response; // Store Hints
  • 35. o.a.c.service.StorageService: // Ring operations; // Track ring state; // Start & stop ring membership; // Node & token queries
  • 36. o.a.c.service.IResponseResolver: preprocess(MessageIn<T> message); resolve() throws DigestMismatchException; RowDigestResolver; RowDataResolver; RangeSliceResponseResolver
  • 37. Response Handlers / Callback: implements IAsyncCallback<T>; response(MessageIn<T> msg)
  • 38. o.a.c.service.ReadCallback.get(): // Wait for blockfor & data; condition.await(timeout, TimeUnit.MILLISECONDS); throw ReadTimeoutException(); resolver.resolve()
  • 39. o.a.c.service.StorageProxy.fetchRows(): getLiveSortedEndpoints(); new RowDigestResolver(); new ReadCallback(); MessagingService.sendRR(). Then: ReadCallback.get() # blocking; catch (DigestMismatchException ex); catch (ReadTimeoutException ex)
  • 40. Dynamo Layer: o.a.c.service; o.a.c.net; o.a.c.dht; o.a.c.gms; o.a.c.locator; o.a.c.stream
  • 41. o.a.c.net.MessagingService.Verb <<enum>>: MUTATION; READ; REQUEST_RESPONSE; TREE_REQUEST; TREE_RESPONSE; (and more...)
  • 42. o.a.c.net.MessagingService.verbHandlers: new EnumMap<Verb, IVerbHandler>(Verb.class)
  • 43. o.a.c.net.IVerbHandler<T>: doVerb(MessageIn<T> message, String id);
  • 44. o.a.c.net.MessagingService.verbStages: new EnumMap<MessagingService.Verb, Stage>(MessagingService.Verb.class)
  • 45. o.a.c.net.MessagingService.receive(): runnable = new MessageDeliveryTask(message, id, timestamp); StageManager.getStage(message.getMessageType()); stage.execute(runnable);
  • 46. o.a.c.net.MessageDeliveryTask.run(): // If droppable and rpc_timeout; MessagingService.incrementDroppedMessages(verb); MessagingService.getVerbHandler(verb); verbHandler.doVerb(message, id)
  • 47. Architecture: API Layer; Dynamo Layer; Database Layer
  • 48. Database Layer: o.a.c.concurrent; o.a.c.db; o.a.c.cache; o.a.c.io; o.a.c.trace
  • 49. o.a.c.concurrent.StageManager: stages = new EnumMap<Stage, ThreadPoolExecutor>(Stage.class); getStage(Stage stage)
  • 50. o.a.c.concurrent.Stage: READ; MUTATION; GOSSIP; REQUEST_RESPONSE; ANTI_ENTROPY; (and more...)
  • 51. Database Layer: o.a.c.concurrent; o.a.c.db; o.a.c.cache; o.a.c.io; o.a.c.trace
  • 52. o.a.c.db.Table: // Keyspace; open(String table); getColumnFamilyStore(String cfName); getRow(QueryFilter filter); apply(RowMutation mutation, boolean writeCommitLog)
  • 53. o.a.c.db.ColumnFamilyStore: // Column Family; getColumnFamily(QueryFilter filter); getTopLevelColumns(...); apply(DecoratedKey key, ColumnFamily columnFamily, SecondaryIndexManager.Updater indexer)
  • 54. o.a.c.db.IColumnContainer: addColumn(IColumn column); remove(ByteBuffer columnName); ColumnFamily; SuperColumn
  • 55. o.a.c.db.ISortedColumns: addColumn(IColumn column, Allocator allocator); removeColumn(ByteBuffer name); ArrayBackedSortedColumns; AtomicSortedColumns; TreeMapBackedSortedColumns
  • 56. o.a.c.db.Memtable: put(DecoratedKey key, ColumnFamily columnFamily, SecondaryIndexManager.Updater indexer); flushAndSignal(CountDownLatch latch, Future<ReplayPosition> context)
  • 57. o.a.c.db.ReadCommand: getRow(Table table); SliceByNamesReadCommand; SliceFromReadCommand
  • 58. o.a.c.db.IDiskAtomFilter: getMemtableColumnIterator(...); getSSTableColumnIterator(...); IdentityQueryFilter; NamesQueryFilter; SliceQueryFilter
  • 59. Summary: CustomTThreadPoolServer; Message.Dispatcher; CassandraServer; QueryProcessor; ReadCommand; StorageProxy; IResponseResolver; IAsyncCallback; MessagingService; IVerbHandler; Table; ColumnFamilyStore; IDiskAtomFilter (layers: API, Dynamo, Database)
  • 60. Thanks.
  • 61. Aaron Morton @aaronmorton Co-Founder & Principal Consultant www.thelastpickle.com Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
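
Slides 41 to 46 and 49 to 50 describe how an incoming message is routed: each verb maps to a handler and to a stage, and the stage's thread pool runs the handler. The following is a minimal sketch of that pattern only, not Cassandra's code. Every type here (`VerbDispatchSketch`, the trimmed `Verb` and `Stage` enums, the `VerbHandler` interface, string messages, a pool size of 2) is a hypothetical stand-in chosen for illustration, and the dropped-message check from slide 46 is left out.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these types are Cassandra's own classes.
class VerbDispatchSketch {
    enum Verb { MUTATION, READ, REQUEST_RESPONSE }
    enum Stage { MUTATION, READ, REQUEST_RESPONSE }

    // Stand-in for the doVerb(message, id) method shown on slide 43.
    interface VerbHandler {
        String doVerb(String message, String id);
    }

    private final Map<Verb, VerbHandler> verbHandlers = new EnumMap<>(Verb.class);
    private final Map<Verb, Stage> verbStages = new EnumMap<>(Verb.class);
    private final Map<Stage, ExecutorService> stages = new EnumMap<>(Stage.class);

    VerbDispatchSketch() {
        // One small thread pool per stage (the pool size is an arbitrary choice).
        for (Stage stage : Stage.values()) {
            stages.put(stage, Executors.newFixedThreadPool(2));
        }
        verbStages.put(Verb.MUTATION, Stage.MUTATION);
        verbStages.put(Verb.READ, Stage.READ);
        verbStages.put(Verb.REQUEST_RESPONSE, Stage.REQUEST_RESPONSE);
    }

    void registerVerbHandler(Verb verb, VerbHandler handler) {
        verbHandlers.put(verb, handler);
    }

    // Look up the handler and the stage for the verb, then run the handler on that stage.
    Future<String> receive(Verb verb, String message, String id) {
        VerbHandler handler = verbHandlers.get(verb);
        if (handler == null) {
            throw new IllegalStateException("No handler registered for " + verb);
        }
        Callable<String> task = () -> handler.doVerb(message, id);
        return stages.get(verbStages.get(verb)).submit(task);
    }

    void shutdown() {
        for (ExecutorService stage : stages.values()) {
            stage.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        VerbDispatchSketch service = new VerbDispatchSketch();
        service.registerVerbHandler(Verb.READ,
                (message, id) -> "read " + message + " (id " + id + ")");
        Future<String> reply = service.receive(Verb.READ, "row-key-1", "42");
        System.out.println(reply.get(1, TimeUnit.SECONDS));
        service.shutdown();
    }
}
```

Keeping the verb-to-stage mapping separate from the verb-to-handler mapping means the thread pool that runs a message type can be changed without touching the handler itself.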
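
Slides 37 to 39 outline the read path on the coordinator: a callback collects replica responses, `get()` blocks until enough have arrived or the timeout expires, and a resolver then checks that the replicas agree. A rough sketch of that wait-then-resolve shape follows. It is an illustration under assumptions: `ReadCallbackSketch`, `ReadTimeout` and `DigestMismatch` are hypothetical names, responses are plain strings, and the resolver compares whole values rather than digests.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the block-for-N-responses pattern; not Cassandra's ReadCallback.
class ReadCallbackSketch {
    static class ReadTimeout extends RuntimeException {
        ReadTimeout(String message) { super(message); }
    }

    static class DigestMismatch extends RuntimeException {
        DigestMismatch(String message) { super(message); }
    }

    private final CountDownLatch condition;
    private final List<String> responses = new CopyOnWriteArrayList<>();

    ReadCallbackSketch(int blockFor) {
        if (blockFor < 1) {
            throw new IllegalArgumentException("blockFor must be at least 1");
        }
        this.condition = new CountDownLatch(blockFor);
    }

    // Called once per replica response, possibly from different threads.
    void response(String replicaValue) {
        responses.add(replicaValue);
        condition.countDown();
    }

    // Block until blockFor responses have arrived, then resolve them.
    String get(long timeoutMillis) {
        boolean signalled;
        try {
            signalled = condition.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new ReadTimeout("interrupted while waiting for replicas");
        }
        if (!signalled) {
            throw new ReadTimeout("not enough replica responses within " + timeoutMillis + " ms");
        }
        return resolve();
    }

    // Simplified resolver: every response must match the first one.
    private String resolve() {
        String first = responses.get(0);
        for (String response : responses) {
            if (!first.equals(response)) {
                throw new DigestMismatch("replicas returned different values");
            }
        }
        return first;
    }

    public static void main(String[] args) {
        ReadCallbackSketch callback = new ReadCallbackSketch(2);
        callback.response("v1");
        callback.response("v1");
        System.out.println(callback.get(100));
    }
}
```

The caller is expected to catch the two exceptions separately, as slide 39 shows for `fetchRows()`: a mismatch and a timeout call for different follow-up actions.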
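
Slides 55 and 58 name sorted column containers and two kinds of column filter, one by name and one by slice. As a loose illustration of the difference between the two, here is a sketch over a `TreeMap`. It assumes string column names and values, and the class and method names (`ColumnFilterSketch`, `byNames`, `bySlice`) are invented for this example; the real filters work through iterators over memtables and SSTables, as the method names on slide 58 suggest.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch: contrasts a by-name lookup with a range slice over sorted columns.
class ColumnFilterSketch {
    // Names-style filter: return only the requested columns that exist, in sorted order.
    static Map<String, String> byNames(NavigableMap<String, String> columns, List<String> names) {
        Map<String, String> result = new TreeMap<>();
        for (String name : names) {
            String value = columns.get(name);
            if (value != null) {
                result.put(name, value);
            }
        }
        return result;
    }

    // Slice-style filter: up to count columns between start and finish (inclusive).
    // start must not sort after finish, otherwise subMap throws IllegalArgumentException.
    static Map<String, String> bySlice(NavigableMap<String, String> columns,
                                       String start, String finish, int count) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> column : columns.subMap(start, true, finish, true).entrySet()) {
            if (result.size() >= count) {
                break;
            }
            result.put(column.getKey(), column.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> columns = new TreeMap<>();
        for (String name : Arrays.asList("a", "b", "c", "d", "e")) {
            columns.put(name, "value-" + name);
        }
        System.out.println(byNames(columns, Arrays.asList("e", "a", "zz")).keySet());
        System.out.println(bySlice(columns, "b", "d", 2).keySet());
    }
}
```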