1. Cozy with Cassandra: Getting to Know the Cassandra Codebase. Gary Dusbabek • Rackspace • @gdusbabek. Cassandra Summit • Mission Bay Conference Center • San Francisco • 10 August 2010
8. CassandraDaemon: Loads configuration. Transport initialization. Storage (keyspace initialization). CommitLog recovery. StorageService.initServer(). Initializes CassandraServer and passes it off to the transport.
9. CassandraServer: Implements the IDL interface methods (cassandra.thrift, cassandra.genavro). A good place to start diving in when adding or troubleshooting API methods.
10. Configuration: DatabaseDescriptor, via CassandraDaemon.setup(). Looks for the config path and loads the YAML. Doesn't spin anything up. Defines system tables. KS and CF are described by KSMetaData and CFMetaData, respectively.
12. Main Controllers: Names end with *Service or *Manager. StorageService, MessagingService. CompactionManager, HintedHandoffManager, StageManager, StreamInManager.
13. StorageProxy: Put and get methods. A collection of static methods. Merges local and distributed operations. Tracks latency. Exposed via StorageProxyMBean.
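The "collection of static methods" that "tracks latency" can be pictured with a small sketch. This is not Cassandra's StorageProxy: the class SketchProxy and its methods timedRead, getReadOperations and getMeanReadLatencyMicros are hypothetical names chosen for illustration, and JMX registration is left out.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical stand-in for a static-method proxy that times each read.
final class SketchProxy {
    private static final AtomicLong readOps = new AtomicLong();
    private static final AtomicLong readNanos = new AtomicLong();

    private SketchProxy() {} // static methods only, never instantiated

    // Runs the read and records how long it took, even if it throws.
    static <T> T timedRead(Supplier<T> read) {
        long start = System.nanoTime();
        try {
            return read.get();
        } finally {
            readNanos.addAndGet(System.nanoTime() - start);
            readOps.incrementAndGet();
        }
    }

    // The kind of counters an MBean interface could expose.
    static long getReadOperations() {
        return readOps.get();
    }

    static double getMeanReadLatencyMicros() {
        long ops = readOps.get();
        return ops == 0 ? 0.0 : (readNanos.get() / 1000.0) / ops;
    }
}
```

A caller would wrap each operation, for example `SketchProxy.timedRead(() -> lookup(key))`, where `lookup` is whatever local or remote read applies.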
14. StorageService: initServer() starts services. Registers verb handlers (in MessagingService). Main event responders. Repository of replication strategies and TokenMetadata. Ring topology and token information.
15. MessagingService: Verb handlers reside here. Sets up socket listeners. Gateway for outbound messages: MS.sendRR(), MS.sendOneWay(). Inbound too: MS.receive().
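The idea of verb handlers living in the messaging layer can be sketched as a registry that maps a verb to a handler and dispatches inbound messages to it. Everything below is a simplified illustration under that assumption: SketchMessage, SketchVerbHandler and SketchMessagingService are hypothetical names, not classes from the Cassandra source, and no sockets are involved.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical message: a verb name plus an opaque payload.
final class SketchMessage {
    final String verb;
    final byte[] body;

    SketchMessage(String verb, byte[] body) {
        this.verb = verb;
        this.body = body;
    }
}

// Hypothetical handler contract: one handler per verb.
interface SketchVerbHandler {
    void doVerb(SketchMessage message);
}

// Hypothetical registry and inbound dispatcher.
final class SketchMessagingService {
    private final Map<String, SketchVerbHandler> handlers = new ConcurrentHashMap<>();

    void registerVerbHandler(String verb, SketchVerbHandler handler) {
        handlers.put(verb, handler);
    }

    // Dispatches an inbound message; returns false when no handler is registered.
    boolean receive(SketchMessage message) {
        SketchVerbHandler handler = handlers.get(message.verb);
        if (handler == null) {
            return false;
        }
        handler.doVerb(message);
        return true;
    }
}
```

In this sketch a service would register its handlers once at startup, and every inbound message is then routed purely by its verb.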
16. Table & ColumnFamilyStore: Also RowMutation. Low-level storage operations (o.a.c.db.*). SSTable. Local operations.
28. Threads: MessagingService.listen() spawns a thread. Each incoming connection spawns a new short-lived thread (IncomingTcpConnection). Non-stream ops go to MS.messageDeserializerExecutor_. Stream ops are handled there. Anti-entropy repair.
30. <=0.6 Bootstrapping: A wants data, B has data. StreamingRequestMessage A->B, handled on B by StreamRequestVerbHandler. For each range: StreamOut.transferRanges() (flush, anticompaction), then StreamInitiateMessage B->A for each range transfer. Meanwhile, back on A: StreamInitiateVerbHandler gets the SIM from B, does some nesting. StreamInitiateDone A->B. Back to B: StreamInitiateDoneHandler gets the SID from A and calls StreamOutManager.startNext(), which sends a single file to A. MessagingService on A picks this up and the file is streamed. The SSTable is created. STREAM_FINISHED A->B. B gets rid of the file and calls SOM.startNext().
31. 0.7 Bootstrapping: A wants data, B has data. StreamRequestMessage A->B. On B, StreamRequestVerbHandler: if a single file, sends it; if a range, StreamOut.transferRangesForRequest(). Send next file (the first will contain metadata about all files). On A, IncomingStreamReader.read(): data is received, SSTable created. Ack, request next file.
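The 0.7 flow reduces to a pull loop: the source holds a queue of files, and the destination receives one, acknowledges it, and asks for the next until nothing is left. The sketch below models only that loop in memory. SketchStreamSource and SketchStreamDestination are hypothetical names, file names stand in for file contents, and the real network transfer and SSTable creation are not modeled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical source side (node B): a queue of files waiting to be sent.
final class SketchStreamSource {
    private final Deque<String> pending;

    SketchStreamSource(List<String> files) {
        this.pending = new ArrayDeque<>(files);
    }

    // Names of all files still pending; stands in for the metadata
    // that accompanies the first file.
    List<String> manifest() {
        return new ArrayList<>(pending);
    }

    // Hands over the next file, or null when everything has been sent.
    String sendNext() {
        return pending.poll();
    }
}

// Hypothetical destination side (node A): receives files one at a time.
final class SketchStreamDestination {
    final List<String> received = new ArrayList<>();

    void fetchAll(SketchStreamSource source) {
        String file;
        // Each loop iteration plays the role of "ack, request next file".
        while ((file = source.sendNext()) != null) {
            received.add(file); // stand-in for reading data and creating an SSTable
        }
    }
}
```

Compared with the <=0.6 exchange, the point of the sketch is that one request/ack cycle per file replaces the separate initiate, initiate-done and finished messages.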
32. Tests: Testable & untestable. Unit tests: ant clean build test. System tests: ant gen-thrift-py, then nosetests test/system/test_thrift_server.py.
33. IDE: Configuration file must be in the classpath. Treat as source. lib vs build/lib. Log at debug.
35. Adding API methods: Same goes for modifying. Define the method and structures in the IDL (interface/cassandra.thrift). Regenerate files: ant gen-thrift-java gen-thrift-py. Implement methods in o.a.c.thrift.CassandraServer. Create a system test (test/system/test_thrift_server.py).
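The "define, regenerate, implement" sequence means the server class has to implement whatever new method appears on the generated interface. The sketch below uses hand-written stand-ins so it compiles on its own: SketchCassandraIface plays the role of the Thrift-generated interface, SketchCassandraServer plays the role of o.a.c.thrift.CassandraServer, and describe_greeting is an invented method that exists in no version of cassandra.thrift.

```java
// Hypothetical stand-in for the interface Thrift would generate from the IDL.
interface SketchCassandraIface {
    // Imagined IDL declaration: string describe_greeting(1: string name)
    String describe_greeting(String name);
}

// Hypothetical stand-in for the server class that implements the API.
final class SketchCassandraServer implements SketchCassandraIface {
    @Override
    public String describe_greeting(String name) {
        // Validate input at the API boundary before doing any work.
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("name must not be empty");
        }
        return "hello " + name;
    }
}
```

A system test for such a method would then call it through a client and assert on the returned value, in the same spirit as the existing Thrift system tests.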