C* Summit EU 2013: The State of CQL


Speaker: Sylvain Lebresne, Software Engineer at DataStax
Video: http://www.youtube.com/watch?v=4GSfAS4nFAs&list=PLqcm6qE9lgKLoYaakl3YwIWP4hmGsHm5e&index=18
Since its inception, the Cassandra Query Language (CQL) has grown and matured, resulting in the 3rd version of the language (CQL3) being finalized in Cassandra 1.2 and further improved in Cassandra 2.0. Compared to the legacy Thrift API, CQL3 aims to provide an API that is higher level and more user friendly, but that still fully embraces the distributed nature of Cassandra and its storage engine. This talk will present CQL3, describing the reasoning and goals behind the language as well as the language itself. We will also touch on CQL's relationship with Thrift and will present the CQL binary protocol that was introduced in Cassandra 1.2. We will wrap up by discussing the future of CQL.

Published in: Technology, Education

  1. The State of CQL. Sylvain Lebresne (DataStax)
  2. A short CQL primer · New in Cassandra 2.0 · Native protocol · What's next?
  3. A better API for Cassandra
     Thrift is not satisfactory:
       · Not user friendly, hard to use.
       · Low level, very little abstraction.
       · Hard to evolve (in a backward compatible way).
       · Unreadable without driver abstraction.
     Cassandra has often been regarded as hard to develop against. It doesn't have to be that way!
  4. Quick historical notes
       · CQL1 was first introduced in Cassandra 0.8 and became CQL2 in Cassandra 1.0.
       · "These aren't the CQL you are looking for."
       · CQL3 (hereafter simply CQL) was introduced in Cassandra 1.2.
       · Semantically, CQL1/CQL2 are closer to the Thrift API than to CQL3.
       · CQL3 is the version that's here to stay: no plan for a CQL4 any time soon.
  5. A short CQL primer
  6. The Cassandra Query Language
       · Syntactically, a subset of SQL (with a few extensions):
           CREATE TABLE users (
               user_id uuid,
               name text,
               password text,
               email text,
               picture_profile blob,
               PRIMARY KEY (user_id)
           )
       · INSERT and UPDATE are both upserts
       · No joins, no sub-queries, no aggregation, ...
       · Denormalization is the norm: do the work at write time, not read time
  7. Denormalization: Cassandra modeling 101
     Efficient queries in Cassandra are based on 2 principles:
       · the data queried is collocated on one replica set
       · the data queried is collocated on disk on those replicas
     Denormalization is the technique that achieves this in practice. But this means CQL exposes:
       · how to collocate data on the same replica set
       · how to collocate data on disk (for a given replica)
  8. This is done in CQL through the primary key
         CREATE TABLE inboxes (
             user_id uuid,
             email_id timeuuid,
             sender text,
             recipients set<text>,
             subject text,
             is_read boolean,
             PRIMARY KEY (user_id, email_id)
         )
     CQL distinguishes 2 sub-parts in the PRIMARY KEY:
       · partition key: decides the node on which the data is stored
       · clustering columns: within the same partition key, (CQL3) rows are physically ordered following the clustering columns
     This is important, because CQL only allows queries for which an explicit index exists:
         -- Get last 50 emails in user 51b-23-ab8 inbox
         SELECT * FROM inboxes WHERE user_id=51b-23-ab8 ORDER BY email_id DESC LIMIT 50;
  9. CQL main features
       · Collections (set, map and list)
       · Secondary indexes
       · Convenience functions (timeuuid, type conversions, ...)
       · ...
     For more details:
       · http://cassandra.apache.org/doc/cql3/CQL.html
       · http://www.datastax.com/documentation/cql/3.1/webhelp/index.html
  10. New in Cassandra 2.0
  11. New in Cassandra 2.0
      Lightweight transactions:
          INSERT INTO test (id, name) VALUES (42, 'Tom') IF NOT EXISTS;
          UPDATE test SET password='newpass' WHERE id=42 IF password='oldpass';
      Triggers:
          CREATE TRIGGER myTrigger ON test USING 'my.trigger.Class';
      ALTER DROP:
          CREATE TABLE test (k int PRIMARY KEY, prop1 int, prop2 text, prop3 float);
          ALTER TABLE test DROP prop3;
      Preparing TIMESTAMP, TTL and LIMIT:
          SELECT * FROM myTable LIMIT ?;
          UPDATE myTable USING TTL ? SET v = 2 WHERE k = 'foo';
  12. New in Cassandra 2.0
      Conditional DDL:
          CREATE TABLE IF NOT EXISTS test (k int PRIMARY KEY);
          DROP KEYSPACE IF EXISTS ks;
      Secondary indexes everywhere (almost):
          CREATE TABLE timeline (
              event_id uuid,
              created_at timeuuid,
              content blob,
              PRIMARY KEY (event_id, created_at)
          );
          CREATE INDEX ON timeline (created_at);
      SELECT aliases:
          SELECT event_id, dateOf(created_at) AS creation_date FROM timeline;
  13. Coming in Cassandra 2.0.2
      Named bind variables:
          SELECT * FROM timeline WHERE created_at > :tlow AND created_at <= :thigh AND key = :k;
      Prepared IN:
          SELECT * FROM users WHERE user_id IN ?;
      Limited SELECT DISTINCT:
          CREATE TABLE test (
              event_id int,
              created_at timestamp,
              content blob,
              PRIMARY KEY (event_id, created_at)
          );
          SELECT DISTINCT event_id FROM test;
  14. The native protocol: a binary transport protocol for CQL
  15. Native protocol
        · Binary transport protocol for CQL
        · Query execution, prepared statements, authentication, compression, ...
        · Asynchronous (allows multiple concurrent queries per connection)
        · Server notifications (only generic cluster events currently)
        · Existing drivers for Java, C#, Python, C++, Golang, ...
      Example usage of the Java driver (https://github.com/datastax/java-driver):
          Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
          Session session = cluster.connect("myKeyspace");
          for (Row row : session.execute("SELECT * FROM myTable"))
              // Do something...
  16. New in Cassandra 2.0: native protocol 2
      Cursors:
          for (Row row : session.execute("SELECT * FROM myTable"))
              // Do something...
      Batching prepared statements:
          PreparedStatement ps = session.prepare("INSERT INTO myTable (p1, p2) VALUES (?, ?)");
          BatchStatement bs = new BatchStatement();
          bs.add(ps.bind(0, "v1"));
          bs.add(ps.bind(1, "v2"));
          bs.add(ps.bind(2, "v3"));
          session.execute(bs);
      One-shot prepare and execute:
          session.execute("INSERT INTO users (id, photo) VALUES (?, ?)", someId, photoBytes);
      SASL for authentication
  17. What's next? Cassandra 2.1 and beyond
  18. CQL: some ideas
        · Storage engine optimizations for CQL
        · Secondary index for collections
        · Server side functions
        · User defined types
        · ...
  19. User defined types
          CREATE TYPE address (
              street text,
              zip_code int,
              state text,
              phones set<text>
          );
          CREATE TABLE users (
              id uuid PRIMARY KEY,
              name text,
              addresses map<text, address>
          );
          INSERT INTO users (id, name) VALUES (234-4a-761, "Sylvain Lebresne");
          UPDATE users SET addresses["work"] = {
              street: '777 Mariners Island Blvd #510',
              zip_code: 94404,
              state: 'CA',
              phones: {650-389-6000}
          } WHERE id = 234-4a-761;
  20. Thank You! (Questions?)
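
As a rough illustration of the primary key slide (slide 8), the following Java sketch models a partition key and a clustering column with plain in-memory maps. It is a toy analogy, not Cassandra or driver code: the class `InboxModel` and its methods `upsert` and `lastEmails` are hypothetical names, a `long` stands in for the `timeuuid` clustering column, and a `HashMap` lookup stands in for routing a partition key to a node. It also shows the upsert behaviour from slide 6, where writing the same primary key twice overwrites the row.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only (an assumption for illustration, not how Cassandra is implemented).
final class InboxModel {
    // Partition key (user_id) -> rows kept sorted by clustering column (email_id).
    private final Map<String, TreeMap<Long, String>> partitions = new HashMap<>();

    // INSERT and UPDATE are both upserts: the same primary key overwrites the row.
    void upsert(String userId, long emailId, String subject) {
        partitions.computeIfAbsent(userId, k -> new TreeMap<>()).put(emailId, subject);
    }

    // Mirrors: SELECT ... WHERE user_id = ? ORDER BY email_id DESC LIMIT ?
    // Only one partition is touched, and rows are read in their stored order.
    List<String> lastEmails(String userId, int limit) {
        List<String> result = new ArrayList<>();
        TreeMap<Long, String> rows = partitions.get(userId);
        if (rows == null) {
            return result; // unknown partition: nothing to return
        }
        for (String subject : rows.descendingMap().values()) {
            if (result.size() == limit) {
                break;
            }
            result.add(subject);
        }
        return result;
    }

    public static void main(String[] args) {
        InboxModel inbox = new InboxModel();
        inbox.upsert("51b-23-ab8", 1L, "first");
        inbox.upsert("51b-23-ab8", 2L, "second");
        inbox.upsert("51b-23-ab8", 3L, "third");
        inbox.upsert("51b-23-ab8", 2L, "second (edited)"); // upsert overwrites
        System.out.println(inbox.lastEmails("51b-23-ab8", 2));
    }
}
```

The point of the sketch is that the "last N emails for one user" query is cheap because it reads one partition in its existing sort order, which is the kind of query the slides say CQL allows.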
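
The lightweight transactions on slide 11 (`IF NOT EXISTS` and `IF password='oldpass'`) behave like conditional, compare-and-set style writes. The Java sketch below shows that semantics in a single process using `ConcurrentHashMap`. It is an analogy under stated assumptions: `LwtModel`, `insertIfNotExists` and `updateIf` are hypothetical names, and the sketch makes no attempt to reproduce how Cassandra coordinates such writes across replicas.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Single-process analogy for conditional writes (illustrative assumption only).
final class LwtModel {
    // id -> password, standing in for the "test" table on the slide.
    private final ConcurrentMap<Integer, String> passwords = new ConcurrentHashMap<>();

    // Mirrors: INSERT ... IF NOT EXISTS
    // Returns true only if no row existed for this id.
    boolean insertIfNotExists(int id, String password) {
        return passwords.putIfAbsent(id, password) == null;
    }

    // Mirrors: UPDATE ... SET password = <replacement> WHERE id = ? IF password = <expected>
    // Returns true only if the current value matched the expected one.
    boolean updateIf(int id, String expected, String replacement) {
        return passwords.replace(id, expected, replacement);
    }

    String get(int id) {
        return passwords.get(id);
    }

    public static void main(String[] args) {
        LwtModel table = new LwtModel();
        System.out.println(table.insertIfNotExists(42, "oldpass")); // row is created
        System.out.println(table.insertIfNotExists(42, "other"));   // row already exists
        System.out.println(table.updateIf(42, "oldpass", "newpass"));
        System.out.println(table.get(42));
    }
}
```

In both calls the caller learns whether the condition held, which is what distinguishes these statements from the unconditional upserts of plain INSERT and UPDATE.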