Data modeling is one of the most important steps in ensuring the performance and scalability of Cassandra-powered applications. The existing Chebotko data modeling methodology lays out important principles, rules, and patterns for designing conceptual, logical, and physical data models. While this approach enables rigorous and sound schema design, it requires specialized training and experience. To simplify and streamline the Cassandra database design process and dramatically reduce the time it takes, we have developed an online tool that automates the most complex, error-prone, and time-consuming data modeling tasks: conceptual-to-logical mapping, logical-to-physical mapping, and CQL generation.
In this talk, using real-life examples from the IoT domain, we demonstrate how to design correct and efficient database schemas for Cassandra. First, we use our tool, called KDM, to design a conceptual data model and specify application access patterns. Second, we demonstrate how KDM generates a logical data model that is visualized using the Chebotko diagram notation. Third, we explain how to configure a logical data model and automatically generate a physical data model. Fourth, we showcase how KDM generates a CQL script for instantiating a physical data model in Cassandra. Finally, we discuss best practices for Cassandra data modeling with KDM.
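As a rough illustration of the last step in this workflow, the following Python sketch renders a CQL CREATE TABLE statement from a simple table description. It is not KDM's implementation: the Column and LogicalTable classes, the to_cql function, and the readings_by_sensor IoT table are all hypothetical names assumed for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    cql_type: str


@dataclass
class LogicalTable:
    """Hypothetical description of one query-driven table (not a KDM type)."""
    name: str
    columns: list[Column]
    partition_key: list[str]
    # Each clustering entry is (column name, "ASC" or "DESC").
    clustering_key: list[tuple[str, str]] = field(default_factory=list)


def to_cql(table: LogicalTable) -> str:
    """Render a CREATE TABLE statement for the given table description."""
    lines = [f"  {c.name} {c.cql_type}," for c in table.columns]
    # The partition key is wrapped in its own parentheses inside PRIMARY KEY.
    key_parts = ["(" + ", ".join(table.partition_key) + ")"]
    key_parts += [name for name, _ in table.clustering_key]
    lines.append(f"  PRIMARY KEY ({', '.join(key_parts)})")
    stmt = f"CREATE TABLE {table.name} (\n" + "\n".join(lines) + "\n)"
    if table.clustering_key:
        order = ", ".join(f"{name} {direction}"
                          for name, direction in table.clustering_key)
        stmt += f"\nWITH CLUSTERING ORDER BY ({order})"
    return stmt + ";"


# Assumed access pattern: fetch a sensor's readings, newest first.
example = LogicalTable(
    name="readings_by_sensor",
    columns=[
        Column("sensor_id", "uuid"),
        Column("reading_time", "timestamp"),
        Column("temperature", "double"),
    ],
    partition_key=["sensor_id"],
    clustering_key=[("reading_time", "DESC")],
)

if __name__ == "__main__":
    print(to_cql(example))
```

The table is named after the query it serves, with the partition key chosen to match the query's equality condition and the clustering column chosen to match its ordering. This reflects the general query-driven style of Cassandra modeling, simplified here to a single table.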
The KDM tool is available for free at kdm.dataview.org and is used by many in industry and academia.
Andrey Kashlev - Wayne State University
Andrey Kashlev is a PhD candidate in the Department of Computer Science at Wayne State University. His research focuses on big data, including data modeling for NoSQL, big data workflows, and provenance management. He has published numerous research articles in peer-reviewed international journals and conferences, including IEEE Transactions on Services Computing, Data and Knowledge Engineering, International Journal of Computers and Their Applications, and the IEEE International Congress on Big Data.