A Big Data Modeling Methodology for Apache Cassandra

Abstract
Apache Cassandra is a leading distributed database of choice when it comes to big data management with zero downtime, linear scalability, and seamless multiple data center deployment. With increasingly wider adoption of Cassandra for online transaction processing by hundreds of Web-scale companies, there is a growing need for a rigorous and practical data modeling approach that ensures sound and efficient schema design. This work i) proposes the first query-driven big data modeling methodology for Apache Cassandra, ii) defines important data modeling principles, mapping rules, and mapping patterns to guide logical data modeling, iii) presents visual diagrams for Cassandra logical and physical data models, and iv) demonstrates a data modeling tool that automates the entire data modeling process.

This publication has 7 references indexed in Scilit: