Model-Driven Cloud Data Storage
Juan Castrejón, Genoveva Vargas-Solar, Christine Collet, Rafael Lozano
Université de Grenoble, CNRS, Grenoble INP, Tecnológico de Monterrey
CloudMDE 2012
2 Background
• Cloud computing (NIST-2011)
  • Utility computing model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable resources
• Cloud data storage (Ruiz-2011, Armbrust-2009)
  • Store, retrieve and manage large amounts of data, using highly scalable distributed infrastructures
• Polyglot persistence (Fowler-2011)
  • Different data storage technologies for different kinds of data
  • Each storage mechanism introduces a new interface to be learned
  • To get decent performance, you have to understand a lot about how the technology works
3 Background
• Variety of data storage models and implementations (Cattell-2011, Edlich-2012)
  • Models: key-value, document, extensible record, graph, blob, object, queue, XML, relational
  • Implementations: Redis, Voldemort, MongoDB, CouchDB, Cassandra, Neo4J, db4o, eXist-db, etc. (as of today, over 120 options)
• Cloud deployment environments (Ruiz-2011)
  • Different combinations of pricing, support, service level agreements, and management APIs
  • Public providers (Amazon, Windows Azure, Xeround, etc.)
  • Private providers (Eucalyptus, OpenNebula, etc.)
4 Use the right tool for the right job…
How do I know which is the right tool for the right job? (Katsov-2012)
5 Problem
• How to specify data requirements for cloud environments?
• For a set of data requirements, how to choose an appropriate combination of cloud storage system implementation and deployment provider?
• How to generate and manage everything that is required to work with the selection that I make?
6 Existing solutions
• Integration of cloud storage platforms (Livenson-2011)
  • Cloud Data Management Interface (CDMI) (SNIA-2011) proxy to integrate blob and queue data stores
• Data integration over NoSQL stores (Curé-2011)
  • Integration of relational and NoSQL databases (document, column)
  • Focus on efficient answering of queries
• Storage provider selection (Ruiz-2011, Ruiz-2012)
  • Characterize storage provider features (e.g. performance, cost)
  • Specify requirements for application datasets (e.g. expected size, access latency, concurrent clients)
  • Based on the previous information, an assignment of datasets to different storage systems is proposed
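The storage provider selection idea on this slide can be illustrated with a small sketch. It is not the procedure of Ruiz-2011 or Ruiz-2012; it assumes a simple greedy rule (pick the cheapest provider that meets the latency and concurrency requirements), and every name and figure in it (`Provider`, `Dataset`, `assign`, the sample values) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    """Characterized features of a storage provider (illustrative fields)."""
    name: str
    latency_ms: float    # typical access latency
    cost_per_gb: float   # storage cost per GB per month
    max_clients: int     # supported concurrent clients


@dataclass(frozen=True)
class Dataset:
    """Requirements of an application dataset (illustrative fields)."""
    name: str
    size_gb: float
    max_latency_ms: float
    concurrent_clients: int


def assign(datasets, providers):
    """Map each dataset to the cheapest provider that satisfies it, or None."""
    plan = {}
    for ds in datasets:
        feasible = [
            p for p in providers
            if p.latency_ms <= ds.max_latency_ms
            and p.max_clients >= ds.concurrent_clients
        ]
        if feasible:
            cheapest = min(feasible, key=lambda p: p.cost_per_gb * ds.size_gb)
            plan[ds.name] = cheapest.name
        else:
            plan[ds.name] = None  # no provider meets the requirements
    return plan


# Made-up sample data, for illustration only.
PROVIDERS = [
    Provider("blob-store", latency_ms=120.0, cost_per_gb=0.03, max_clients=1000),
    Provider("kv-store", latency_ms=5.0, cost_per_gb=0.25, max_clients=5000),
]
DATASETS = [
    Dataset("session-cache", size_gb=2.0, max_latency_ms=10.0, concurrent_clients=2000),
    Dataset("image-archive", size_gb=500.0, max_latency_ms=500.0, concurrent_clients=50),
    Dataset("realtime-feed", size_gb=1.0, max_latency_ms=1.0, concurrent_clients=10),
]

if __name__ == "__main__":
    print(assign(DATASETS, PROVIDERS))
```

A dataset mapped to `None` signals that its requirements cannot be met by any characterized provider, which is the point where requirements would need to be relaxed or more providers characterized.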
7 Existing solutions
• Modeling as a Service (Bruneliere-2010)
  • Deploy and execute model-driven services over the Internet (SaaS)
• Design and deploy applications in the cloud (Peidro-2011)
  • Promotes graphical models to capture cloud requirements
  • Models automatically deployed to PaaS and IaaS environments
• Application design/execution in multiple clouds (Ardagna-2012)
  • MDE quality-driven method for design, development and operation
  • Monitoring and feedback system
8 Limitations of existing solutions
• Support for a limited set of cloud storage interfaces
• Data integration can rely heavily on the relational model
• Limited information for the selection of data storage systems
• Consideration for high-level cloud models (SaaS) but limited support for low-level models (PaaS and IaaS)
9 Objectives
1. Provide adequate notations and environments to characterize cloud data storage requirements
2. Select cloud data storage implementations and deployment providers
3. Manage the artifacts required to work with different combinations of cloud storage implementations and providers
10 Objectives
Figure: three layers, from high to low level of abstraction.
• Cloud requirements: conceptual models (high level of abstraction: conceptual models and environments)
• Selection process: logical models
• Artifacts management: physical models (low level of abstraction: storage implementations and providers)
11 Proposed solution
• Rely on Model-Driven Engineering (MDE) (Kent-2002) to:
  • Characterize cloud storage requirements
  • Encapsulate selection, administration and use of cloud data storage implementations
• Why MDE?
  • Avoid dependencies between high-level abstractions (data models) and low-level abstractions (storage implementations and providers)
  • Emphasis on relying on different levels of modeling notations
  • Generation of low-level abstractions by using automatic transformation procedures
12 Objective 1: Data requirements for the cloud
• Do traditional modeling notations (ER and UML diagrams) make sense for data storage in the cloud?
  • Define or extend notations and environments for cloud data modeling
• What requirements should a cloud data storage notation consider?
  • Rely on quality standards (ISO/IEC SQuaRE, S-Cube) to guide this analysis. Examples: performance, efficiency, portability, etc.
• How to characterize the proposed requirements?
  • Associate quality metrics relevant to (cloud) scenarios, based on the characteristics of the reference standard (Jureta-2010)
  • Validate currently proposed metrics. Examples: throughput, cost, access latency, etc.
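One possible way to attach quality metrics to a data model element is sketched below. This is an assumption about how such requirements might be represented, not the notation proposed in this work; the class names, metric names, units and target values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetricRequirement:
    """A target value for one quality metric (hypothetical structure)."""
    characteristic: str      # quality characteristic, e.g. "performance"
    metric: str              # metric name, e.g. "throughput"
    unit: str
    target: float
    higher_is_better: bool

    def satisfied_by(self, observed: float) -> bool:
        """Check an observed value against the target, respecting direction."""
        if self.higher_is_better:
            return observed >= self.target
        return observed <= self.target


@dataclass
class EntityRequirements:
    """Quality requirements attached to one entity of the data model."""
    entity: str
    requirements: list = field(default_factory=list)

    def unmet(self, observed: dict) -> list:
        """Return the metrics that are missing from `observed` or fail their target."""
        return [
            r.metric
            for r in self.requirements
            if r.metric not in observed or not r.satisfied_by(observed[r.metric])
        ]


# Illustrative values only.
ORDERS = EntityRequirements(
    entity="Order",
    requirements=[
        MetricRequirement("performance", "throughput", "ops/s", 500.0, True),
        MetricRequirement("performance", "access_latency", "ms", 20.0, False),
        MetricRequirement("efficiency", "cost", "USD/month", 100.0, False),
    ],
)

if __name__ == "__main__":
    print(ORDERS.unmet({"throughput": 650.0, "access_latency": 35.0}))
```

Recording the direction of each metric (`higher_is_better`) keeps the comparison explicit: throughput should meet or exceed its target, while latency and cost should stay at or below theirs.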
13 Objective 2: Data storage selection
• Based on the analysis of historic data and usage patterns
  • Both in test applications and within systems generated in our modeling environment
• Monitoring data is gathered in a non-intrusive manner
  • AOP monitoring
  • Monitor the behaviour of the selected implementations/providers, based on the metrics specified in the modeling environment
  • Compare expected values and actual performance
• Monitoring data is shared in an open, collaborative manner
  • Used by our decision process
  • Available for external users
• Users could work, at the same time, with multiple combinations of storage implementations and providers
  • Test the performance of the different combinations
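The non-intrusive monitoring described on this slide can be approximated as follows. The slides refer to AOP monitoring aspects; this sketch swaps in a Python decorator, which plays a similar role to an around advice because the data access code itself is not modified. The names `monitored`, `compare` and `SAMPLES`, and the in-memory store, are hypothetical.

```python
import functools
import time
from collections import defaultdict

# Collected latency samples in milliseconds, keyed by metric name.
SAMPLES = defaultdict(list)


def monitored(metric, clock=time.perf_counter):
    """Wrap a data access function and record how long each call takes."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = clock()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the sample even if the call raises.
                SAMPLES[metric].append((clock() - start) * 1000.0)
        return wrapper
    return decorator


def compare(expected_ms):
    """For each metric with samples, return (mean latency, within expectation)."""
    report = {}
    for metric, limit in expected_ms.items():
        samples = SAMPLES.get(metric, [])
        if samples:
            mean = sum(samples) / len(samples)
            report[metric] = (mean, mean <= limit)
    return report


# A stand-in for a real storage client: a plain dictionary.
_STORE = {"user:1": "Ada"}


@monitored("get_latency")
def store_get(key):
    return _STORE.get(key)


if __name__ == "__main__":
    store_get("user:1")
    print(compare({"get_latency": 20.0}))
```

The `clock` parameter exists only so that the timing source can be replaced, for example with a deterministic clock in tests.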
14 Objective 3: Cloud artifacts management
• Generate the low-level artifacts to work with data storage implementations and deployment providers
  • Configuration files for deployment providers
  • Data management interfaces (CDMI, Spring Data, etc.)
• Different levels of transformation procedures
  • From the high-level data model to an intermediate Domain Specific Language (DSL) (Liu-2010, SpringRoo-2012)
  • From the intermediate DSL to configuration files, AOP monitoring aspects and data management interfaces (SpringData-2012)
• MDE transformation techniques
  • Model-to-Model (M2M), Model-to-Text (M2T)
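The two transformation levels can be sketched in miniature: an M2M step that flattens a high-level entity into intermediate statements, and an M2T step that renders those statements as text. This is not the Model2Roo implementation. The model classes and the intermediate statement format are hypothetical, and the generated text uses Spring Roo shell commands (`entity jpa`, `field string`, `field number`) only as an illustrative target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    type: str  # "string" or "int" in this sketch


@dataclass(frozen=True)
class Entity:
    """A class from the high-level data model (hypothetical structure)."""
    name: str
    attributes: tuple


# Assumed mapping from model types to Roo field commands.
FIELD_COMMANDS = {
    "string": "field string --fieldName {name}",
    "int": "field number --fieldName {name} --type java.lang.Integer",
}


def to_dsl(entity):
    """M2M step: flatten an entity into intermediate statements (tuples)."""
    statements = [("entity", entity.name)]
    statements += [("field", a.type, a.name) for a in entity.attributes]
    return statements


def to_roo_script(statements, package="~.domain"):
    """M2T step: render intermediate statements as Roo-style command lines."""
    lines = []
    for stmt in statements:
        if stmt[0] == "entity":
            lines.append(f"entity jpa --class {package}.{stmt[1]}")
        else:
            _, type_, name = stmt
            lines.append(FIELD_COMMANDS[type_].format(name=name))
    return "\n".join(lines)


CUSTOMER = Entity("Customer", (Attribute("name", "string"), Attribute("age", "int")))

if __name__ == "__main__":
    print(to_roo_script(to_dsl(CUSTOMER)))
```

Keeping the intermediate statements separate from the rendered text means a second M2T function could target a different artifact, such as a provider configuration file, without touching the M2M step.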
15 Proof of concept
Work in progress…
• Extension - Model2Roo (http://code.google.com/p/model2roo/)
Figure labels:
• High-level abstractions: UML class diagram
• Step 1: Spring Roo, Java web app, Spring Data
• Step 2: Low-level abstractions: graph database, relational database
16 Preliminary results
• Castrejón, J., Vargas-Solar, G., Collet, C., Lozano, R.: “Model-Driven Cloud Data Storage”. In: First International Workshop on Model-Driven Engineering on and for the Cloud (CloudMDE 2012). Co-located with ECMFA ’12. July 2012
• Castrejón, J., Vargas-Solar, G., Lozano, R.: “Model2Roo: Web Application Development based on the Eclipse Modeling Framework and Spring Roo”. In: First Workshop on Academics Modeling with Eclipse (ACME 2012). Co-located with ECMFA ’12. July 2012
18 References
• Ardagna, D., Di Nitto, E., Casale, G., et al.: MODACLOUDS, A Model-Driven Approach for the Design and Execution of Applications on Multiple Clouds. In: Models in Software Engineering Workshop (MiSE 2012). Co-located with ICSE ’12 (2012)
• Armbrust, M., Fox, A., Griffith, R., Joseph, A.D., et al.: Above the Clouds: A Berkeley View of Cloud Computing (2009)
• Bruneliere, H., Cabot, J., Jouault, F.: Combining model-driven engineering and cloud computing. In: Modeling, Design, and Analysis for the Service Cloud Workshop. MDA4ServiceCloud ’10 (2010)
• Cattell, R.: Scalable SQL and NoSQL data stores. SIGMOD Rec. 39, 12–27 (May 2011)
• Curé, O., Hecht, R., Le Duc, C., Lamolle, M.: Data Integration over NoSQL Stores Using Access Path Based Mappings. In: Proceedings of the 22nd International Conference on Database and Expert Systems Applications (DEXA 2011). A. Hameurlain et al. (Eds.), Part I, LNCS 6860, pp. 481–495 (2011)
• Edlich, S.: List of NoSQL databases. http://nosqldatabase.org/ (March 2012)
• Fowler, M.: Polyglot persistence. http://martinfowler.com/bliki/PolyglotPersistence.html (November 2011)
• Jureta, I., Borgida, A., Ernst, N., Mylopoulos, J.: Techne: Towards a New Generation of Requirements Modeling Languages with Goals, Preferences, and Inconsistency Handling. In: Proceedings of the 18th IEEE International Requirements Engineering Conference. pp. 115–124. RE 2010. IEEE Computer Society (2010)
• Katsov, I.: NoSQL data modeling techniques. http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/ (March 2012)
19 References
• Kent, S.: Model driven engineering. In: Butler, M., Petre, L., Sere, K. (eds.) Integrated Formal Methods, LNCS, vol. 2335, pp. 286–298. Springer Berlin (2002)
• Lenzerini, M.: Data integration is harder than you thought. In: Proceedings of the 9th International Conference on Cooperative Information Systems. pp. 22–26. CoopIS ’01, Springer-Verlag, London, UK (2001)
• Livenson, I., Laure, E.: Towards Transparent Integration of Heterogeneous Cloud Storage Platforms. In: Fourth International Workshop on Data Intensive Distributed Computing. DIDC ’11. Co-located with HPDC ’11 (2011)
• Liu, D., Zic, J.: Cloud#: A specification language for modeling cloud. In: Proceedings of the 2011 IEEE 4th International Conference on Cloud Computing. pp. 533–540. CLOUD ’11, IEEE Computer Society, Washington, DC, USA (2011)
• Peidro, J.E., Muñoz-Escoí, F.D.: Towards the next generation of model driven cloud platforms. In: 1st International Conference on Cloud Computing and Services Science. pp. 494–500. CLOSER ’11 (2011)
• Ruiz-Alvarez, A., Humphrey, M.: An automated approach to cloud storage service selection. In: Proceedings of the 2nd International Workshop on Scientific Cloud Computing. pp. 39–48. ScienceCloud ’11, ACM, New York, NY, USA (2011)
• Ruiz-Alvarez, A., Humphrey, M.: A model and decision procedure for data storage in cloud computing. In: Proceedings of the IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing. CCGrid ’12 (2012)
• Storage Networking Industry Association (SNIA): Cloud data management interface (CDMI). http://www.snia.org/cdmi (September 2011)
• SpringSource: Spring Data projects. http://www.springsource.org/spring-data (March 2012)
• SpringSource: Spring Roo. http://www.springsource.org/spring-roo (March 2012)