Anita Ahuja, Ajay Kumar, Ramveer Singh / International Journal of Engineering Research and Applications (IJERA), ISSN: 2248-9622, Vol. 2, Issue 5, September-October 2012, pp. 352-355

An Approach for Virtualization and Integration of Heterogeneous Cloud Databases

Anita Ahuja*, Ajay Kumar**, Ramveer Singh***
* Department of Computer Science, Asst. Professor, Mewar University, Chittorgarh (India)
** Department of Computer Science, Asst. Professor, Mewar University, Chittorgarh (India)
*** Department of Computer Science, Professor, R.K.G.I.T., Mahamaya Technical University, Ghaziabad (India)

ABSTRACT: Virtualization is the key technology behind cloud computing that allows the creation of an abstraction layer over the underlying cloud infrastructure. Using virtualization, resources (hardware and software) can be shared and utilized while hiding the complexity from the cloud users. Many cloud databases are available, managed by different organizations, such as Amazon Storage for the Cloud, Google Storage for the Cloud, Hadoop Storage for the Cloud, Yahoo!'s PNUTS, Cassandra, CouchDB, etc. This paper proposes a virtual database framework that enables a centralized global object oriented database: a large, virtually integrated database that hides the heterogeneity of the various cloud databases. Once they are integrated, consistent access is provided to the end user.

Keywords – OOMDS, Virtualization, Cloud Databases, cloud computing, Mediator Framework, Peers.

I. INTRODUCTION
Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction [3].

The different cloud providers adopt different architectures and data models, such as Amazon's storage building blocks Dynamo [6], S3, SimpleDB, and RDS; Google's storage building block Bigtable; Hadoop's building blocks HDFS, Hive, HadoopDB, and HBase; Yahoo's PNUTS; the Cassandra data model; and the CouchDB data model.

It has been recognized that a traditional DBMS does not fit well in the cloud computing environment, so new data models (row oriented, document oriented, wide column) are widely used in the cloud. Different cloud providers use the architecture and data models that best suit their applications.

A virtual integrated database management system should therefore be developed that provides distribution transparency, a global schema (common data descriptions and data placement information), centralized administration through a global catalog, distributed functions, query processing, transaction management, access control, etc. [1].

II WHY NOT RDBMS?
RDBMSs all have a distributed and parallel version with SQL support for all kinds of data (structured, XML, multimedia, streams, etc.) [1]. Standard SQL is a major argument for adoption by tool vendors (e.g. analytics, business intelligence), but the "one size fits all" approach has reached its limits, resulting in a loss of performance. Simplicity and flexibility are now required for applications with specific, tight requirements. New specialized DBMS engines are more efficient: column-oriented DBMS for OLAP, DSMS for stream processing, SciDB [11] for scientific analytics, etc. An RDBMS provides ACID transactions, a complex query language, and lots of tuning knobs, but it is less suitable for specific optimizations for OLAP, a flexible programming model, a flexible schema, and scalability.

III INTEGRATED DATA MANAGEMENT PROBLEM IN CLOUD
Cloud data are very large (lots of data spaces, very large collections, multimedia, etc.). They are complex, unstructured or semi-structured, and often schema-less but carry metadata (such as tags). Different file formats, access protocols, and query languages are used. Table decompositions may vary, column names (data labels) may differ while having the same semantics, and data encoding schemes may vary; this is also referred to as schematic heterogeneity [8]. Cloud users and application developers are very numerous, with very diverse expertise but very little DBMS expertise.

IV PROPOSED FRAMEWORK
Object Oriented Mediator Database System (OOMDS):
The proposed system is an object oriented mediator database system over various heterogeneous cloud databases, having an object
oriented query language in which object oriented views of data can be specified. OOMDS has primitives to translate data from different cloud databases into an object oriented database. The translated cloud data can be used to build views. OOMDS supports the multiple databases that exist on the cloud.

FIGURE: OBJECT ORIENTED MEDIATOR DATABASE SYSTEM

DATA INTEGRATION IN OOMDS SYSTEM
OOMDS is a distributed mediator system that uses an object oriented data model and has a relationally complete object oriented query language, OOMDSQL. Through its distributed object oriented multi-database facilities, many autonomous and distributed OOMDS peers can interoperate. Object oriented multi-database queries and views can be defined where external data sources of different kinds are translated through OOMDS and reconciled through its functional object oriented mediation primitives. Each mediator peer provides a number of transparent functional views of data reconciled from other mediator peers, wrapped data sources, and data stored in OOMDS itself. The composition of mediator peers in terms of other peers provides a way to scale the data integration process by composing mediation modules.

The OOMDS data manager and query processor must be extensible so that new application oriented data types and operators can be added to OOMDSQL, implemented in some external programming language (Java, C, C++ or Lisp). The extensibility allows wrapping data representations specialized for different application areas in mediator peers. The object oriented data model provides very powerful query and data integration primitives, which require advanced query optimization.

The mediator/wrapper approach has been used for integrating heterogeneous data in several projects. Most mediator systems integrate data through a central mediator server accessing one or several data sources through a number of "wrapper" interfaces that translate data to a global data model. However, one of the original goals for mediator architectures was that mediators should be relatively simple distributed software modules that transparently encode domain-specific knowledge about data and share abstractions of that data with higher layers of mediators or applications. Larger networks of mediators would then be defined through these primitive mediators by composing new mediators in terms of other mediators and data sources.

The core of OOMDS is an open, light-weight, and extensible object oriented database management system with an object oriented data model. Each OOMDS server must contain all the traditional database facilities, such as a storage manager, a recovery manager, a transaction manager, and a functional query language named OOMDSQL. The system can be used as a single-user database or as a multi-user server to applications and to other OOMDS peers.

DISTRIBUTION:
OOMDS is a distributed mediator system where several mediator peers communicate over the Internet. Each mediator peer appears as a virtual functional database layer having data abstractions and an object oriented query language. Object oriented views provide transparent access to data sources from clients and other mediator peers. Conflicts and overlaps between similar real-world entities being modeled differently in different data sources are reconciled through the mediation primitives of the multi-mediator query language OOMDSQL. The mediation services allow transparent access to similar data structures represented differently in different data sources [13]. Applications access data from distributed data sources through queries to views in some mediator peer [9].

Logical composition of mediators is achieved when multi-database views in mediators are defined in terms of views, tables, and functions in other mediators or data sources. The multi-database views make the mediator peers appear to the user as a single virtual database. OOMDS mediators are composable since a mediator peer can regard other mediator peers as data sources [16].
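The wrapper and composition ideas described above can be illustrated with a minimal sketch. It is written in Python rather than OOMDSQL, whose syntax the paper does not give, and every name in it (`Employee`, `DocumentStoreWrapper`, `WideColumnWrapper`, `MediatorPeer`, the record layouts, and the "first source wins" reconciliation rule) is a hypothetical choice made for illustration, not part of OOMDS. Each wrapper translates one source-specific representation into a shared object type, and a mediator peer exposes the same interface as a wrapper, so one peer can be used as a data source of another.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Employee:
    """Global (reconciled) object type exposed by every view. Hypothetical."""
    name: str
    salary_usd: float


class DataSource(Protocol):
    """Anything a mediator can query: a wrapper or another mediator peer."""

    def employees(self) -> Iterable[Employee]: ...


class DocumentStoreWrapper:
    """Hypothetical wrapper for document-style records (nested dictionaries)."""

    def __init__(self, docs):
        self._docs = docs

    def employees(self):
        for doc in self._docs:
            yield Employee(doc["full_name"], float(doc["pay"]["usd"]))


class WideColumnWrapper:
    """Hypothetical wrapper for (row key, {column: value}) rows, salary in cents."""

    def __init__(self, rows):
        self._rows = rows

    def employees(self):
        for _key, cols in self._rows:
            yield Employee(cols["emp:name"], int(cols["emp:salary_cents"]) / 100)


class MediatorPeer:
    """A view over several sources; it is itself usable as a DataSource."""

    def __init__(self, sources):
        self._sources = list(sources)

    def employees(self):
        reconciled = {}
        for source in self._sources:
            for emp in source.employees():
                # Assumed reconciliation rule: on overlap, the first source wins.
                reconciled.setdefault(emp.name, emp)
        return list(reconciled.values())


# Peer A wraps a document store; peer B composes peer A with a wide-column store.
peer_a = MediatorPeer([
    DocumentStoreWrapper([{"full_name": "Asha", "pay": {"usd": "5000"}}]),
])
peer_b = MediatorPeer([
    peer_a,
    WideColumnWrapper([
        ("r1", {"emp:name": "Ravi", "emp:salary_cents": "525000"}),
        ("r2", {"emp:name": "Asha", "emp:salary_cents": "999900"}),
    ]),
])
view = {emp.name: emp.salary_usd for emp in peer_b.employees()}
```

Here `peer_b` never sees the document or wide-column layouts; it only sees `Employee` objects, which is the sense in which the composed view behaves like a single virtual database. A real mediator would push query predicates down to the sources instead of materializing every object, which this sketch does not attempt.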
WRAPPING DATA
In order to access data from external data sources, OOMDS mediators may contain one or several wrappers which process data from different kinds of external data sources [15], e.g. ODBC-based access to relational databases, access to XML files, CAD systems, or Internet search engines, to extract data from heterogeneous cloud databases. A wrapper is a procedure in OOMDS having specialized facilities for query processing and translation of data from a particular class of external data sources. It contains both interfaces to external data sources and knowledge of how to efficiently translate and process queries involving accesses to different cloud databases. In particular, external OOMDS peers known to a mediator are also regarded as external data sources, and there is a special wrapper for accessing other OOMDS peers [18]. However, among the OOMDS peers special query optimization methods are used that take into account the distribution, capabilities, costs, etc., of the different peers [20].

THE CENTRAL NAME SERVER
Every mediator peer must belong to a group of mediator peers. The mediator peers in a group are described through a meta-schema stored in a mediator server called the central name server. The mediator peers are autonomous and there is no central schema in the name server [13]. The central name server contains only general meta-information, such as the locations and names of the peers in the group, while each mediator peer has its own schema describing its local data and data sources. The information in the central name server is managed without explicit operator intervention; its content is managed through messages from the mediator peers. To avoid a bottleneck, mediator peers usually communicate directly without involving the name server; it is normally involved only when a connection to some mediator peer is established [21].

CONCLUSION:
We have given an overview of the OOMDS mediator system, where groups of distributed mediator peers are used to integrate data from different sources. Each mediator in a group has DBMS facilities for query compilation and exchange of data and meta-data with other mediator peers. Derived functions can be defined where data from several mediator peers is abstracted, transformed, and reconciled. Wrappers are defined by interfacing OOMDS systems with external systems through its multi-directional foreign function interface. OOMDS can furthermore be embedded in applications and used as stand-alone databases.

We described the object oriented data model and query language that form the basis for data integration in OOMDS, and summarized the distributed multi-mediator query decomposition strategies used. The mediator peers are autonomous without any central schema. A special mediator, the central name server, keeps track of which mediator peers are members of a group. The central name server can be queried for the location of mediator peers in a group. Meta-queries to each mediator peer can be posed to investigate the structure of its schema.

Some unique features of OOMDS are: a distributed mediator framework where query plans are distributed over several communicating mediator peers; the use of declarative object oriented queries to model reconciled object oriented views spanning multiple mediator peers; and query processing and optimization techniques for queries to reconciled views involving function overloading, late binding, and type-aware query rewrites.

REFERENCES
[1] S. Aulbach, T. Grust, D. Jacobs, A. Kemper, and J. Rittinger. Multi-tenant databases for software as a service: Schema-mapping techniques. In SIGMOD.
[2] M. Brantner, D. Florescu, D. Graf, D. Kossmann, and T. Kraska. Building a database on S3. In SIGMOD.
[3] F. Chang, J. Dean, S. Ghemawat, W. Hsieh, D. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. Gruber. Bigtable: A distributed storage system for structured data. In OSDI, 2006.
[4] B. F. Cooper, R. Ramakrishnan, U. Srivastava, A. Silberstein, P. Bohannon, H.-A. Jacobsen, N. Puz, D. Weaver, and R. Yerneni. PNUTS: Yahoo!'s hosted data serving platform. PVLDB, 1(2), 2008.
[5] C. Curino, E. Jones, Y. Zhang, and Madden. Schism: A Workload-Driven Approach to Database Replication and Partitioning. In VLDB, 2010.
[6] E. Damiani, S. D. C. di Vimercati, S. Jajodia, S. Paraboschi, and P. Samarati. Balancing Confidentiality and Efficiency in Untrusted Relational DBMS. CCS, 2003.
[7] S. Das, D. Agrawal, and A. E. Abbadi. ElasTraS: An elastic transactional data store in the cloud. HotCloud, 2009.
[8] R. Freeman. Oracle Database 11g New Features. McGraw-Hill, Inc., New York, NY, USA, 2008.
[9] R. Gennaro, C. Gentry, and B. Parno. Non-Interactive Verifiable Computing: Outsourcing Computation to Untrusted Workers. STOC.
[11] H. Hacigumus, B. Iyer, C. Li, and S. Mehrotra. Executing SQL over Encrypted Data in the Database-Service-Provider Model. ACM SIGMOD, 2002.
[12] "Kernel based virtual machine (KVM)." [Online]. Available: http://www.linux-
[13] G. Giunta, R. Montella, G. Agrillo, and G. Coviello, "A GPGPU transparent virtualization component for high performance computing clouds," in Proceedings of the 16th international Euro-Par conference on Parallel processing: Part I, ser. EuroPar'10. Berlin, Heidelberg: Springer-Verlag, 2010, pp. 379–391. [Online]. Available: 695.1887738
[14] L. Shi, H. Chen, and J. Sun, "vCUDA: GPU accelerated high performance computing in virtual machines," in Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing. Washington, DC, USA: IEEE Computer Society, 2009, pp. 1–11. [Online]. Available: m?id=1586640.1587737
[15] F. Bellard, "QEMU, a fast and portable dynamic translator," in Proceedings of the annual conference on USENIX Annual Technical Conference, ser. ATEC '05. Berkeley, CA, USA: USENIX Association, 2005, pp. 441. [Online]. Available: m?id=1247360.1247401
[16] Josifovski V. Design, implementation and evaluation of a distributed mediator system for data integration. PhD Thesis, Linköping U., Sweden. ~udbl/publ/vanjaphd.pdf [1999].
[17] Bukhres O, Elmagarmid A (eds.). Object-oriented Multidatabase Systems. Prentice Hall, 1996.
[18] Dayal U, Hwang H-Y. View definition and generalization for database integration in a multidatabase system. IEEE Transactions on Software Engineering 1984; 10(6):628–645.
[19] A. N. Laboratory. (2010, Jul.) Heckle. [Online]. Available:
[20] xCat Open Source Project. (2011, May) xCat extreme cloud administration toolkit. [Online]. Available:
[21] P. O. S. Project. (2010, Apr.) Perceus provision enterprise resources and clusters enabling uniform systems. [Online]. Available:

AUTHOR:
ANITA AHUJA is an Asst. Professor in the Department of Computer Science and Information Technology at Mewar University, Chittorgarh (Rajasthan). She has completed 'A' level (DOEACC Society), M.Sc (IT) from M.C.R.P.V, Bhopal, M.Phil from Rajasthan Vidyapeeth, Udaipur, and M.Tech.(P) at Mewar University, Chittorgarh. Her research interests are in the fields of Network Security, Cloud Computing, and Advanced Data Structures and Algorithms.