Research on Storage Virtualization Structure in Cloud Storage Environment

Jun-wei Ge
Faculty of Software
Chongqing Univ. of Posts and Telecom.
Chongqing, China, 400065
e-mail: gejw@cqupt.edu.cn

Yong-long Deng, Yi-qiu Fang
Faculty of Computer Science and Technology
Chongqing Univ. of Posts and Telecom.
Chongqing, China, 400065
e-mail: fedora.dyl@gmail.com

978-1-4244-7874-3/10/$26.00 ©2010 IEEE

Abstract: Since the concept of Cloud Storage was proposed, it has received strong support and extensive attention from major vendors and has become a hot topic in many fields. This paper proposes a layered and generalized architecture of Cloud Storage and, on this basis, a storage virtualization structure that builds on traditional storage virtualization technology. The structure virtualizes in two steps: the first step virtualizes the physical layer to the logical layer, and the second step virtualizes the logical layer to the virtual layer. This structure can greatly enhance storage capacity utilization and scalability.

Keywords: Cloud Computing; Cloud Storage; Architecture of Cloud Storage; Storage Virtualization

I. INTRODUCTION

Cloud computing, developed from distributed computing, parallel computing and grid computing, is a rising computing platform and service mode that organizes and schedules services over the Internet. Cloud storage is a service mode, built on cloud computing, that provides storage resources and services to users from remote storage servers. Cloud storage should provide storage service at a lower cost, with high availability, scalability and security.

A cloud storage system, using cluster applications, virtualization and grid technology or a distributed file system, is a storage service system that integrates massive numbers of different kinds of networked storage devices through application software so that they work collaboratively and together provide data storage and service access. The development of cloud storage systems relies on the following technologies: the broadband network; Web 2.0 technology; application storage, cluster technology, grid technology and distributed file systems; CDN (Content Delivery Network), P2P (Peer-2-Peer), data compression, data deduplication and data encryption; and storage virtualization technology and storage network management.

This paper is organized as follows. Section 2 presents related works. In section 3, we present our architecture of cloud storage. In section 4, we present our storage virtualization structure and analyze the performance it brings to a cloud storage system. In section 5, we present a conclusion.

II. RELATED WORKS

There are many cloud storage providers, such as Amazon, EMC, IBM, HP, and NetApp. There are also more and more cloud storage platforms, such as HDFS, GFS, Amazon S3, EMC Atmos, Data ONTAP, HP Upline, CloudNAS, FileStore, and KFS.

In 2009 more than 140 companies set up the Storage Networking Industry Association (SNIA), and the SNIA hoped to work out Cloud Storage Standards that can benefit all cloud storage users, service providers, developers, and brokers. In October of the same year, the Cloud Storage Initiative (CSI) was set up, which proposed adopting the Cloud Data Management Interface (CDMI) standard as the cloud service standard.

The paper [1] introduced MetaCDN, a system that exploits ‘Storage Cloud’ resources, creating an integrated overlay network that provides a low cost, high performance CDN for content creators. MetaCDN removes the complexity of dealing with multiple storage providers by intelligently matching and placing users’ content onto one or many storage providers based on their quality of service, coverage and budget preferences. MetaCDN makes it trivial for content creators and consumers to harness the performance and coverage of numerous ‘Storage Clouds’ by providing a single unified namespace that is easy to integrate into origin websites and is transparent for end-users.

Cloud storage enables new application types [2] through SOA, Web services APIs and a unified service interface via virtualization over a network at low cost, and can provide anytime and anywhere access, massive data storage, sharing and collaboration via a single namespace, and policy management of storage.

The paper [3] proposed algorithms related to the rapid development of cloud computing and cloud storage, which will produce a cloud resource market and bring in the challenge of cloud service selection. The paper [4] introduced a layered and generalized architecture of cloud storage and analyzed key technologies for cloud storage. Cloud storage management technology, as proposed by Ying Zhan et al. [5], is also an urgent problem that deserves more attention.

The paper [6] presents a scalable distributed system, Admon. It offers an interface that supports virtual storage
system management (e.g., creating or modifying virtual space, sharing storage resources, accessing shared storage resources). It consists of an administration module that manages virtual storage resources according to their workloads, based on the information collected by a monitoring module.

In this paper, we analyze and propose an architecture of cloud storage, as well as an effective storage virtualization structure for the cloud storage environment.

III. ARCHITECTURE OF CLOUD STORAGE

Cloud storage architectures vary from one provider to another, and they are usually complex and mutually incompatible. We propose a layered and generalized architecture of cloud storage based on current architectures.

Cloud storage is a storage service system composed of thousands of storage devices clustered by network, distributed file systems and other storage middleware. Its major function is to provide cloud storage service for users. The typical architecture of cloud storage includes network devices, a storage resource pool, distributed file systems, a Service Level Agreement (SLA), service interfaces and a common access interface. These components can be divided globally by physical and logical function to provide better compatibility and interaction. Based on this, we propose the following layered architecture. From bottom to top, the layers are: infrastructure (network, storage, etc.), storage management layer (metadata management, storage management, etc.), basic management layer, app/service interface, and user interface. The functions of each layer are described below.

  User Interface
  App Service Interface
  Basic Management Overlay
  Storage Management
  Network and Storage Infrastructure

  Figure 1. Layered Model of Cloud Storage

The network and storage infrastructure contains distributed wired and wireless networks and storage device networks (FC, NAS, iSCSI, SCSI or SAS, etc.).

The storage management layer contains a unified storage management system, which can be implemented with storage virtualization technology, multi-link redundancy management, and storage device condition monitoring and fault maintenance. The other important task of this layer is metadata management: it gathers the metadata of data stored across the global domain and coordinates the different domains to balance load.

The basic management overlay is the central part of cloud storage. Through technologies such as cluster technology, distributed file systems, and grid computing, it makes many storage devices work collaboratively, enables them to provide the same kinds of services to the outside, and provides much better data access performance.

The app service interface provides various application and service interfaces. It packages applications in the cloud storage environment as services, which are then provided to users.

The user interface is the uniform interface that the cloud storage system provides for clients to access. It can also filter unauthorized users out of the system.

To construct a cloud storage system that is available, reliable, scalable, secure, cooperative and concurrent in an economical and practical way, the distributed storage devices and their related management software should be connected with each other through virtualization, clustering and integration to provide users with a unified, boundless virtual storage resource pool. QoS (Quality of Service) is also one of the most important factors for a cloud storage system; it includes storage rates, network bandwidth, responsiveness, availability, reliability, recoverability, scalability, security, etc.

IV. ANALYSIS AND DESIGN OF OUR STORAGE VIRTUALIZATION STRUCTURE

A cloud storage system is composed of thousands of storage devices produced by different manufacturers, so their physical properties differ greatly. Storage virtualization technology is the best choice for eliminating these differences. Meanwhile, the major function of cloud storage is to provide data storage service for users, so cloud storage should allocate storage capacity according to users’ needs, and the storage capacity should be scalable. Storage virtualization technology makes it convenient to allocate storage capacity to users on demand and to enhance the utilization of storage capacity.

A. Traditional storage virtualization technology

Traditional storage virtualization technology maps the various decentralized and heterogeneous storage devices in a SAN to a single continuous logical storage space, or Virtual Storage Pool (VSP) [7], and provides the access interface of the VSP to application systems. The component that performs this mapping is the SVM (Storage Virtualization Middleware). The SVM shields the physical properties of storage devices, so the storage devices in the SAN are transparent to clients. A client just accesses the logical volumes allocated to it, and accesses the VSP in the same way as it would access a LUN (Logical Unit Number). The structure of traditional storage virtualization is shown in Figure 2.
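The mapping performed by the SVM can be illustrated with a minimal Python sketch. It is not the implementation of any particular SVM product: the class names `Device` and `VirtualStoragePool`, the device names, and the block counts are all hypothetical, and the sketch assumes fixed-size blocks and simple concatenation of devices into one continuous logical space.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """A physical storage device in the SAN (hypothetical model)."""
    name: str
    blocks: int  # capacity in fixed-size blocks


class VirtualStoragePool:
    """Presents several devices as one continuous logical block space."""

    def __init__(self, devices):
        self.devices = list(devices)
        # starts[i] is the first logical block number served by devices[i].
        self.starts = []
        total = 0
        for dev in self.devices:
            self.starts.append(total)
            total += dev.blocks
        self.total_blocks = total

    def resolve(self, logical_block):
        """Translate a logical block number to (device name, block on device)."""
        if not 0 <= logical_block < self.total_blocks:
            raise IndexError(f"logical block {logical_block} out of range")
        # Find the last device whose starting block is <= logical_block.
        i = bisect_right(self.starts, logical_block) - 1
        return self.devices[i].name, logical_block - self.starts[i]


# Three heterogeneous devices appear to the client as a single 350-block space.
pool = VirtualStoragePool(
    [Device("nas0", 100), Device("fc0", 50), Device("iscsi0", 200)]
)
print(pool.total_blocks)   # 350
print(pool.resolve(0))     # ('nas0', 0)
print(pool.resolve(120))   # ('fc0', 20)
print(pool.resolve(349))   # ('iscsi0', 199)
```

A client only ever supplies a logical block number; which device actually serves it is decided inside `resolve`, which is the sense in which the devices are transparent to clients.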
  Host ... Host
  SVM
  Logical Volume ... Logical Volume
  Virtual Storage Pool
  NAS, FC, iSCSI

  Figure 2. Traditional storage virtualization structure

There are three methods of implementing storage virtualization: host-based virtualization, storage-device-based virtualization and SAN-based virtualization. Judging from the trend of development, SAN-based virtualization is more flexible and more effective at making full use of storage capacity and at managing storage centrally, so it has considerable room for development. SAN-based virtualization can be classified in two ways. By topological structure, it is classified into symmetric and non-symmetric structures; by implementation mechanism, it is classified into in-band virtualization and out-of-band virtualization, which differ in whether the data I/O in the SAN and the control messages use the same channel.

B. Design of our storage virtualization structure

A cloud storage system is by nature a storage system. There are therefore many indexes for measuring its performance, such as storage capacity, scalability, manageability, availability, and security. Storage capacity is one of the most important. Normally, there are two ways to increase storage capacity: adding storage devices and improving capacity utilization. Improving capacity utilization is the way to increase usable capacity at no extra hardware cost, and the best way to improve capacity utilization is to use storage virtualization technology to allocate capacity dynamically according to users’ needs.

To achieve the goal of allocating capacity dynamically according to users’ needs, we analyze and design a storage virtualization structure. Its design idea can be summarized as follows. First, the storage systems are virtualized with traditional storage virtualization ideas. This step eliminates the differences among storage devices and provides clients with a unified logical view of storage (the storage pool). Second, based on the storage pool from the first step, a large storage space is virtualized and provided to each user through an allocate-on-demand strategy. The structure of our storage virtualization is shown in Figure 3.

  Figure 3. Our storage virtualization structure

For each DFS (Distributed File System), there is a mapping table built between the virtual volume and the logical volume. The mapping table records the mapping relationships between virtual blocks and logical blocks (blocks in the storage pool); in general, however, a virtual block has no real physical space behind it. Here we borrow the idea of lazy allocation from operating system memory management and allocate real physical space only when it is needed. Each DFS sees enough storage space to meet its storage needs, but this space actually consists of virtual disks on a virtual volume. The capacity of the virtual disks can expand dynamically as storage is required while the DFS is in use, by allocating logical blocks in the storage pool and building and updating the mapping table.

Owing to the huge gap in size between the virtual space and the logical space, a dedicated Mapping Table Manager (LV-MTM) should be set up between the virtual space and the logical space to manage the mapping tables better. The LV-MTM manages and maintains the mapping tables between the virtual volume and the logical volume with given mapping policies and implementation mechanisms, so that mapping tables are built and updated accurately, quickly, efficiently, and securely. The LV-MTM also manages the on-demand allocation of logical blocks.

For the storage pool, there is also a mapping table between logical blocks and physical blocks (blocks on storage devices), which describes the mapping relationship between them. Because the logical space and the physical space are basically the same size, we can use the following two mapping strategies to build this relationship:

  Linear Mapping: physical blocks within a certain range are allocated sequentially to logical volumes.
  Interleaved Mapping: physical blocks are mapped to different logical volumes in an interleaved fashion.

A dedicated mapping manager (PL-MTM) is also needed here. It manages and maintains the mapping tables between logical blocks and physical blocks with given mapping policies and implementation mechanisms. The PL-MTM helps a logical volume identify its real storage address. When the system needs to access storage devices, it
can follow the mapping mechanisms built by the PL-MTM to access the real storage address and complete the I/O operation.

C. Performance analysis for the above storage virtualization structure

The storage virtualization structure designed above can improve many aspects of cloud storage system performance. Here we mainly analyze two of them: increased storage capacity utilization and improved scalability. Other aspects include availability, reliability, and security.

By virtualizing the logical volume to the virtual volume with an allocate-on-demand strategy, each user can be allocated a huge virtual storage space, which may even exceed the real physical storage space. The storage capacity utilization is therefore greatly increased.

As for scalability, with the LV-MTM we can easily expand the virtual storage space dynamically according to the allocate-on-demand principle. Meanwhile, because the differences among storage devices are eliminated when physical blocks are virtualized to logical blocks, we can easily expand the physical space and the logical space. The scalability of the system is thus improved effectively.

V. CONCLUSION

In this paper, we first proposed a layered architecture of cloud storage and discussed the basic functions of each layer. We then analyzed and designed a storage virtualization structure with two steps of virtualization: first, virtualizing the physical space to the logical volume, which eliminates differences among storage devices; second, virtualizing the logical volume to the virtual volume, which provides a large virtual space to users according to their needs. Last, we analyzed the performance that our storage virtualization structure brings to a cloud storage system.

ACKNOWLEDGMENT

This work was supported by Chongqing Municipal Education Commission, NO.KJ090519.

REFERENCES

[1] James Broberg, Rajkumar Buyya, Zahir Tari. MetaCDN: Harnessing ‘Storage Clouds’ for high performance content delivery. Journal of Network and Computer Applications 32 (2009), 1012–1022.
[2] Steve Lesem. Cloud Storage and The Innovators Dilemma. http://cloudstoragestrategy.com/cloud-ecosystem/, July 19, 2009.
[3] Wenying Zeng, Yuelong Zhao, Junwei Zeng. Cloud service and service selection algorithm research. GEC 09: Proceedings of the first ACM/SIGEVO Summit on Genetic and Evolutionary Computation, Shanghai, China, June 2009, 1045–1048.
[4] Wenying Zeng, Yuelong Zhao, Kairi Ou, Wei Song. Research on Cloud Storage Architecture and Key Technologies. Proceedings of the 2nd International Conference on Interaction Sciences, Seoul, Korea, Nov. 2009, 1044–1048.
[5] Ying Zhan, Yong Sun. Cloud Storage Management Technology. Second International Conference on Information and Computing Science, Manchester, England, UK, May 21-May 22, 2009, icic, vol. 1, 309–311.
[6] S. Traboulsi, A. Ortiz, J. Jorda, and A. M’zoughi, “Admon: ViSaGe Administration And Monitoring Service For Storage Virtualization in Grid Environment,” in IEEE International Conference on Information and Communication Technologies: from Theory to Applications (ICTTA), Damascus, Syria, 07/04/08-11/04/08, April 2008, pp. 1–6.
[7] Jon Tate, Richard Moore. Virtualization in a SAN [EB/OL]. http://www.ibm.tom, 2003.
[8] Changsheng Xie, Wei Jin. Research and Design of the Implementation of Network Level Storage Virtualization of SAN. Application Research of Computers, 2004, 21(4): 191–193.
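
As an illustration of the two mapping levels described in Section IV.B, the following Python sketch shows one way the ideas could fit together. It is only a sketch under stated assumptions, not the authors' implementation: the names `LogicalPool`, `VirtualVolume`, `linear_map` and `interleaved_map` are hypothetical, blocks are fixed-size, logical blocks are handed out by a simple counter, and the two mapping functions are one possible reading of the Linear and Interleaved strategies for equally sized logical volumes.

```python
class LogicalPool:
    """Storage pool of logical blocks (result of the first virtualization step)."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.allocated = 0   # number of logical blocks handed out so far
        self.data = {}       # logical block number -> stored payload

    def allocate(self):
        """Hand out the next free logical block, or fail if the pool is full."""
        if self.allocated >= self.total_blocks:
            raise MemoryError("storage pool exhausted")
        block = self.allocated
        self.allocated += 1
        return block


class VirtualVolume:
    """Virtual volume whose mapping table is filled lazily (LV-MTM role)."""

    def __init__(self, virtual_blocks, pool):
        self.virtual_blocks = virtual_blocks
        self.pool = pool
        self.table = {}      # virtual block -> logical block

    def _check(self, vblock):
        if not 0 <= vblock < self.virtual_blocks:
            raise IndexError("virtual block out of range")

    def write(self, vblock, payload):
        self._check(vblock)
        if vblock not in self.table:
            # Real space is allocated only on the first write to a block.
            self.table[vblock] = self.pool.allocate()
        self.pool.data[self.table[vblock]] = payload

    def read(self, vblock):
        self._check(vblock)
        lblock = self.table.get(vblock)
        # A block that was never written has no logical block behind it.
        return None if lblock is None else self.pool.data[lblock]


def linear_map(volume, block, blocks_per_volume):
    """Linear mapping: each logical volume owns one contiguous physical range."""
    return volume * blocks_per_volume + block


def interleaved_map(volume, block, num_volumes):
    """Interleaved mapping: consecutive physical blocks alternate between volumes."""
    return block * num_volumes + volume


# Two volumes advertise 1000 blocks each on top of a pool of only 8 blocks.
pool = LogicalPool(total_blocks=8)
vol_a = VirtualVolume(virtual_blocks=1000, pool=pool)
vol_b = VirtualVolume(virtual_blocks=1000, pool=pool)
vol_a.write(999, b"a")
vol_b.write(0, b"b")
print(pool.allocated)                                  # 2
print(vol_a.read(999), vol_a.read(5))                  # b'a' None
print(linear_map(1, 2, 4), interleaved_map(1, 2, 2))   # 6 5
```

The virtual space advertised to the two volumes (2000 blocks) exceeds the pool (8 blocks), yet only the two blocks actually written consume capacity. This is the behavior the performance analysis in Section IV.C relies on. Once the pool itself is exhausted, `allocate` fails, which is the point at which physical devices would have to be added.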