IBM Global Technology Services
October 2011

Thought Leadership White Paper

A blueprint for smarter storage management
Optimizing the storage environment with automation and analytics
Contents
● Introduction
● The basics of optimal storage management
● Making smarter storage management a reality
● The bottom line: Labor and infrastructure savings
● IBM can help

Introduction

The skyrocketing demand for data storage capacity is a major driver of escalating IT expense. The costs associated with storage growth include not just additional or upgraded devices, but also data center floor space, electricity, HVAC and ongoing system management.

The addition of storage capacity continues to grow exponentially, even though most organizations—due to over-provisioning and poor visibility into their storage environment—use no more than 30 to 40 percent of their available storage. There are several factors contributing to this paradox. The perception is that “storage is cheap,” so storage requests are routinely honored without challenge. Individuals who have the responsibility for storage performance or access to data are sensitive to performance issues and reluctant to risk degradation of service for want of high-tier storage.

Finally, a common contributor to the usage paradox is that, because the provisioning process for storage can be labor intensive and time consuming, application owners and other consumers of storage tend to over-request. The thinking seems to be that they can save lead time and paperwork by requesting more storage less frequently and letting it sit until needed.

Big data: The big challenge

Adding to the storage dilemma is the exponential growth in data that must be stored. Every day, 2.5 quintillion bytes of data are generated. In fact, 90 percent of the data in the world today has been created in the last two years alone. This “big data” is being generated by billions of devices—from sensors used to gather climate information to GPS chips in smart phones—as well as posts to social media sites, digital pictures and videos posted online, Internet text and documents, medical records, and transaction records such as online purchases and cell phone call data.

Big data spans three dimensions: variety, velocity and volume.

Variety: Big data extends beyond structured data to include unstructured data of all varieties: text, audio, video, click streams, log files and more.

Velocity: Often time sensitive, big data must be used as it is streaming into the enterprise in order to maximize its value to the business.

Volume: Big data comes in one size: huge. Enterprises are awash with data, easily amassing terabytes and even petabytes of information.
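One standard response to the volume dimension is data reduction, the compression and de-duplication this paper returns to later. As a rough illustration of the general technique (not of any IBM implementation), block-level de-duplication stores each unique chunk once, and compression shrinks what remains. The 4 KB fixed chunking and SHA-256 hashing below are illustrative choices; production systems often use variable-size chunking.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # illustrative fixed-size chunks


def dedup_and_compress(data: bytes):
    """Split data into chunks; store each unique chunk once, compressed."""
    store = {}    # chunk hash -> compressed chunk bytes
    recipe = []   # ordered list of hashes needed to rebuild the data
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:                    # de-duplication
            store[digest] = zlib.compress(chunk)   # compression
        recipe.append(digest)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original bytes from the chunk store and the recipe."""
    return b"".join(zlib.decompress(store[d]) for d in recipe)


# A workload with heavy duplication, e.g. repeated email attachments
data = b"same attachment " * 10_000
store, recipe = dedup_and_compress(data)
assert restore(store, recipe) == data
physical = sum(len(c) for c in store.values())
print(f"logical {len(data)} bytes -> physical {physical} bytes")
```

On highly redundant data like this, only a handful of unique chunks are physically stored; real-world reduction ratios depend entirely on the workload.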
Figure 1: Smarter systems—instrumented, interconnected, intelligent—are creating a data explosion. The digital universe is projected to grow from 1.8 zettabytes in 2011 to 72 zettabytes by 2015. [Chart “The information explosion meets budget reality,” plotting growth from gigabytes in 2000 toward zettabytes by 2015: storage requirements are growing 20-40 percent per year and information is doubling every 18-24 months, yet storage budgets were up only 1-5 percent in 2010 and storage utilization remains below 50 percent.]

Figure 2: Tiering—an underlying principle of information lifecycle management—is a storage networking method where data is stored on various types of media based on performance, availability and recovery requirements.

● Tier 0: Ultra-high performance; meets QoS for high-end, mission-critical applications. Solid state drives only.
● Tier 1: High performance and/or availability; drives up utilization of high-end storage subsystems while still maintaining performance QoS objectives. For low-capacity requirements, smaller, less powerful devices may meet the tier definition.
● Tier 2: Medium performance and/or availability; revenue-generating applications; meets QoS for non-mission-critical applications. For low-capacity requirements, smaller, less powerful devices may meet the tier definition.
● Tier 3: Low performance and/or availability; non-mission-critical applications.
● Tier 4: Archival, long-term retention; backup.

A word about tiering

Not all data is created equal: there are many types in a typical IT environment, and data’s value fluctuates during its lifecycle. For example, email is highly critical initially, but its value often drops rapidly. Project data files may be less critical at any given moment but remain important longer.

Tiering—an underlying principle of information lifecycle management—is a storage networking method where data is stored on various types of media based on performance, availability and recovery requirements (see Figure 2). In general, newer data and data that must be accessed more frequently is stored on faster but more expensive storage media, while less critical data is stored on less expensive but slower media. Data intended for restoration in the event of data loss or corruption could be stored locally—for fast recovery—while data stored only for regulatory purposes could be archived to lower-cost media. A tiered storage infrastructure can consist of as few as two to as many as five or six tiers.

The basics of optimal storage management

As the demands on IT continue to grow, organizations face significant challenges around storage growth, continued cost pressures and the complexity of available technologies. Best practices for storage optimization begin with the following principles:

● Store only what is needed, and only for as long as it needs to be stored. To accomplish this, many organizations apply data reduction technologies such as data compression and de-duplication, along with demand management processes.
● Get more out of the existing storage infrastructure through virtualization, thin provisioning, consolidation and proper monitoring.
● Move data to the “right” place—and do so on an ongoing basis. Data often loses value over its lifecycle—sometimes quickly—creating opportunities to optimize by moving data from expensive disk to lower-cost disk (see Figure 3). Even active data may have a requirements pattern that changes periodically (for example, the data associated with some quarter-close applications may only have high performance requirements for a few weeks each quarter). The ability to move such data up and down the disk hierarchy as necessary allows for a more optimal storage hierarchy supporting an organization’s data requirements.

Smarter storage management:
● Reduces complexity while preserving storage infrastructure flexibility
● Governs both supply and demand to minimize custom solutions and reactive work, and drives this governance through service automation
● Uses analytics to infuse intelligence into tools and automation for workflow, migration and provisioning to achieve operational efficiency
● Frees up staff to focus on more critical projects

Figure 3: Most companies, due to over-provisioning and poor visibility into their storage environment, have too much data on Tier 1 storage. A more balanced distribution across all tiers (roughly 1-5 percent on Tier 0, 15-20 percent on Tier 1, 20-25 percent on Tier 2 and 50-60 percent on Tier 3) can improve application performance and data availability and help lower the storage portion of the IT budget.

Using these principles as a foundation for storage management practices, organizations can take a smarter approach to storage management that addresses exploding storage growth and costs. What follows is a discussion of IBM’s actionable approach to smarter storage management.

Making smarter storage management a reality

IBM’s smarter approach to storage management includes a set of tools, services, and software and hardware technology to help clients realize cost-saving opportunities in storage. By pulling together virtualization, a storage service catalog driven by business policy, workflow automation and IBM Research-developed solutions, IBM can help clients optimize the storage environment.

IBM’s approach comprises three essential elements for making smarter storage management a reality within a storage environment.

Essential element #1: Create a responsive infrastructure across multiple vendor storage assets and tiers of storage. A responsive infrastructure reduces complexity and lowers the overall costs of storage while preserving flexibility. Storage virtualization improves the utilization and efficiency of the storage hardware resources.
Optimizing the environment through virtualization generally results in fewer storage components that need to be managed, making it easier to monitor and protect critical data. A virtualization effort in the storage environment can decrease complexity and free up resources, as well as reduce costs.

In a virtualized storage environment, multiple storage devices function as a single “virtual” storage unit, making tasks such as provisioning or placement, application migration, tier migration, replication and archiving easier and faster. Storage assets that previously had no common interface can be used interchangeably. Among other things, that interface enables data to be moved to less expensive tiers of heterogeneous storage in the virtualized storage environment as the data ages, without interruption to the business.

Getting more utilization from existing storage assets is one of the best ways to manage rapid information growth and big data more effectively. Improving the utilization of existing storage is preferable to adding physical storage devices because, in addition to saving the hardware costs, this approach doesn’t require more floor space and the associated data center and energy costs. IBM storage virtualization can improve utilization up to 30 percent across both IBM and non-IBM storage, while improving administrator productivity.

IBM’s approach to virtualization is:

● Vendor neutral: It provides integration and a single point of control for more than 120 multi-vendor storage systems.
● Reliable: It uses standards and repeatable processes with the latest IBM virtualization technology and products.
● Proven to deliver cost savings: In one example, a virtualization effort for an IBM client reduced annual energy use by 3,565,320 kilowatt-hours; recovered 1,700 square feet of floor space; produced annual energy savings of US$320,878; and reduced annual maintenance costs by 57 percent.

Over the next decade:
● The number of servers (virtual and physical) worldwide will grow by a factor of 10.
● The amount of information managed by enterprise data centers will grow by a factor of 50.
● The number of files the data center will have to deal with will grow by a factor of 75 or more.
Meanwhile, the number of IT professionals in the world will grow by a factor of less than 1.5.¹

Essential element #2: Standardize storage usage and process holistically for all data. Improving the way storage is used begins with changing the behavior around how it is requested. Standardization minimizes reactive work and custom solutions and lays the foundation for workflow automation. Standardization, implemented through a storage service catalog, ensures a set of standards that reduces manual intervention and decision making (which often results in data being placed on a higher tier of storage than needed). Standardization includes policies for correct size, initial class of service and management over time. It also addresses the storage request process, streamlining it and continuously driving these standards and policies into the request and provisioning process.

IBM’s patent-pending intelligent storage service catalog (ISSC) promotes more efficient storage allocation and governance (“supply and demand”) by establishing standards for storage consumption that can be used to optimize provisioning, backup, replication and archiving.

Simplifying the request by asking the right business-oriented questions up front in the process, and using established standards for the balance, helps drive more cost-effective data management over time. End users are no longer asked to specify the many parameters associated with each storage request: array, disk size and type, RAID configuration, stroked drives, and so on.
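The cost of placing data on a higher tier than needed, and the payoff from the rebalanced pyramid in Figure 3, can be quantified with a back-of-the-envelope model. The per-gigabyte prices and the “prior” distribution below are hypothetical round numbers invented for illustration; only the target distribution approximates the Figure 3 ranges (Tier 0 is omitted for simplicity).

```python
# Hypothetical per-GB monthly costs by tier (round numbers, illustration only)
COST_PER_GB = {"tier1": 5.0, "tier2": 2.0, "tier3": 0.5}

# An assumed Tier 1-heavy "prior" pyramid versus a target distribution
# approximating the Figure 3 ranges
prior = {"tier1": 0.70, "tier2": 0.20, "tier3": 0.10}
target = {"tier1": 0.20, "tier2": 0.25, "tier3": 0.55}


def monthly_cost(total_gb, distribution):
    """Cost of an estate of total_gb spread across tiers per `distribution`."""
    return sum(total_gb * share * COST_PER_GB[tier]
               for tier, share in distribution.items())


total_gb = 100_000  # a 100 TB estate
before = monthly_cost(total_gb, prior)
after = monthly_cost(total_gb, target)
print(f"before: ${before:,.0f}/month, after: ${after:,.0f}/month, "
      f"saving: {1 - after / before:.0%}")
```

Even with made-up prices, the structure of the saving is clear: most of the estate moves from the most expensive tier to the cheapest one, so the monthly bill falls by roughly half in this sketch.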
Instead of asking the storage user to define the specific requirements, the questions are structured to drive the conversation toward purpose: What are you trying to accomplish? What kind of data are you creating? This is easy for the requesters, because they know whether they are creating, for instance, emails, video files, images, user documents, transactional databases or development code. They require storage for the purpose of housing data. That’s important, because each client has unique business requirements for each type of data, but all of those data types are relatively standard across any business.

By defining and codifying business value and requirements for data once, then mapping these requirements to infrastructure, data types can be used over and over to properly request storage and manage storage demand.

The intelligent storage service catalog:
● Optimizes and simplifies storage requests to help reduce over-provisioning and the need for highly skilled personnel
● Enables more consistent storage processes and governance by defining standards and policies
● Offers policy-based storage management to enable and enhance information lifecycle management
● Lays a foundation for automation

The concept of defining policies and parameters once, then reusing them for all projects instead of defining them at the beginning of every request for service, is at the core of IBM’s smarter storage management. Utilization of existing storage has been shown to increase as much as 50 percent with correct sizing and placement. In addition, provisioning is more efficient and delivers requested storage in less time than traditional processes, thus reducing or eliminating the tendency to over-provision.

Because adding storage is perceived to be relatively affordable as part of an overall IT budget, the tendency of vendors has been to propose Tier 1 solutions even when they are not specifically necessary. In these cases, migration to lower tiers (rebalancing) is sometimes difficult. Use of a storage service catalog helps reduce or eliminate over-provisioning at the Tier 1 level by using preset standards that permit the use of less expensive tiers during the request stage.

Figure 4 illustrates this approach as used in a smarter storage environment. During an ISSC engagement, data is categorized into specific types and dropped into the correct “bucket.” Policies and other criteria are used to size and place the data at the correct tier. Later in the lifecycle of the data, it might be archived, saved to tape or deleted from storage entirely as its declining importance suggests.

Figure 4: Predetermined policies for the management of data are key to the ILM concept and can increase storage utilization up to 50 percent. [Diagram: policies, analytics and automation are used to size and place data onto the correct tiers (Tier 1, Tier 2, Tier 3, Archive, Tape) at the time of the request, and then to move data as it changes value.]
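A purpose-driven request of this kind reduces, in effect, to a lookup against predefined policies: the requester supplies only a data type and a quantity, and everything else comes from the catalog. The sketch below invents a handful of data types and policy values purely for illustration; an actual ISSC engagement derives them from the client’s business requirements.

```python
# Hypothetical service-catalog entries: data type -> preset placement policy.
# Types and values are invented for illustration, not taken from the ISSC.
CATALOG = {
    "email":            {"tier": 2, "backup": "daily",      "retain_days": 365},
    "video":            {"tier": 3, "backup": "none",       "retain_days": 90},
    "user_documents":   {"tier": 2, "backup": "daily",      "retain_days": 730},
    "transactional_db": {"tier": 1, "backup": "continuous", "retain_days": 2555},
    "dev_code":         {"tier": 3, "backup": "weekly",     "retain_days": 180},
}


def request_storage(data_type: str, quantity_gb: int) -> dict:
    """Type and quantity are the only user inputs; tier, backup and
    retention come from the predefined policy ("define once, execute
    repeatedly")."""
    policy = CATALOG[data_type]
    return {"data_type": data_type, "size_gb": quantity_gb, **policy}


order = request_storage("email", 500)
print(order)
```

Because the tier is chosen by policy rather than by the requester, nothing lands on Tier 1 unless its data type justifies it, which is exactly the over-provisioning control described above.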
Developing the intelligent storage service catalog

To develop the intelligent storage service catalog, IBM works with the client’s storage managers, architects and subject matter experts to:

● Replace manual allocation decisions with standardized policies. The IBM team develops and enables the use of predefined requirements and architectures with respect to storage demand and data management. The goal is to make it possible for the customer to “define once, execute repeatedly.”
● Break down and evaluate the value of the data types to the business. A representative set of applications is broken down into holistic data types. The team captures business requirements or key performance indicators for these data types. Looking at applications holistically through common types of data, and formally defining requirements once, are unique to the IBM approach.
● Define a matching catalog of services and technologies. The client works with the IBM team to design business request logic to change the way storage is requested, making it purpose driven.
● Simplify user requests so that the type and quantity of data are the only input provided by the requester. This framework allows the IBM team to standardize storage provisioning and embed a business valuation of data into data management from inception until disposal.

Essential element #3: Intelligently automate data movement and decision making across the data center. Automation, with respect to storage management, can occur in several ways. A data type-based request results in streamlined and purpose-driven storage request workflow automation. Intelligent automation assesses the existing environment to ensure the storage environment grows in a workload-optimized manner. This allows a business to achieve operational efficiency and to reduce labor cost and risk. IBM is focused on all of these aspects of automation.

For intelligent automation, IBM’s capabilities leverage IBM Research-driven analytics for policy creation and automation. Beginning with the data types defined in the catalog, user input is simplified. At this point, the IBM team has determined the correct size and tier on which to place the data.

In the next step—and this is another area in which IBM capabilities differ from other storage management models—the intelligent storage placement manager (ISPM) uses historical and current performance data to automatically configure and provision on the most cost- or performance-effective storage devices. ISPM initially analyzes the disk pools of a respective tier, then intelligently and automatically provisions storage on the most effective storage devices. ISPM can provision, de-provision, create volumes and virtual disks, and do host mapping.

Another IBM Research-developed technology, the intelligent storage tier manager (ISTM), recommends the best migration targets and windows over time as a function of workload and access requirements. ISTM analyzes and balances performance and cost throughout the life of the data. For instance, highly accessed data is “cached” on high performance storage when needed, and rarely accessed or old data is archived on cheaper storage as it loses value to the business.

These patent-pending analytic tools create an environment that is responsive to the business value of the data. Data placement is workload optimized not only when the data is new, but across its lifecycle according to the business rules associated with the data.
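The placement and tiering behavior described for ISPM and ISTM can be caricatured in a few lines: pick the cheapest pool on the required tier that has room and acceptable load, and periodically flag data whose access pattern no longer justifies its tier. The pool attributes and thresholds below are invented for illustration; the actual tools use far richer analytics.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    tier: int
    free_gb: int
    utilization: float   # current I/O load, 0.0 - 1.0
    cost_per_gb: float


def place(pools, tier, size_gb, max_load=0.8):
    """ISPM-style sketch: among pools on the requested tier with enough
    space and acceptable load, pick the cheapest."""
    candidates = [p for p in pools
                  if p.tier == tier and p.free_gb >= size_gb
                  and p.utilization < max_load]
    return min(candidates, key=lambda p: p.cost_per_gb) if candidates else None


def recommend_demotion(datasets, idle_days=90):
    """ISTM-style sketch: data not accessed recently is a candidate for
    migration to a cheaper tier."""
    return [d for d in datasets
            if d["days_since_access"] > idle_days and d["tier"] < 3]


pools = [
    Pool("arrayA", tier=1, free_gb=500,  utilization=0.9, cost_per_gb=5.0),
    Pool("arrayB", tier=1, free_gb=2000, utilization=0.4, cost_per_gb=4.0),
]
print(place(pools, tier=1, size_gb=300).name)   # arrayA is over the load cap

datasets = [{"name": "q1-close", "tier": 1, "days_since_access": 200},
            {"name": "orders",   "tier": 1, "days_since_access": 2}]
print([d["name"] for d in recommend_demotion(datasets)])
```

The point of the sketch is the division of labor: placement is decided once at request time from current pool state, while demotion is a recurring policy sweep, matching the request-time and lifecycle halves of Figure 4.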
Figure 5 summarizes the approach:

● Create a responsive, business-oriented infrastructure. Goal: lower the overall cost of storage while preserving flexibility. How: implement a virtualized multi-tier infrastructure; deploy thin provisioning and de-duplication.
● Standardize storage usage and process. Goal: minimize reactive work and custom solutions while freeing up highly skilled resources. How: standardize the storage request process to reduce planning and delivery time; standardize and operationalize provisioning by data type; correctly size and place data (on the right tier) from the start.
● Automate data movement and decision making. Goal: achieve operational efficiency and reduce labor cost and risk through intelligent automation. How: automate storage request workflow; automate storage provisioning and workload analysis; automate tier movement within a storage array; automate policy-driven tier movement across arrays.

Figure 5: The first two steps (creating a responsive infrastructure and standardizing storage usage and process) can happen either simultaneously or serially, as dictated by the needs of the organization. Together they lay the foundation for intelligent automation.

The bottom line: Labor and infrastructure savings

The implementation costs for a storage project vary widely depending on the amount of automation involved at the coordination and execution layers. In IBM’s own experience with both internal projects and client deployments, standardization alone can reduce the effort by 50 percent. Automation can reduce execution hours by as much as 90 percent.

In a traditional storage implementation scenario, the storage architects and the requestors (typically application architects) meet several times to capture all of the business and storage requirements for a given project. By contrast, under smarter storage management, storage requirements, services and technologies are predefined, allowing the application owners to simply select data types from the catalog; the rest is driven to standardized solutions downstream.
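If the two reductions compounded (an assumption; the paper quotes the 50 percent and 90 percent figures separately), a hypothetical 40-hour provisioning effort would shrink as follows:

```python
# Hypothetical baseline: 40 hours of coordination and execution per project
baseline_hours = 40

# "Standardization alone can reduce the effort by 50 percent"
after_standardization = baseline_hours * 0.5

# "Automation can reduce execution hours by as much as 90 percent",
# i.e. 10 percent of the standardized effort remains
after_automation = after_standardization * 0.10

print(after_standardization, "hours after standardization;",
      after_automation, "hours after automation")
```

Under these assumptions the combined effect is a 95 percent reduction, which is consistent in scale with the pilot described below, where a two-to-three-day process shrank to two to three hours.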
In a recently conducted two-stage pilot for a client, IBM used research-developed tools to automatically rebalance five terabytes of data based on administrator policies. The result: a two-to-three-day process was reduced to two to three hours. In stage two, IBM automatically moved 57 terabytes of data overnight, with no failures. This right-tiering initiative currently produces US$21,000 per month in savings by using lower-tiered storage instead of higher tiers. Based on this client’s enterprise total storage volume of 600 terabytes, the savings could be extended to US$2.6 million a year at full implementation.

IBM can help

IBM has a long history in information management. Today, IBM continues to be a leader in information lifecycle management, offering comprehensive solutions that drive business results and encompass complementary hardware, software and services. IBM storage systems, with enterprise-class disk and tape storage tiers, offer best-in-class virtualization and drive increased return on investment. IBM Research is developing significant new ILM tools that will provide IBM an end-to-end approach—hardware, software and services solutions—unequaled in the marketplace. Additionally, IBM has patent-pending ILM accelerators that deliver proven, repeatable techniques for optimizing storage and information management environments. Finally, IBM software has the capacity to enable end-to-end management featuring advanced virtualization, orchestration, automation and robust information management capabilities.

In 2009, the National Football League (NFL), the largest professional American football league in the world, approached IBM to help reduce its overall IT infrastructure expense while enhancing storage capabilities.

The IBM storage and data services team provided the intelligent storage service catalog (ISSC) solution. The IBM team performed a business impact analysis, built the request logic behind the process and then helped to deliver a storage catalog framework that streamlines the request process from NFL departments and teams. Once integrated into a service request tool, the ISSC can save the IT organization many hours when responding to storage requests, and will also enable the IT department to recover its storage costs through an automated charge-back system. The standard practices implemented by the IBM team can be adhered to regardless of the storage vendor.

The storage catalog service and the cost-recovery system, designed by IBM, will allow the league’s management team to better understand and monitor the volume and worth of the storage services provided to the league’s departments and customers. The cost-recovery tool quantifies the expense of the storage services provided by the NFL’s IT department, allocating those expenses back to the groups that use storage. The league’s catalog application will be integrated into a service request management tool and is being used as a model for additional cost-recovery initiatives.

As an added benefit, IBM elevated the role of IT within the league by providing tangible benefits, cost savings opportunities and enhanced user services, as well as the charge-back system for storage usage.
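A charge-back system of the kind described for the NFL amounts to metering usage by tier and multiplying by a rate card. A minimal sketch, with invented departments and rates (not the NFL’s actual figures):

```python
# Hypothetical charge-back: allocate the IT department's storage expense
# back to consuming groups in proportion to usage, priced by tier.
RATES = {1: 0.50, 2: 0.20, 3: 0.05}   # $ per GB per month, illustrative

usage = {   # department -> {tier: GB consumed}
    "marketing":  {1: 200, 3: 5_000},
    "operations": {1: 1_000, 2: 3_000},
}


def monthly_chargeback(usage):
    """Each group's bill is the sum over tiers of (rate x GB consumed)."""
    return {dept: round(sum(RATES[t] * gb for t, gb in by_tier.items()), 2)
            for dept, by_tier in usage.items()}


print(monthly_chargeback(usage))
# marketing: 200*0.50 + 5000*0.05; operations: 1000*0.50 + 3000*0.20
```

Beyond recovering costs, tier-based pricing gives consumers a direct incentive to request the cheaper tiers, reinforcing the catalog’s preset standards.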
Sprint—owner and operator of two wireless communications networks and an Internet backbone—wanted to develop an information lifecycle framework that included categorization and retention capabilities. In doing so, Sprint needed a methodology and roadmap for information storage and deployment: in short, a system that would help it meet regulatory, legal and business standards of information retention. Information accessibility and availability were important, as were the framework’s alignment with overall business operations and strategic goals.

Sprint’s existing hardware and software technologies were insufficient for the job. Outdated hardware limited the company’s ability to store data efficiently, and the software in use proved inadequate to manage the storage.

Sprint partnered with IBM to develop a solution. The IBM team leveraged best-of-breed practices and methodologies available through IBM Information Lifecycle Management (ILM) services. In doing so, the consulting capabilities of IBM’s Integrated ILM Services and Information Lifecycle Management—Archiving and Retention Services proved valuable.

IBM helped Sprint identify potential storage efficiency improvements, proposing a storage architecture built around a set of service classes and storage tiers to be enabled by recommended key technologies and tools. IBM consultants developed a data and information lifecycle framework that includes the categorization of information and the time period for retention. IBM also built a Data-Information-Functionality-Usability matrix for network data and information; architected a high-level storage infrastructure; provided a methodology and roadmap for information retention and data lifecycle management; and generated a business case that highlighted the potential financial impact of implementing IBM’s recommendations.

Partnering with IBM, Sprint has begun revamping its information lifecycle architecture to meet regulatory standards and business goals. In doing so, the company hopes to realize an ROI of up to 117 percent.
IBM is of course experiencing the same big data challenges as its customers, with similar increased demands on existing storage infrastructure. Several initiatives are currently in progress to bring costs into better alignment with the total IT spend. First, an archiving initiative was undertaken to address Tier 1 storage growth of 30 percent per year. A highly scalable and reliable file system service was developed to enable archiving to the lowest tier, resulting in an estimated savings of US$2.1 million a year. Second, consolidation of backup and block storage has helped reduce costs, increase utilization and cut provisioning time from months to days, for an estimated savings of US$50 million a year. Future initiatives include enablement of self-service provisioning for the storage cloud and automated policy-based tiering for email and other data.

For more information

To learn more about how IBM can help you derive maximum business value through storage optimization, please contact your IBM marketing representative or IBM Business Partner, or visit the following website: ibm.com/services