Aberdeen Report: The Most Important Storage Tool for Managing Big Data


In this January 2012 report, Aberdeen surveyed the IT community to see how fast their data is growing and how they are dealing with the increasing demand for capacity. From the 106 respondents, Aberdeen identified one key storage technology that large enterprises use to manage Big Data.

February, 2012

The Most Important Storage Tool for Managing Big Data

One of the key IT challenges large enterprises are facing today is how to manage and store greater amounts of data without dramatically increasing their IT spending. In January 2012 Aberdeen surveyed 30 large enterprises spanning various sectors (see sidebar) to hear how fast their data is growing and how they are dealing with it. From the 106 respondents, Aberdeen identified one key storage technology that large enterprises are using to manage Big Data effectively.

Analyst Insight
Aberdeen's Insights provide the analyst perspective of the research as drawn from an aggregated view of the research surveys, interviews, and data analysis.

This document is the result of primary research performed by Aberdeen Group. Aberdeen Group's methodologies provide for objective fact-based research and represent the best analysis available at the time of publication. Unless otherwise noted, the entire contents of this publication are copyrighted by Aberdeen Group, Inc. and may not be reproduced, distributed, archived, or transmitted in any form or by any means without prior written consent by Aberdeen Group, Inc.

© 2012 Aberdeen Group. Telephone: 617 854 5200. www.aberdeen.com. Fax: 617 723 7897

Big Data Trends

Aberdeen asked survey respondents to report on their most pressing IT issues. Fifty-eight percent (58%) of respondents reported that the growing demand for storage capacity is one of their top three IT job pressures, making it by far the most widely felt concern among professionals from companies of all sizes.

Survey Respondents
Individuals answering this survey came from diverse geographies, industries and corporate roles:
Industries:
  √ Government – 19%
  √ IT Services – 15%
  √ Education – 9%
  √ Finance – 7%
  √ Health – 7%
  √ Industrial – 6%
  √ Utilities – 6%
  √ Telecomm – 4%
  √ Others – 27%
Roles:
  √ Director and above – 51%
  √ Managers – 31%
Geography:
  √ North America – 80%
  √ EMEA – 11%
  √ Rest of World – 8%

Figure 1: Storage Capacity – A Top IT Pressure
(Bar chart, % of companies surveyed, n = 106)
  Meeting the increasing demands for storage    58%
  Reduced budgets                               39%
  Managing outages                              25%
Source: Aberdeen Group, January 2012

The pressure of increasing storage demands was cited 38% more often than the pressure of dealing with reducing IT budgets, a seeming constant in this world of doing more with less.

Aberdeen asked how quickly the demand for storage is rising. On average, the storage capacity required by organizations grew 35% from 2010 to 2011. In addition, 8% of all respondents reported that their data grew between 90% and 140% per year, and another 6% claimed their data grew over 150% per year.

Company Size Defined
For the purposes of this document, Aberdeen defines company size as the following:
  √ Small (less than 100 employees)
  √ Mid-sized (between 100 and 1000 employees)
  √ Large (greater than 1000 employees)

Table 1: Data Growth Rates
                            All Respondents   Small   Mid-Sized   Large
  2010 to 2011 Growth Rate        35%          29%       35%       44%
Source: Aberdeen Group, January 2012

The growth of storage capacity is especially troubling for large enterprises, which reported an average growth rate of 44%. This growth rate means that data capacity needs to double every 1.5 years. Unless organizations get smarter and more efficient, they will almost annually need twice the number of storage devices, twice the space in the data center, and twice the personnel time and effort to manage their business data.

Big Data Checklist
Big Data is a term used to describe the vast pool of data organizations are required to manage, mine, and store. Three aspects need to be considered when managing Big Data:
  √ First there is the total size of the data stored. Today it is not unusual to hear of companies managing petabytes (PB) or exabytes (EB) of storage (a petabyte is 1000 terabytes, and an exabyte is 1000 petabytes).
  √ The second factor to consider is how fast the data changes. One PB of static, slowly changing data can be managed differently than a similar volume of data that changes every month, every week, or every day.
  √ Finally, organizations must understand the different data types they handle. Video, spreadsheets, formatted databases, and fully unstructured data require different tools to optimize storage.

Aberdeen asked IT professionals about the challenges they feel in deploying new storage capacity.

Figure 2: Challenges of Deploying Storage
(Bar chart, percentage of survey respondents, n = 30)
  Deploying Storage is too Expensive        41%
  Understanding the Scope and Complexity    34%
  Anticipated Cost of Solution              28%
  Deploying Storage is too Complex          28%
Source: Aberdeen Group, January 2012

Challenges are seen across two dimensions: expense and complexity. The rate of data growth exacerbates both of these issues:
  • Complexity. Storage deployments are very complex, especially when compared to deploying other computing devices such as servers. Features such as RAID sets, LUNs, data tiering and thin provisioning must be configured. Several of these features are very difficult or impossible to change if not set up correctly in the initial deployment.
  • Expense. Enterprise storage arrays can be very expensive, particularly those that manage TBs (terabytes) of storage. IT budgeting must be done in a step function, meaning that when an organization's existing storage array is full, purchasing the next MB of capacity requires buying a whole new frame costing tens of thousands of dollars. Disks are purchased as volumes grow, but frames and reserve capacity must be planned over time and deployed in anticipation of data growth.

For more information about Big Data and how to manage it, see the Aberdeen report Big Data, Big Moves (August 2011).

Scale-Out NAS

In the January survey, Aberdeen asked the 106 organizations about their storage deployments, and specifically what storage features and tools they use and what operational benefits they received. One storage tool was chosen more widely than any other for managing storage pools of 100TB and above. This storage tool is scale-out Network Attached Storage (NAS).

Rather than having a single array with its own unique operating system, scale-out NAS is a clustered storage system that allows for connecting many storage devices together into a single virtual system. Scale-out NAS is designed for managing unstructured data (information not stored in traditional rows and columns of a formal database) including video, audio, digital images, computer models, PDF files, scanned information, and test and simulation data.

"Verify that the recommended solution will meet requirements for future needs without a forklift upgrade."
~ IT Manager, Large Oil and Gas Company, US

Table 2: Scale-Out NAS Statistics
  Scale-out NAS Statistics
  Percentage of large enterprises using scale-out NAS     41%
      Average number of nodes                             6.25
      Percentage of users with over 15 nodes              21%
  Average amount of data in NAS array                     260TB
      Percentage with over 100TB                          46%
      Percentage with over 1PB                            28%
Source: Aberdeen Group, January 2012

A typical scale-out NAS solution consists of three or more nodes. Each node is a self-contained, rack-mountable device with industry standard disk drives, processors, memory, and networking, and is unified with a single version of an operating system. This operating system allows the multiple modules to be managed as a single unit and gives all nodes access to each other's data.

As shown in Table 2, 41% of the surveyed large enterprises (companies with over 1000 employees) currently use scale-out NAS storage in their infrastructure. The average scale of the NAS deployment is 6.25 nodes and contains 260 TB (terabytes) of data. Twenty-one percent (21%) of scale-out NAS users reported they had 15 or more nodes, a configuration capable of supporting storage volumes over 1 PB (1 petabyte = 1000 TB).

"After you project storage growth, double that estimate."
~ CIO, Large Government Agency, USA

We then compared the storage capabilities between those organizations with scale-out NAS and those without. We found that those who use this technology benefit in the area of managing Big Data.

Table 3: Scale-Out NAS Statistics
  Metrics                                                    Users of          Non-Users of
                                                             Scale-out NAS     Scale-out NAS
  What percentage did your data grow between 2010 and 2011?      52%               32%
  Percentage of respondents with data growth rates over 90%      27%                9%
  Average total amount of data managed                           214TB             50TB
  Percentage change in IT spending 2010 to 2011                  -2.2%             +0.4%
Source: Aberdeen Group, January 2012

Users of scale-out NAS manage very large data stores that are growing very quickly. Scale-out NAS users are managing, on average, databases 4 times the size of non-users' and growing 60% faster. Even with these very high growth rates, scale-out NAS users have found a way to keep the increasing demand for storage capacity from greatly affecting their overall IT budget, reporting on average that they have been able to slightly reduce their spending between 2010 and 2011.

End User Impact

One very important view of the success of a data storage program is to see how it impacts the end user. The end users of the surveyed companies most likely have no idea of what storage technologies their IT department is using to manage their data. Instead they see factors such as application performance and application downtime as indicators of a successful IT operation.
Table 4: End User Satisfaction
  Metrics                                                    Users of          Non-Users of
                                                             Scale-out NAS     Scale-out NAS
  How has application downtime changed in the last               -4.2%             -2.8%
  12 months?
  On a scale of 1 to 10 (10 being the best), how happy           7.5               6.6
  are your end users with IT?
  Percentage of end users rating IT performance 7 or higher      80%               56%
Source: Aberdeen Group, January 2012

Aberdeen asked the surveyed IT organizations how their application downtime has changed over the last 12 months, and scale-out NAS users reported a 50% greater improvement in the reduction of application downtime than non-users. While this reduction in application downtime cannot be entirely attributed to the use of scale-out NAS, it does have some influence, as loss of data or data access is one of the major causes of application failure.

Aberdeen asked the survey respondents how they believe their end users would rate their overall IT performance. This self-assessment may not match the actual ratings they would receive from the end users themselves, but IT employees generally have an understanding of how they are perceived within their organization.

IT organizations using scale-out NAS believe they are able to satisfy their end users and achieve a 14% higher rating than non-users. Eighty percent (80%) of scale-out NAS IT departments believe their end users would rate them as 7 (good, very good or excellent) or above vs. just 56% of non-users.

Key Takeaways

The growth of data storage requirements is causing stress in the IT world. Growth rates of 90% to 100% are not uncommon and require IT professionals to seek the most efficient ways to deal with data doubling every year or year and a half.

Organizations that manage large amounts of fast-growing data have widely deployed scale-out NAS arrays to help them manage the expense and complexity of big data environments:
  • Scale-out NAS devices are being used by large organizations with the largest data storage requirements and the fastest annual data growth rates.
  • Scale-out NAS appears to be allowing their IT professionals to better manage the data and ensure that storage issues are less likely to impact their end users.
  • Users of this technology are also successfully managing the potential negative impact on their budgets of having to procure larger and larger amounts of storage capacity. Scale-out NAS users are able to hold their budgets flat even though they, on average, have to double their existing storage capacity every year.

Clearly these leaders are meeting the challenges posed by managing big data, and scale-out NAS is an important part of their arsenal.

For more information on this or other research topics, please visit www.aberdeen.com.
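
The growth and comparison figures quoted throughout this report can be checked with a few lines of arithmetic. The sketch below is not part of Aberdeen's methodology. It assumes growth compounds at a constant annual rate, and the function names doubling_time_years and relative_increase are hypothetical helpers introduced here for illustration. Under that assumption, 44% annual growth doubles capacity in roughly 1.9 years and 90% growth in roughly 1.1 years, so the report's "year and a half" figures are best read as rounded.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years for capacity to double at a constant compound annual growth rate.

    annual_growth is a fraction, e.g. 0.44 for 44% per year.
    Assumption (not from the report): growth compounds once per year.
    """
    if annual_growth <= 0:
        raise ValueError("annual_growth must be positive")
    return math.log(2) / math.log(1 + annual_growth)


def relative_increase(value: float, baseline: float) -> float:
    """Fraction by which value exceeds baseline (0.5 means 50% more)."""
    return value / baseline - 1


if __name__ == "__main__":
    # Growth rates quoted in the report's tables and takeaways.
    for label, rate in [
        ("All respondents (35%)", 0.35),
        ("Large enterprises (44%)", 0.44),
        ("Scale-out NAS users (52%)", 0.52),
        ("High-growth respondents (90%)", 0.90),
    ]:
        years = doubling_time_years(rate)
        print(f"{label}: doubles in about {years:.1f} years")

    # Comparisons between scale-out NAS users and non-users.
    print(f"Data managed: {214 / 50:.1f}x the non-user average")
    print(f"Data growth: {relative_increase(52, 32):.0%} faster")
    print(f"Downtime reduction: {relative_increase(4.2, 2.8):.0%} greater")
    print(f"End-user rating: {relative_increase(7.5, 6.6):.0%} higher")
```

The comparison helper works on the published averages only, so it reproduces the report's rounded statements (about 4 times the data, about 60% faster growth, 50% greater downtime improvement, about 14% higher rating) rather than anything about individual respondents.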