Good morning everyone. Thank you for that nice introduction. My name is Tony Pearson and I will be presenting future trends in storage.
This is my agenda, and it is quite simple. With only 40 minutes, I decided to focus on just three specific trends in the storage marketplace. First, there is a shift in the varying roles of different storage types. Rising energy costs, economics and performance are forcing everyone to re-evaluate how we use Solid-State Drives, how we use spinning disk, and how we use physical tape. The second trend is a convergence of networks. Improvements in communications technology and bandwidth are allowing us to merge different LANs and SANs into a single “Data Center Network”, which will reduce costs and simplify the infrastructure. Third, cloud computing is driving new levels of standardization, automation and management in a manner that will also change how internal IT departments operate their own equipment. Source: Purple Ethernet Cables: http://americangadgetgeeks.blogspot.com/2009/12/be-careful-how-you-handle-ethernet.html
Let’s look at the way energy is typically used. Only about 40 percent of the total energy consumed by a data center goes to the actual IT equipment itself: your storage, your servers and your network switches. The rest goes to power and cooling to keep the IT equipment comfortable and happy at the right temperature and humidity. This includes air conditioners, water chillers, PDUs, UPSes, et cetera. So when you add an extra kilowatt of IT equipment, you need another 1.5 kilowatts for power and cooling, for a total of 2.5 kilowatts more for the data center. This multiplying factor is measured as the Power Usage Effectiveness, or PUE. Storage is a major portion of this IT load, around 37 percent, and the fastest growing component, so any savings we achieve in storage will have a multiplying effect as well. The PUE works in both directions: if we reduce the energy consumption of storage by 1 kilowatt, we reduce total data center consumption by 2.5 kilowatts.
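The multiplier arithmetic above can be sketched as a quick calculation, using the 2.5 PUE figure from the talk:

```python
# Toy model of the PUE multiplier effect described above.
# PUE = total facility power / IT equipment power.
PUE = 2.5  # typical data center per the talk (only ~40% of energy reaches IT gear)

def total_facility_kw(it_load_kw, pue=PUE):
    """Total data-center draw for a given IT load, in kW."""
    return it_load_kw * pue

# Adding 1 kW of IT equipment costs 2.5 kW overall...
print(total_facility_kw(1.0))  # 2.5

# ...and the effect works in both directions: saving 1 kW of
# storage saves 2.5 kW of total data-center consumption.
saved = total_facility_kw(10.0) - total_facility_kw(9.0)
print(saved)  # 2.5
```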
So here is the storage hierarchy. I’ve grouped them into four levels, using the analogy of the food pyramid, because we all want our data centers on a healthy and balanced diet. At the top is our most expensive storage type, which includes DRAM cache, Solid-State Drives, and Phase-Change Memory. Next, we have FC and SAS disks; these are high-speed 15K RPM drives, typically connected through a Storage Area Network. The third level is less-expensive SATA disk, running at only 7200 RPM, typically found in Virtual Tape Libraries, NAS and iSCSI devices. The fourth is the least expensive: physical tape. Most production workloads today run on high-speed FC drives. Enterprise-class disk arrays based on FC drives consume as much as 435 Watts/TB. Compare that to only 120 Watts/TB for Solid-State, as low as 40 Watts/TB for SATA, and less than 2 Watts/TB for tape. The obvious answer to reducing the energy footprint is to move away from FC disks and over to these alternative choices. Let’s start with Solid-State. Source: http://www.cesa8.k12.wi.us/teares/math/it/summer2001/Pizza/pyramid.gif
I am glad to have found a graphic that accurately reflects the current hype around Solid-State Drives. I borrowed this one from a presentation by Jon Toigo, a storage consultant in the US. If you look closely, you will see a USB stick being passed between hands. In the upper right corner, we have ambitious manufacturers trying to convince the rest of the industry that Solid-State Drive technology is the only storage type you will need in the future. It is permanent, non-volatile storage that is faster than disk, but can also be lightweight and portable enough to replace tape cartridges. The lower left corner represents a skeptical marketplace. Many companies are struggling to figure out how to use Solid-State in their environments, and whether the benefits outweigh the costs. Solid-State can be quite expensive.
Let’s start with the benefits. If we look at the number of I/Os per second that can be processed per Watt of energy, Solid-State Drives do very well: 20,000 IOPS per Watt compared to only 70 IOPS per Watt for spinning disk. This is what Solid-State is best used for: random workloads that are read-intensive. How did enterprises get anything done before Solid-State Drives? Well, we heavily used DRAM cache with sophisticated caching algorithms, we short-stroked the drives to reduce seek time, and we striped the data across many spindles to improve parallel throughput. All of these, unfortunately, consumed more energy per usable TB.
Solid-State Drives have other valuable characteristics. They have no moving parts, and have proven to be more reliable, with an Annual Failure Rate of only 1 percent, compared to 3 to 8 percent for spinning disk. They do consume fewer Watts per drive, as shown in the top graph. The red represents a 15K RPM FC disk, orange is slower SATA disk, purple is a 73GB Solid-State Drive, and blue is a 16GB Solid-State Drive. The lower graph paints a different picture, however. When you double the capacity of a spinning disk, you cut the Watts per TB in half, because the single motor consumes the same number of Watts regardless of capacity. By contrast, if you double the capacity of a Solid-State Drive, you consume twice as much energy, roughly 120 Watts per TB. The initial winning play was to look for places where individual drives could be replaced. Replacing the only drive in a laptop or blade server would not only reduce energy consumption, but also reduce outages, because Solid-State Drives fail less often. A full server rack can save as much as 1500 Watts.
There might be a few out there who are waiting for Solid-State Drives to fall into the same price range as spinning disk. That won’t happen anytime soon. Here are projections I got from the IBM Almaden Research Center. Let’s start at the bottom and work our way up. The blue line represents SATA drives, which enjoy a steady decline in $/GB, dropping an order of magnitude every 7 years. Next up is the red line representing enterprise-class FC drives. These are not mass-produced like SATA, and have to be designed to stricter engineering tolerances to handle the faster speeds and vibrations. As a result, they are not declining in price as fast as SATA disk. The green line represents 2-bit NAND flash, often referred to as multi-level cell or MLC. This is the consumer-grade flash found in iPods, MP3 players and cell phones. It has low write endurance, as the cells can only tolerate being written a few thousand times, and is therefore not appropriate for data center workloads. The purple line at the top is 1-bit NAND flash, often referred to as single-level cell or SLC. These are built to last 3-5 years in production workloads. However, they will always be 2-3x more expensive than MLC flash, and as you can see here, will continue to be at least 8x more expensive than spinning disk over the next 5 years. A few notes on Hetzler’s projections for HDD and flash costs: the flat spot in 2009 is real. Flash and HDD are both lithography-limited, and thus are on the same growth path going forward. Curves showing faster flash price declines include 3-bit-per-cell flash (TLC), which can’t be used for IT applications. Flash is never cost-competitive with HDD in $/GB.
The bigger problem for SSD is production volume. A single wafer can produce 30,000 GMR heads. A disk only needs a few heads, so a disk manufacturer can build 100,000 disks per day, resulting in 14,000 PB per line, per year. Meanwhile, a wafer can only produce 425 dies, which is only enough for a few Solid-State Drives. The SSD manufacturer can process only 1,250 wafers per day, resulting in only 390 PB per line, per year. It would take an additional investment of 11 to 17 billion dollars in fabs just to capture 1 percent of the HDD market. In this economy, we just don’t see them making that investment.
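The per-line annual output figures above can be rough-checked from the per-unit capacities quoted on the Almaden slide (375 GB per HDD, 425 dies per wafer, 2 GB per die):

```python
# Rough check of the quoted production-line output figures.
GB_PER_PB = 1_000_000

# HDD line: 100,000 disks per day at 375 GB each
hdd_pb_year = 100_000 * 375 * 365 / GB_PER_PB
# SSD line: 1,250 wafers per day, 425 dies per wafer, 2 GB per die
ssd_pb_year = 1_250 * 425 * 2 * 365 / GB_PER_PB

print(round(hdd_pb_year))  # 13688, i.e. roughly the 14,000 PB/line/year quoted
print(round(ssd_pb_year))  # 388, i.e. roughly the 390 PB/line/year quoted
```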
To get around this, IBM has successfully combined a small amount of Solid-State Drives with a large amount of energy-efficient SATA disk. In the left corner, wearing red, is a traditional two-frame disk array with 192 FC drives, consuming 9.5 kW of power. In the right corner, wearing green, is a single-frame disk array combining 16 SSDs with 96 high-capacity SATA drives. Both of these represent 75 TB of usable capacity. However, in running a standard DB2 brokerage workload, you can see that the combination of SSD and SATA was able to meet or beat the millisecond response time up to about 35,000 IOPS, which is the normal operating range for most applications. To make this happen, IBM offers Easy Tier, which provides sub-LUN automatic movement of each extent. Hot, frequently accessed extents are automatically moved to SSD, and cold, less-referenced data is moved down to spinning disk. The result: the single-frame combination used 50 percent less floor space, 40 percent less energy, and saved $100 thousand over the course of three years.
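The idea of sub-LUN tiering can be sketched in a few lines. This is not IBM’s actual Easy Tier algorithm, just a simplified illustration: track I/O counts per extent and place the hottest extents on the limited SSD capacity.

```python
# Simplified sketch of sub-LUN tiering (NOT IBM's Easy Tier implementation):
# rank extents by access frequency and keep the hottest ones on SSD.

def place_extents(access_counts, ssd_slots):
    """Return (ssd_extents, hdd_extents) given {extent_id: io_count}."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    hot = set(ranked[:ssd_slots])       # hottest extents fit on SSD
    cold = set(ranked[ssd_slots:])      # everything else stays on SATA
    return hot, cold

# Hypothetical extent names and I/O counts for illustration.
counts = {"e1": 9000, "e2": 120, "e3": 4500, "e4": 30}
ssd, hdd = place_extents(counts, ssd_slots=2)
print(sorted(ssd))  # ['e1', 'e3']  -> frequently accessed extents go to SSD
print(sorted(hdd))  # ['e2', 'e4']  -> cold extents remain on spinning disk
```

In a real array this ranking would be recomputed periodically, and extents would migrate only when the expected benefit outweighs the cost of moving them.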
Another way to reduce energy consumption is Data deduplication. If we can identify identical chunks of data, and store only a single copy, we can reduce the total amount of disk required.
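The core mechanism can be sketched with a fixed-size chunking scheme: each chunk is keyed by a content hash, unique chunks are stored once, and repeats become pointers. (Real products like ProtecTIER use far more sophisticated chunking and matching; this is only a minimal illustration.)

```python
# Minimal sketch of fixed-chunk data deduplication: store each unique
# chunk once, keyed by a content hash, and keep pointers for repeats.
import hashlib

def dedup(data, chunk_size=4):
    store = {}      # hash -> chunk (unique instances actually stored)
    pointers = []   # one hash reference per original chunk
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        h = hashlib.sha256(chunk).hexdigest()
        store.setdefault(h, chunk)   # store only the first occurrence
        pointers.append(h)           # repeats become pointer entries
    return store, pointers

store, ptrs = dedup(b"AAAABBBBAAAABBBBAAAA")
print(len(ptrs), len(store))  # 5 2 -> 5 chunks referenced, only 2 stored
```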
Here is a typical example. I start with a backup server. This could be IBM Tivoli Storage Manager, or other backup software like CommVault, Symantec or Legato. Typically, the backups are written to a tape library, and sometimes a small amount of disk is used to store the most recent 24 hours of backups. The problem is that tape is too slow for most large recovery scenarios. With data deduplication, it is now practical to store all backups on disk to improve the performance of recovery. IBM offers ProtecTIER, which emulates tape libraries with LTO drives. You can get this in appliance form, with internal disk, where 4TB of disk could hold as much as 100TB of backup data. We also have gateway models, where 1PB of external disk could hold up to 25PB of backup data.
In a recent study by the Clipper Group, a SATA disk system and a tape system were each configured to store data long-term in an archive for 5 years. At the end of 5 years, the disk system was 23 times more costly than the LTO-4 tape library system. The study included environmental costs (energy costs and space costs). The disk system consumed 290 times more energy than the tape system to power and cool. Even if the data were stored on disk in a virtual tape library using data deduplication with a 20x reduction ratio, the tape system would still be about 5 times less expensive. Customers have a variety of objectives, including meeting SLA performance goals, compliance and data security goals, and energy use and TCO goals. All-disk or all-tape may not meet all of these goals; a hybrid or blended solution of disk and tape may meet all of the objectives. Furthermore, having data on tape, detached from the system, reduces the risk of accidental or intentional corruption. Moreover, tape is portable, enabling low-cost data protection strategies.
So, to summarize my first trend, all of the storage types are moving over one step to the left. Primary data, previously stored on high-speed FC disk, is moving to small amounts of Solid-State combined with large amounts of SATA disk to reduce energy costs. This combination is referred to in the industry as “Flash & Stash”, where perhaps 5 percent is on SSD and 95 percent on SATA disk. Backup data, previously stored on tape cartridges, is now moving to spinning disk to improve recovery time. This includes disk replication, snapshots and virtual tape libraries, improved by compression and data deduplication. So what is left for tape? If you look at all forms of storage, 1 percent is on disk, 12 percent on tape, and the remaining 87 percent is on analog media such as paper or film. Boxes of manila folders containing paper documents or X-ray film represent a fire hazard. If you lose the building to fire, you have most likely lost your only copy of that information. It is now cheaper to store information on tape than on paper, and people are recognizing the benefits of encryption and of keeping multiple copies of tape cartridges in remote locations. We are also seeing tape cartridges being used in a manner similar to the project folder. IBM has now developed the Long-Term File System (LTFS), which allows tape cartridges to be passed from one employee to another as if they were high-capacity USB sticks. The data can be encrypted on tape, which makes it more secure than a manila folder containing paper and film.
My second trend is the convergence of networks.
In the past, large systems like mainframes used ESCON for local devices and SNA to communicate to other systems. How many remember SNA? The small systems had SCSI for local devices, and TCP/IP to connect to other systems.
Today, the worlds of mainframes and small systems have merged. ESCON is replaced with FICON, and SCSI with FCP, so that both FICON and FCP can be managed using common Fibre Channel switching. Large and small systems alike have also standardized on TCP/IP over Ethernet, with NFS, HTTP and FTP protocols allowing files to be shared across different platforms.
The problem is that we still have rat’s nests of cables on the back of each server rack. Here is an extreme example, with separate cables for communication, computing, management and data. Wouldn’t it be nicer if we could carry all that traffic on a single wire? It’s all ones and zeros, after all.
Here is the situation most people face today. Network Interface Cards (shown here as purple rectangles) connect to your Ethernet-based LAN switches. Host bus adapters (shown here as orange ovals) connect to your Fibre Channel-based SAN switches. Different storage devices connect to either your LAN or SAN.
The first step in this direction is the “Top-of-Rack” switch. These are available from IBM today. This Top-of-Rack switch sits at the top of your server rack. A new “Converged Network Adapter” card replaces both the NIC and HBA, allowing an inexpensive 2-meter copper cable to carry 10Gb Ethernet for NAS and iSCSI, as well as 8Gb Fibre Channel over Ethernet. The reason all of the storage vendors are behind this approach is that it allows for investment protection. The Top-of-Rack switch routes all the Ethernet packets to your existing LAN gear, and all of the Fibre Channel packets to your existing SAN gear. All of your existing storage devices continue to work as before.
Next month, October 2010, we anticipate a new standard to be finalized called TRILL – which stands for the Transparent Interconnection of Lots of Links. TRILL would allow multi-hop, meaning that Top-of-Rack switches from multiple different racks could all connect to an end-of-row director. These cables would carry both the TCP/IP as well as FCP traffic from switch to switch. The last switch could then connect to legacy storage devices, or to new devices offering unified storage with their own converged network adapters.
So, this is a good time to look into this new technology. But you might be thinking to yourself: isn’t Ethernet lossy? And you would be correct; traditional Ethernet is known to drop packets. That’s OK for emails and instant messages, but not for data transfer. The new standard for data center networks will be Converged Enhanced Ethernet, or CEE, which is lossless, to handle both TCP/IP and Fibre Channel protocols. If you have 1Gb Ethernet today and are upgrading to 10Gb Ethernet, make sure that it is CEE-capable. If you have 1, 2 or 4Gb Fibre Channel today, you can upgrade to 8Gb Fibre Channel, or consider CEE instead. The switch manufacturers are already planning for 40Gb Ethernet as the next step, and 100Gb Ethernet has been proven in the lab. These are very exciting improvements in bandwidth.
My third trend is Cloud Computing.
Back in 1961, John McCarthy of MIT predicted cloud computing, stating that someday computing would be organized as a public utility, much like the telephone system. In the 1970s, IBM and other service bureaus offered time-sharing: companies would rent time on the mainframe as needed. Since then, we have seen Grid Computing and Application Service Providers, and now this has evolved into Cloud Computing. Mainframe providers conducted this kind of business over the following two decades, offering computing power and database storage to banks and other large organizations from their worldwide data centers. To facilitate this business model, mainframe operating systems evolved to include process control facilities, security, and user metering. The advent of minicomputers changed this business model by making computers affordable to almost all companies, and as Intel and AMD increased the power of PC-architecture servers with each new generation of processor, data centers became filled with thousands of servers. In the late 1990s, utility computing re-surfaced. InsynQ, Inc. launched on-demand application and desktop hosting services in 1997 using HP equipment. In 1998, HP set up the Utility Computing Division in Mountain View, CA, assigning former Bell Labs computer scientists to begin work on a computing power plant, incorporating multiple utilities to form a software stack; services such as “IP billing-on-tap” were marketed. HP introduced the Utility Data Center in 2001. Sun announced the Sun Cloud service to consumers in 2000. In December 2005, Alexa launched the Alexa Web Search Platform, a Web search building tool for which the underlying power is utility computing; Alexa charges users for storage, utilization, and so on. There is room in the market for specific industries and applications, as well as other niche applications powered by utility computing. For example, PolyServe Inc. offers a clustered file system based on commodity server and storage hardware that creates highly available utility computing environments for mission-critical applications, including Oracle and Microsoft SQL Server databases, as well as workload-optimized solutions specifically tuned for bulk storage, high-performance computing, vertical industries such as financial services, seismic processing, and content serving. The Database Utility and File Serving Utility enable IT organizations to independently add servers or storage as needed, retask workloads to different hardware, and maintain the environment without disruption. In spring 2006, 3tera announced its AppLogic service, and later that summer Amazon launched Amazon EC2 (Elastic Compute Cloud). These services allow the operation of general-purpose computing applications. Both are based on Xen virtualization software, and the most commonly used operating system on the virtual computers is Linux, though Windows and Solaris are supported. Common uses include web applications, SaaS, image rendering and processing, but also general-purpose business applications. Utility computing simply means “pay and use” with regard to computing power.
This quote from The Economist magazine says it all: “Clouds will transform the IT industry, profoundly change the way people work and companies operate.” I believe this is true. To address the concern that everyone is calling everything “cloud”, vendors have now adopted a standard definition developed by the US National Institute of Standards and Technology: “Cloud Computing is a pay-per-use model, enabling network access to a pool of computing resources that can be provisioned and released rapidly, with minimal management effort or service provider interaction.” Businesses like this model: paying only for what they use, and being able to rapidly release resources they no longer need. Purchasing IT equipment is easy, but selling it off on eBay at 10 cents on the dollar when you no longer need it can be quite painful. Source: IBM, Cloud Computing - What’s all the hype? Source: cloud-computing-v26.pdf (National Institute of Standards and Technology), Effectively and Securely Using the Cloud Computing Paradigm, Peter Mell, Tim Grance, NIST Information Technology Laboratory, 10-7-2009
Here is the typical comparison. On the left, in blue, is the traditional approach. Year 1 includes a lot of capital expenditure, purchasing new hardware and software, and then years 2, 3 and 4 are operational expense. This year-1 expense is often big enough to cause companies to postpone or defer these important projects until the following year. On the right, in orange, is what you see with Cloud Computing: you pay only operational expense. The result is a savings of 43 percent over the course of four years, and 73 percent in year 1 alone.
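The shape of this comparison can be sketched with a toy model. All of the figures below are illustrative assumptions, not the numbers behind the slide; the point is only that eliminating year-1 capex dominates the four-year total.

```python
# Toy capex/opex comparison (all figures are assumed for illustration).
capex = 100.0        # assumed year-1 purchase cost, traditional approach
trad_opex = 20.0     # assumed annual operating cost, traditional
cloud_opex = 25.0    # assumed annual pay-per-use cost, cloud (higher per year)

trad_4yr = capex + 4 * trad_opex    # 180.0
cloud_4yr = 4 * cloud_opex          # 100.0

savings = 1 - cloud_4yr / trad_4yr
print(f"{savings:.0%}")  # 44% -- with these assumed numbers, in the same
                         # ballpark as the 43% figure on the slide
```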
Many people are confused about what exactly cloud computing is, so I use this analogy that I hope will help explain it. In the area of transportation, the traditional approach is to buy your own car, drive it yourself, and take care of maintenance and upkeep. Alternatively, if you don’t need a car every day, you can rent a car and drive it yourself, paying by the day or the week when you need it. Finally, there is Transportation as a Service: just hop in the back seat of a taxi, tell the taxi driver where you want to go, and pay by the kilometer.
I have color-coded this as follows: yellow for what you are responsible for, and blue for what someone else takes care of. So in the traditional approach, it is all yellow. With a weekly rental, you decide where to go and you drive it yourself, but someone else, perhaps Avis or Hertz, takes care of purchasing the fleet of vehicles and performs all the maintenance. With a taxi, the only yellow part is telling the taxi driver where to go; the rest is taken care of for you.
This same color scheme can apply to the different levels of Cloud Computing. On the far left, in yellow, is the traditional IT department: your own IT staff manages your own IT equipment in your own IT data center. On the far right, in blue, is traditional outsourcing, often referred to as “your mess for less”. The outsource provider’s own staff, equipment and data center do the IT processing for you. In the middle are the three levels of Cloud Computing. You will hear these terms often, so I have organized them in a manner that makes them easy to understand. First there is Infrastructure as a Service. The service provider maintains the data center facilities, power and cooling, and rents you the hardware. You get virtual machines on which you can install your own operating system of choice, your own database, your own middleware, and then build or buy the applications you want to run. Next is Platform as a Service. The service provider has deployed standardized platform stacks available in a “service catalog”. You pick either UNIX running WebSphere, DB2 and Java; or Windows running IIS and SQL Server; or an open-source LAMP stack consisting of Linux, Apache, MySQL and the PHP programming language. These three standardized stacks already run 99 percent of the Internet. On these standard stacks you can build or buy your own application. The third level is Software as a Service. Anyone who has used Hotmail, Yahoo, or Google’s Gmail is already familiar with this. The service provider has built the application and deployed it in such a manner that all your employees need to use the application are login userid and password credentials.
How will this impact the IT industry? Greg Papadopoulos, who was the CTO of Sun Microsystems, made these predictions. First, that there would be a neutron-star collapse of data centers. It won’t make sense for businesses to build their own data centers. Over 75 percent of data centers were built more than 10 years ago, before high-density servers running server virtualization software like VMware or Hyper-V; they were not designed for the power and cooling requirements. Faced with having to build a new data center, many will choose Cloud Computing instead. Second, that cloud hosting providers will bring brutal efficiency for utilization, power, security, and so on. Already we see a half-dozen large providers: Amazon, Google, Yahoo, Microsoft, Salesforce.com, and of course IBM. We are also seeing smaller regional providers in each country, to handle matters that require special legal or government compliance. And third, that computing will look much like the banking world. People no longer stuff bills under their mattresses. Companies will trust cloud providers with their information and applications in the same manner that people trust banks with their money.
Of course, not every company is at that level of trust. To address that concern, IBM offers five levels of cloud computing. On the far left is the private cloud: we have products and solutions that provide the benefits of cloud computing, but the equipment sits on your data center floor and is managed by your own IT staff. Next is the managed private cloud, which also sits in your data center but is managed by IBM personnel. Next is the hosted private cloud: the equipment is dedicated to your company, but is co-located at an IBM facility. In green is the next level, called the Shared Private Cloud. That might sound like a contradiction. These will be regional data centers with high-speed VPN links between your company and IBM. It will be multi-tenant, but designed with business in mind, with high-speed links over VPN, options for complete financial and security audits, service level agreements, and so on. The blue represents IBM’s public cloud, where individuals and companies can rent computing resources over the Internet.
So that wraps up my talk. First, I talked about the shift in the way we will use different storage types: Flash-and-Stash for primary data, disk for backups, and tape for long-term retention of information. Second, I covered how SANs and LANs will merge into a common Data Center Network based on 10Gb Converged Enhanced Ethernet. And third, I explained how Cloud Computing is driving new levels of standardization, automation and management that will benefit internal IT departments as well.
Thank you very much.
Future Trends in Storage Tony Pearson IBM Senior Managing Consultant [email_address]
Agenda
Cloud computing is driving standardization, automation and management that also impact internal IT departments
Improvements in bandwidth are driving a convergence of networks
Energy costs, economics and performance are driving a shift in the roles of each storage type
How energy is typically used in the data center
Power and cooling: 60%; IT load: 40%
Within the IT load: storage 37%; servers, networking, etc. 63%
Typical data centers have a 2.5 Power Usage Effectiveness (PUE) rating
With the adoption of server virtualization, storage is taking over as the fastest growing part of the Information Infrastructure
Source: Dell, IDC, UC Berkeley, Green Data Project Preview: http://www.drunkendata.com/?p=1233
Storage Hierarchy
DRAM Cache, Solid-State Drives (SSD), Phase Change Memory
FC and SAS disks (SAN)
SATA disks (Virtual Tape, NAS/iSCSI)
Tape
FC or SAS drives (15K RPM): up to ~435 W/TB
Solid-State Drives: ~120 W/TB
SATA drives (7200 RPM): ~40 to 115 W/TB
Tape: ~2 W/TB
Solid-State Drives will be the only storage you need Are you sure about that ?
Solid-State Drives (SSD) are most appealing for random read-intensive I/O workloads
Previous attempts to increase IOPS:
Heavily use DRAM cache
Short-stroke the spinning disk
Stripe data across many spindles
Solid State Drives (SSD) Drive Power Use: Watts/Drive (top graph) and Watts/TB (bottom graph). Source: IBM, STEC
Solid State Drives (SSD) offer some interesting characteristics:
More Reliable: 1% AFR vs. 3-8% for HDD
Lower energy consumption (Watts / Drive)
Faster read (millisecond response time)
Slower write destage
Best place to initially put this technology: drive-for-drive replacement inside servers
Reduce outages, Improve Resiliency
Save 1500 Watts per server rack
Fast operating system reboot
Hard Disk Drives and NAND Flash Storage Comparison Source: IBM Almaden Research, Steven R. Hetzler, Sep 2009
HDD and SSD Production Lines
HDD: one wafer = 30,000 GMR heads; daily output 100,000 disks at 375 GB per HDD; result: 14,000 PB/line/year
SSD: one wafer = 425 dies at 2 GB per die; daily output 1,250 wafers; result: 390 PB/line/year
$11 to $17B USD investment required for SSD to capture 1% of HDD market
Source: IBM Almaden Research, Steven R. Hetzler, Sep 2009
IBM System Storage Easy Tier saves energy (9.5 kW versus 5.7 kW). Easy Tier achieved better performance in 50 percent less floor space and 40 percent less energy. Save up to $100,000 in power and cooling for roughly 75 TB of usable data over three years.
Data deduplication is a method of reducing storage needs by eliminating redundant copies of data.
Store only one unique instance of the data
Redundant data replaced with pointer to the unique instance
[Diagram: a stream of data chunks A, B and C with many repeats; after deduplication, only one unique instance of each chunk is stored.]
IBM Data Deduplication Solutions: the Backup Server writes to a Traditional Tape Library, a Traditional Disk System, or a ProtecTIER Virtual Tape Library (the Gateway attaches to external disk; the Appliance includes its own internal disk)
IBM ProtecTIER emulates multiple tape libraries with LTO drives
In-line Deduplication can achieve up to 25x capacity optimisation
Gateway models can manage 1PB of disk (up to 25PB of backup data)
Cost Ratio to store long-term data on SATA Disk versus Tape is 23:1*
5 year TCO to store 2.4 PB of archive data
Including hardware, energy, and space costs
SATA disk system versus LTO-4 tape library
Energy costs of the disk system were 290 times more than tape
A VTL with 20X data deduplication is still about 5X more costly than tape
The Cost Ratio for a Terabyte Stored Long-Term on SATA Disk versus LTO-4 Tape is about 23:1 (23X); for energy cost, it is about 290:1 (290X); a VTL with deduplication is still about 5X more costly (5X). “Tape continues to provide the fiscal responsibility and functional value that enterprises require in the twenty-first century.” - The Clipper Group. *Source: The Clipper Group, “Disk and Tape Square Off Again”, Report #TCG2008009LL, Feb 2008
The Shifting Roles of Storage
Primary Data: Solid-State Drives combined with slower SATA disk to reduce energy costs over 15K RPM drives (“Flash & Stash”)
Backup Data: disk replication, snapshots and Virtual Tape Libraries, improved by compression and deduplication
Long-term space management and data retention: physical tape combined with automation; Long-Term File System (LTFS); work-task project folder
Agenda Cloud computing is driving standardization, automation and management that benefit internal IT departments Improvements in bandwidth are driving a convergence of networks Energy, economics and performance are driving a shift in the roles of each storage type
In the past, different system platforms were largely unconnected from each other
Large Systems: ESCON (local devices), SNA (networking)
Small Systems: SCSI (local devices), TCP/IP with NFS and CIFS (networking)
A convergence of technologies has brought these system platforms closer together
Large Systems: ESCON evolved to FICON over Fibre Channel Protocol and FCoE; TCP/IP with NFS, HTTP and FTP
Small Systems: SCSI evolved to SAS, FCP, iSCSI and FCoE; TCP/IP with CIFS, NFS, HTTP and FTP
Data Center Fabric Convergence – “One Wire” Fabric Convergence Servers Multiple Fabrics Converged Fabric
Separate LAN and SAN networks Host Bus Adapter (HBA) Network Interface Card (NIC) Local Area Network (LAN) Storage Area Network (SAN) NAS iSCSI FCP
Top-of-Rack (TOR) Converged Switches Local Area Network (LAN) NAS iSCSI FCP TOR switch
Converged Network Adapter (CNA)
10GbE for NAS and iSCSI
8Gb FC over Ethernet (FCoE)
Storage Area Network (SAN)
End-Of-Row Converged Directors NAS iSCSI FCP TOR switch EOR director Transparent Interconnection of Lots of Links (TRILL) Unified Storage
Converged Network Adapter (CNA)
10GbE for NAS and iSCSI
8Gb FC over Ethernet (FCoE)
Networks Under Transition
Existing Ethernet: 1Gb today, transitioning to 10Gb CEE (NAS, iSCSI, FCoE), with 40Gb CEE to follow
Existing Fibre Channel: 1, 2, 4 Gb today, transitioning to the 8-10 Gbps generation
Convergence opportunity: Converged Top-of-Rack
IBM offers choice to clients upgrading or adding DCN infrastructure with Converged Enhanced Ethernet (CEE)
Agenda
Cloud computing is driving standardization, automation and management that benefit internal IT departments
Improvements in bandwidth are driving a convergence of networks
Energy, economics and performance are driving a shift in the roles of each storage type
“People do not want quarter-inch drills. They want quarter-inch holes.” – Professor Emeritus Theodore Levitt, Harvard Business School
Origins of Cloud Computing: Time-Sharing, Grid Computing and Application Service Providers led to Cloud Computing. In the 1960s and 70s, several companies provided time-sharing services as service bureaus. “If computers of the kind I have advocated become the computers of the future, then computing may someday be organized as a public utility just as the telephone system is a public utility... The computer utility could become the basis of a new and important industry.” – John McCarthy, MIT Centennial in 1961
Cloud – A Disruptive New Paradigm? “Clouds will transform the information technology (IT) industry… profoundly change the way people work and companies operate.” Cloud computing is a pay-per-use model for enabling network access to a pool of computing resources that can be provisioned and released rapidly with minimal management effort or service provider interaction. Like time-sharing and grid computing before it, the cloud presents a shared pool of resources. Source: US National Institute of Standards and Technology (NIST)
Comparing the traditional data center with cloud computing services (Source: IBM):
Includes acquisition, management, power/cooling and floor space
Also includes network circuit cost, with full redundancy
Circuit costs are offset by economies of scale and reduced operational costs
Initial modeling shows 43% savings over 4 years, and 73% in year 1
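A toy cost model shows why the first-year savings can be so much larger than the four-year average: the traditional model front-loads the acquisition outlay, while the cloud spreads pay-per-use fees evenly. The dollar figures below are invented for illustration; IBM's actual model inputs are not reproduced here.

```python
# Toy cost model illustrating why year-1 savings (73%) exceed the
# four-year average (43%). Dollar figures are invented for
# illustration only; they are not IBM's actual model inputs.

YEARS = 4

# Traditional: large year-1 acquisition outlay, then steady run costs
# (management, power/cooling, floor space, redundant circuits).
trad_capex = 1_000_000
trad_opex_per_year = 425_000
traditional = [trad_capex + trad_opex_per_year] + \
              [trad_opex_per_year] * (YEARS - 1)

# Cloud: flat pay-per-use fee, no upfront purchase.
cloud_per_year = 385_000
cloud = [cloud_per_year] * YEARS

year1_saving = 1 - cloud[0] / traditional[0]
total_saving = 1 - sum(cloud) / sum(traditional)
print(f"year 1: {year1_saving:.0%} saved, "
      f"{YEARS}-year total: {total_saving:.0%} saved")
```

With these assumed inputs the model reproduces the 73% / 43% shape of the cited result: avoiding the upfront purchase dominates year one, while the gap narrows as the traditional capex amortizes over later years.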
An Analogy – Transportation Alternatives Traditional Approach: Buy a car, drive it yourself, have a place to park it, take care of maintenance and insurance. Rental with or without Chauffeur: Rent a car by the day or week. Drive it yourself, or hire a chauffeur to drive the car for you. Transportation as a Service: Hop in the back seat of a taxi and tell the driver where you would like to go. Pay by the kilometer.
An Analogy – Transportation as Someone Else’s Problem
Traditional: You decide where to go; you drive; you handle parking/storage; you purchase the vehicle and handle ongoing maintenance.
Weekly Rental: You decide where to go; you drive (or hire someone); you handle weekly parking; someone else purchases the vehicle and handles ongoing maintenance.
Taxi: You decide where to go; someone else drives, handles parking/storage, purchases the vehicle and handles ongoing maintenance.
The Many Shades of Cloud Computing
Traditional IT Datacenters and Traditional Outsourcing: you (or your outsourcer) provide facilities, hardware and platform; you build the application and use it.
Infrastructure as a Service (IaaS): the provider supplies facilities and hardware; you provide the platform, build or buy the application, and use it.
Platform as a Service (PaaS): the provider supplies facilities, hardware and platform; you build or buy the application and use it.
Software as a Service (SaaS): the provider supplies facilities, hardware, platform and application; you simply use the application.
Cloud Prediction from Sun CTO Greg Papadopoulos
A "neutron star collapse of data centers"
It won't make sense for businesses to build their own data centers.
Hosting providers will bring "brutal efficiency" for utilization, power, security, service levels, and idea-to-deploy time.
A half dozen very large cloud infrastructure providers and a hundred or so regional providers
The industry will look more like the banking world
Customers will trust service providers with their private data as they do banks with their money.
IBM’s Five Co-existing Cloud Delivery Models
1. Private Cloud – in the enterprise data center; enterprise owned and operated. The customer owns and pays for the infrastructure and has unlimited exclusive access.
2. Managed Private Cloud – in the enterprise data center; enterprise owned, IBM operated.
3. Hosted Private Cloud – in an IBM hosting center; customer/IBM owned and IBM operated (single tenant).
4. Shared Private Cloud – IBM owned and operated (multi-tenant); cloud services delivered privately to enterprises with virtual separation of tenants. IBM owns the infrastructure; the customer has shared access and pays by usage.
5. Public Cloud – IBM owned and operated (multi-tenant); secure, enterprise-class cloud services delivered publicly to end users.
Summary
Cloud computing is driving standardization, automation and management that benefit internal IT departments
Improvements in bandwidth are driving a convergence of networks
Energy, economics and performance are driving a shift in the roles of each storage type
IBM Tucson Executive Briefing Center Contact Us For more information, visit: http://ibm-vbc.centers.ihost.com/briefingcenter/tucson To book a briefing, please contact your IBM Representative, IBM Business Partner, or Briefing Center Coordinator, Lee Olguin at +1 (520) 799-5460.
About the Speaker Mr. Tony Pearson Senior IT Storage Consultant IBM System Storage Tony Pearson is a Senior Managing Consultant for the IBM System Storage™ product line. Tony joined IBM Corporation in 1986 in Tucson, Arizona, USA, and has lived there ever since. Over the years, Tony has worked in development, marketing and customer care positions for various storage hardware and software products. In his current role, Tony presents briefings on storage topics covering the entire System Storage product line, as well as various Tivoli storage software products. He interacts with clients, speaks at conferences and events, and leads workshops to help clients with strategic planning for IBM’s integrated set of storage management software, hardware, and virtualization products. Tony writes the “Inside System Storage” blog, which is read by hundreds of clients, IBM sales reps and IBM Business Partners every week. This blog was rated one of the top 10 blogs of 2006 for the IT storage industry by “Networking World” magazine. The blog was published in book form as “Inside System Storage: Volume I,” available from Lulu publishing. Tony has a Bachelor of Science degree in Software Engineering and a Master of Science degree in Electrical Engineering, both from the University of Arizona. Tony is an IBM Master Inventor and holds 19 IBM patents for inventions on storage hardware and software products. 9000 S. Rita Road Bldg 9070 Mail 9070 Tucson, AZ 85744 Tony Pearson Senior IT Storage Consultant IBM System Storage™
Tony’s book series “Inside System Storage” Volume I and Volume II are available in paperback, hardcover and eBook formats:
Trademarks The following are trademarks of the International Business Machines Corporation in the United States, other countries, or both. The following are trademarks or registered trademarks of other companies. * All other products may be trademarks or registered trademarks of their respective companies. Notes : Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here. IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply. All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions. This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. 
IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography. Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. UNIX is a registered trademark of The Open Group in the United States and other countries. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office. IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency, which is now part of the Office of Government Commerce. 
For a complete list of IBM Trademarks, see www.ibm.com/legal/copytrade.shtml: IBM Systems, IBM System z9®, IBM System z10™, IBM System Storage® , IBM System Storage DS®, IBM BladeCenter®, IBM TotalStorage®, IBM System z®, IBM System z10®, IBM System p®, IBM System i®, IBM System x®, IBM IntelliStation®, IBM Power Architecture®, IBM SureONE®, IBM Power Systems™, POWER®, POWER6®, POWER7®, Power ®, IBM z/OS®, IBM z/OS.e, IBM AIX®, IBM i, IBM z/VSE™, IBM z/TPF, IBM z/VM ®, IBM i5/OS®, IBM AIX 5L™, IBM 4690 Operating System, IBM zEnterprise™ Not all common law marks used by IBM are listed on this page. Failure of a mark to appear does not mean that IBM does not use the mark nor does it mean that the product is not actively marketed or is not significant within its relevant market. Those trademarks followed by ® are registered trademarks of IBM in the United States; all others are trademarks or common law marks of IBM in the United States.