What Is Object Storage, When Should You Use It and What to Look For When Purchasing It
Sponsored by EMC
Guest Speakers: George Crump, Founder and Lead Analyst, Storage Switzerland; George Hamilton, Sr. Product Marketing Manager, EMC Atmos and Centera
Moderator: (Don Keefe)

Is It Time To Consider Object Storage?

(Don Keefe): Hello, and welcome to today's SearchCloudApplications.com presentation, "What is object storage? When should you use it and what to look for when purchasing it?"
Agenda
• Storage Swiss Background
• The State of Unstructured Data
• What Is Object Storage?
• When Should You Use It?
• What to Look For When Purchasing It

My name is (Don Keefe), and I am going to be the moderator for today's presentation. Today's presentation is being brought to you by EMC. Before I begin, please note that the slides will be pushed to your screen automatically, and all the audio will be streamed to you through your computer.

If you have any questions today for our speakers, you can enter them by clicking on the questions tab, which is on the left side of your screen, and then clicking on the submit question button; your questions will be addressed at the end of today's presentation. With that said, it is now my pleasure to introduce our speakers for today. Joining us is George Crump; George is founder and lead analyst at Storage Switzerland, an analyst firm focused on the storage marketplace.

We are also joined today by George Hamilton. George is a senior product marketing manager in EMC's advanced storage division, responsible for EMC's object storage platforms, EMC Atmos and Centera. Now let me hand things over to George Crump to begin his presentation. Please go ahead, George.
Background
• Analyst firm covering storage, cloud and virtualization markets
• Knowledge of these markets is gained through product testing, real world implementations and interactions with users and suppliers
• The results of this research are found in the articles, briefing reports, case studies and lab reports on our web site www.storage-switzerland.com

George Crump: Thanks. Welcome everybody, thank you for tuning in today. My role is to take you through the basics of object storage: what is it, when should you use it, and then what to look for when purchasing it. So from an agenda standpoint, I will just give some quick background on who we are, what the state of unstructured data is, what is object storage, when should you use it and, again, what to look for.

So just some background on us: we are an analyst firm, and we focus on the storage, cloud and virtualization marketplace. Those markets cover everything from data protection to cloud storage infrastructures to cloud applications, all of the various things in those realms. We gain knowledge of these markets through product testing, real world implementations and interactions with users and suppliers, and you can find a lot of that research on our Web site in the form of articles, briefings, case studies, et cetera. So let's talk about the state of unstructured data.
The State Of Unstructured Data
• Unstructured Data Growth is the key problem being faced by both cloud providers and large enterprises.
• The Petabyte is the new Terabyte
• Double Digit PB Storage Infrastructures are common
• Billions of files under management

You know, data is kind of interesting. I've been involved in storage for many years now, and in the early '90s, mid '90s, we were all very focused on online database backup and how fast we could get databases backed up. Everybody was beating their chest about one terabyte an hour backups and things like that, and that was big back then; that was obviously an important aspect of the data center. What is interesting is how things have changed over the last four to five years: most of the conversations that we have with users and providers now really focus on unstructured data. Initially, again five years ago, a lot of this was office productivity files, things like that; but it quickly grew into all types of rich media, which includes videos and images and audio.

We deal with some providers that, for example, store images, and it may cost you nothing to store the image, but then the use of that image will cost you something. Or they take those images and package them into holiday cards or cases for cell phones, where they screen the image on the back, and things like that. So the concept has changed. The other really big impact is that the quality of the image or the audio file or the video file, or even the office productivity document, has gotten significantly better.

As a result, the capacity that each of these files consumes has grown dramatically. So if you look at the slide there, data growth is the key problem being faced by cloud providers, certainly, and then by large enterprises. The other big thing is that the petabyte is the new terabyte, if you will; especially when we're talking with cloud providers and large enterprises, dozens of petabytes, and several now in the hundred petabyte range, is not uncommon.

And so when you're looking at data protection and making sure that the data stays viable for an extended period of time, the world changes when you're dealing with 50, 60, 70 petabytes of information. That becomes a big problem. These double digit petabyte storage infrastructures are becoming very commonplace, and it is something that we deal with all the time nowadays. And again, hearkening back to my early backup days, when we were trying to design a backup infrastructure, we would look for file servers that had millions of files, because that would always choke the whole backup process.

Millions of files is nothing nowadays. We constantly run into sites with billions and billions of files, so it is a challenge that we need to keep after, and it is really driving some fundamental change in how we store information. What we've seen come out of this is an increased interest in object storage and the way that works versus the way a standard file system works. So let me talk about a standard file system quickly. I am assuming most people on the phone know what that is, but it is hierarchical in nature and it is really designed for humans, right?

What is Object Storage? Standard File System
We tend to think in folders and groups and things like that, so it is really designed for the same kind of process that creates a database, right? You think in terms of records and fields within a record and things like that. Humans like things to be organized and in the right place, and if it's not in the right place, it's a problem. The change, though, is that a computer doesn't necessarily need all this organization and, frankly, the organization gets in its way.

And so there are some differences there that we need to take advantage of. So, the standard file system: like I said, the little dots represent files, but we're talking about billions here, and I didn't feel like drawing a billion dots, if you don't mind. The challenge with this standard file system approach is, like I said, that it is really ideal for a human. You can navigate your way down to the right folder and things like that. The example that I like to use is when you drop off your dry cleaning: if it was a self service dry cleaner, you'd want your clothes hung by your last name.

So my name is Crump, and I would want them in the C's so that I could at least go to the C's, and maybe even the Cr's, so that I could get to my clothes, right? That is how humans need to operate. I don't want to have to go from ticket number one to ticket number 5,000 and check each one until I get to the right one, right? So that is really where this file system architecture comes into play. But the challenge is that as we get into the billions of objects category, the file system approach doesn't scale and really starts to incur bottlenecks, right?
What is Object Storage? Standard File System
• Challenges To Traditional File Systems
๏ File system approach is ideal for humans
๏ BUT with billions of objects the file system approach does not scale and incurs bottlenecks
๏ Also limitations in use of metadata

And finally, it really develops a limitation in the use of metadata, right? Metadata is data about data or, in the case of an object storage system, data about an object. And I might want to keep more things in that metadata than I do today in a typical file system, right? If you think about the real basics in a typical file system, part of the metadata is the location, the file path; part of the metadata is the create date and the modified date, maybe the archive bit. And that is really about it.

In object storage, I might want a much more robust capability so that I can control different things than I was able to before, and we will talk about some of those advantages. So let's talk about what object storage is. It is essentially the opposite. I think it is simpler for most people to think of an object as a file; it is the lowest common denominator, if you will. But it is not scattered across the file system in blocks or things like that. The object is the sum of all parts, so to speak.
What is Object Storage? Object File System
๏ Object storage is an emerging alternative to file-based systems; ideal for storing large volumes of unstructured data
๏ Object storage decouples data from its physical location through the use of object IDs
๏ Flat and infinite namespace makes object storage scalable
๏ Provides a foundation for other data longevity techniques

And it is ideal for storing large volumes of unstructured data. Object storage decouples data from its physical location through the use of IDs. So if I could just go back to my dry cleaner example, what really happens when probably all of us go to the dry cleaner is that we give the guy a ticket, it has a number, and they move the little conveyor belt to exactly that number. They don't know your last name, more or less; they just take that number and give you the clothes associated with that numbered slot, right? Very, very efficient. They can manage thousands of customers in a very small shop.

So think of the same thing with object storage. Your object, or your file, essentially gets a number, like my dry cleaner gives me a number, and when you go to retrieve that object, you just identify it by this number. That is the simplest way I can think of to describe how that works. This flat and sort of infinite namespace really makes object storage very scalable. I don't have to worry about developing these incredibly complex paths to files and things like that. And it provides a strong foundation for other data longevity techniques.

I will tie in metadata here. As I said about file system metadata, there we were thinking about modify date and create date and things like that. Here we will have those, but we will also have things like: OK, when I first upload this object I want to maintain six copies on six different continents, because it's the latest video from, you know, insert your favorite rock star here. Then when that becomes less popular in six months to a year, maybe I only want to keep three copies on three continents to service the demand and to maintain some level of redundancy.

And then maybe in three years it doesn't really matter at all, and so I only want to keep two copies, again one for redundancy. So that is an example of some of the things that you can do with enhanced metadata. There are also things that you can do as far as compliance and making sure that you have a chain of custody developed, and things like that. So there are a lot of things that you can do around metadata once you are in a more object oriented type of environment. And you can use that to support the longevity techniques: one of the things you can do is compare the state of an object five years from now and make sure that it still looks like it was supposed to.

And then you can compare that object to the other copies of the object and make sure that there hasn't been any data melting or data degradation somewhere else in the environment. So you can always make sure that the object you uploaded stays exactly the way it should be.

So let's talk about what we would use object storage for. There are a lot of different use cases. A lot of people think of this as being for cloud providers, and clearly cloud providers are top candidates for it. It is really any environment where there is a very high file count or, to use the correct term, object count.

Again, we're typically talking in the billions. It is also environments where long term data retention and data verification are required, or where some sort of multi-geography movement of data is required. And not for the classic DR replication type of requirement, but more for access. Going back to the video example I used earlier, to lower latency you might make sure the object is available on multiple continents, and things like that.
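The copy-count example above (six copies at upload, three once demand drops, two after a few years) and the later integrity check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `StoredObject`, `desired_copies` and `is_intact`, the exact age thresholds (180 days and 3 years) and the use of SHA-256 are all hypothetical choices, not how any particular object storage product implements its policies.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object: the payload plus the metadata that travels with it."""
    data: bytes
    metadata: dict = field(default_factory=dict)

    def __post_init__(self):
        # Record a fingerprint at upload time so later checks have a baseline.
        self.metadata.setdefault("sha256", hashlib.sha256(self.data).hexdigest())


def desired_copies(age_days: int) -> int:
    """Hypothetical policy: 6 copies when new, 3 after ~6 months, 2 after ~3 years."""
    if age_days < 180:
        return 6
    if age_days < 3 * 365:
        return 3
    return 2


def is_intact(obj: StoredObject, replica: bytes) -> bool:
    """Compare a replica against the fingerprint kept in the object's metadata."""
    return hashlib.sha256(replica).hexdigest() == obj.metadata["sha256"]


video = StoredObject(b"concert footage", {"title": "latest video"})
print(desired_copies(10), desired_copies(400), desired_copies(2000))
print(is_intact(video, b"concert footage"), is_intact(video, b"concert f00tage"))
```

The point of the sketch is that both decisions are driven by data stored with the object itself, so a background process could re-evaluate them at any time without consulting a separate database.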
Using Object Storage
• Use cases for object storage
๏ High File Count Environments
๏ Environments where long term data retention and data verification are required
• How to talk to object storage
๏ API Set
๏ Gateway
๏ Built-in alternate protocol support

So those are the key environments that look to take advantage of this. One of the challenges that people face is: OK, how do we talk to object storage? In the provider market, it is generally worth it to them to leverage an API set and optimize whatever their application is to talk directly to object storage. That takes some time, and so people have built alternate ways to get there. A very common example is a gateway; if you use any of the cloud sharing or file sharing applications, the type that either synchronizes or does something through the Internet, you are using a small form of a gateway.

You don't necessarily know that you are using cloud storage; the gateway does all the interfacing for you. And then finally, we are seeing an increase in what we'll call alternate protocol support, where the object storage system can essentially front end itself as some sort of a mount, whether it be a file system mount or a block storage protocol mount. So those are typically the three ways to get to object storage, and probably the most common is gateways. But generally in the provider space, we see a pretty quick move to the API set to be able to have direct control over it.
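To make the gateway idea concrete, here is a minimal Python sketch of a flat, ID-addressed store with a path-based gateway in front of it. Everything in it is an assumption made for illustration: `FlatObjectStore`, `FileGateway` and the in-memory dictionaries are hypothetical stand-ins, and a real gateway would also handle caching, overwrites and deletes.

```python
import uuid


class FlatObjectStore:
    """Flat namespace: every object is addressed only by an opaque ID."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        object_id = uuid.uuid4().hex  # the "dry cleaning ticket"
        self._objects[object_id] = data
        return object_id

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id]


class FileGateway:
    """Hypothetical gateway: users see paths, the store sees only object IDs."""

    def __init__(self, store: FlatObjectStore):
        self._store = store
        self._path_to_id = {}

    def write_file(self, path: str, data: bytes) -> None:
        # The familiar path is kept only in the gateway's own lookup table.
        self._path_to_id[path] = self._store.put(data)

    def read_file(self, path: str) -> bytes:
        return self._store.get(self._path_to_id[path])


store = FlatObjectStore()
gateway = FileGateway(store)
gateway.write_file("/photos/2012/beach.jpg", b"jpeg bytes")
print(gateway.read_file("/photos/2012/beach.jpg"))
```

An application written directly against the API would call `put` and `get` itself and keep the returned IDs, which is the direct control that providers tend to move to.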
What to look for in object storage
• Beware of Do It Yourself Solutions
o Most IT departments don't have the time or resources to build their own
• Look At Ingest Rates And Geo Scalability
o Performance does matter
• Cloud Storage needs more than object file systems
o Intelligent dispersal, automation, back office integration

So let's talk a little bit about what to look for in object storage. My number one recommendation here is beware of do it yourself solutions. Clearly they exist; you can go get an open storage or open software type of solution, buy your own hardware, sew it all together, and typically the example drawn is some of the larger providers that exist today as people who do it themselves. What we find is that most providers frankly just don't have the time or the resources to really do that and then maintain it over time.

You are generally better off with a pre-built solution that is specifically designed for this market and doesn't require you to spend a lot of time doing it yourself. So again, I am not saying that these open solutions are necessarily bad; it is just that they're probably not practical for a large majority of people who are looking for this type of solution. The other thing that I would recommend is that you look at ingest rates and what we call geo scalability. A lot of people hear the term cloud storage and they think, oh well, it's all about dollars per gigabyte and performance really doesn't matter.

Well, it does matter; the ability to get data into the system at a good rate is very important. So is geo scalability: the ability to have multiple object storage systems throughout the country or the world, and then have data automatically go to them based on policy or whatever makes the most sense in your environment. So scaling is a critical issue, not only from a raw capacity standpoint but also from a performance standpoint, because the closer you can automatically put data to a potential user, the better their overall experience is going to be.

And then finally, you want to look for more than just the file system itself in a cloud storage device. Object storage in and of itself is important, but then the question is what the vendor has done in addition to just providing an object file system. I don't want to minimize the effort in creating the file system itself, but clearly there are more things to do. A couple of key things that I like to recommend: the first is what we call intelligent dispersal. Again, that goes back to geo scaling; it also ties to the example I used earlier, where a new video comes out, or a new top ten song, or a new movie, and you want to put it in more pods initially because it's popular and then be able to pull it back to a few pods to save money, right?

Intelligent dispersal gives you that capability and, again, we think that's very important. Also automation: the key metric for most providers is how many full-time heads they require per terabyte or petabyte, right? Obviously the fewer people that are required to manage the storage, the more profitable the venture is, and frankly one of the biggest challenges is just finding enough skilled storage people to operate it. So sometimes it is not even so much a money issue as it is a skills shortage issue. Having one person who can manage multiple petabytes, if not hundreds of petabytes, of information becomes a key requirement.

And then finally, the integration with back office functions, whether that be from a billing or a customer service perspective, things like that. Being able to tie into the more traditional back office functions becomes very, very important.
So those are really the key things that we look for in object storage systems, the things that go beyond just the object part itself. I'm going to stay on for questions, of course, but I want to thank you for tuning in; there is my contact info.
Thank you!
George Crump, Chief Steward, Storage Switzerland
http://www.storage-switzerland.com
gcrump@storage-switzerland.com
Storage Swiss on Twitter: http://twitter.com/storageswiss
Storage Swiss on YouTube: http://www.youtube.com/user/storageswiss

George, first of all, awesome first name, of course; I want to hand it over to you to talk about how your Atmos products tie into what we are talking about here.

George Hamilton: Oh, that's great, George. Thank you very much. That was a great overview of object storage, and I especially liked your comment that the petabyte is the new terabyte. We find the same thing; it's so true. That is really the new normal for more and more companies moving forward, especially over the next decade. They are all going to have to adapt and begin to operate at petabyte scale. As George mentioned, one of the first problems is, of course, that the sheer amount of unstructured content is growing tremendously.
EMC ATMOS Object-based Cloud Storage

You see a lot of big numbers associated with that; it will grow 50x over the next 10 years. But more so than that, there is an application shift going on at the same time, and that is what's driving the content. EMC pioneered object storage almost 11 years ago with the introduction of EMC Centera. It gained its foothold and became such a strong product because of its ability to very efficiently archive unstructured content.

So when people had to keep a ton of e-mails and keep them for a huge amount of time, it was just more efficient to do that in an object based storage platform, because it uses a flat addressing scheme, not a hierarchical addressing scheme; you could scale it very simply and always access the content as long as you had the object ID. So for years, object storage has cut its teeth as an archiving platform and been wildly successful at that. But as we've seen over the last several years, object storage is also the preferred architecture for cloud use cases.
What's Driving Unstructured Data Growth?
• Unstructured data: 50X growth over 10 years, stored over longer periods of time
• Application shift: Web and mobile are the de facto standard delivery and consumption models
• Instant access: users demand instant access from their device of choice, from any location

And so in addition to all of this unstructured data, there is this big shift in applications. We have more Web and mobile devices being used by everybody, and that is really almost a standard delivery and consumption model for content today. We are in the post PC era. There is no preparing for that, there is no thinking about it; that is the reality of today. We are in the post PC era, and what's driving a lot of unstructured data growth is the fact that users want to use all of these different types of applications on Web and mobile devices, and they want to get instant access to their content wherever they are, whatever network connection they have and whatever device they're on.
Web Apps Driving Unstructured Data Growth
According to IDC, 80% of apps today are browser based
• Register your car
• Pay your bills
• File your taxes

So looking at this picture, what do these three things have in common? Whether it is new cars, a bank, or the federal government, you don't think of them in terms of new and cool stuff, right? But they are all actually transforming the experience of what they do. I myself renewed my license this past summer; I went online, very easily got onto their Web site and renewed my license with a few mouse clicks. I didn't have to go to the RMV and stand in line for two hours to do that.

And who among us doesn't pay our bills online anymore, or do other financial transactions on a mobile device or a tablet? We can all do these things. And you know what? I just did my taxes; I took advantage of the Monday holiday and I got my taxes done, and I did that all online within a couple of hours. It was really, really easy and I never had to talk to a human being to do it. So these companies, which you don't think of as dot com and cutting edge companies, have all transformed the way people interact with, access and share the information in these apps through the use of Web and mobile applications.
Mobile Devices Primary Access And Delivery Vehicle
Smartphones outsold PCs in 2011; rise in BYOD
• Go grocery shopping
• Access your stuff
• Log expense reports
• Diagnose your health

And the use of these applications is driving a lot more content creation and, consequently, the need to store it. So again, as I mentioned, this is the post PC era now; it is not on the horizon, we are in it. How many devices do people in the audience have? I bet everybody out there has multiple devices; in fact, the average worker now has more than three. I think the average went from 2.8 a couple of years ago to 3.2 now. So most of us now have more than three devices.

And I can vouch for that, because I have a work laptop, a tablet and a smart phone. Most people can probably say the same thing. So it completely changes how you interact with applications and do your actual job. From a mobile device I can order my groceries, I can do my expense reports on a mobile app delivered by my company, and I can access all of my stuff. I actually don't use iCloud, but I use other cloud services. I can access all of my stuff, do things and share files with people very easily from whatever device I happen to be using at that time.

It really ends up being my primary device whenever I am outside of the office now, whereas a couple of years ago that was probably my laptop. In fact, a recent survey said that about 95 percent of companies now support bring your own device in some fashion. And along with that comes a sea of content that users want access to, again, wherever they are and on whatever device they have. Whether it's a sales meeting presentation, or a nurse who needs an image or a medical record, the back end storage needs to be optimized for this new consumption and delivery model.

Why File Systems Don't Work For Next Gen Applications
File systems have major disadvantages:
• Locking (whole file and byte-range) is complex
• Distributed (WAN) access is complex
• Single-site HA clustering is complex
• Geographic HA clustering is extremely complex
• File system replication is extremely complex
• File system security is complex
o Folder/file access control, inheritance
o Reliance on complex, session-based authentication

So file systems, as George alluded to, are built for a very different purpose than these next generation Web, mobile and cloud types of applications. This is not to pick on file systems unfairly; they have a very critical place in IT. But they are optimized to deliver and store structured content that is generally very tightly tied to an application and delivered over a local area network. So they are really built to perform in that context. But what makes file systems work so well in that context actually makes them a poor architectural fit when you get into native cloud, Web and mobile applications.

That is because there is simply a lot of built in complexity with file systems. This complexity helps with data protection, integrity and working with relational database management systems and their associated applications. In a primary storage system serving performance sensitive applications, these complexities can be manageable. But when you operate at scale, the limitations become apparent very quickly. When you want to scale beyond a single site, replication becomes very complex. High availability clustering, whether you're in a single site or a multi site environment, also gets very complex.
Security, file access, et cetera; they all contribute to the complexity of file systems. Importantly, developers need to factor all of this into their code. This makes coding take longer, they have to spend cycles on mundane coding tasks rather than on application functionality, and they have to depend on storage admins quite a bit throughout that whole process. In today's environment, object storage is a much better fit. As George alluded to, developers can write to an API, and that's where object and cloud storage really excels.

Top 3 Reasons Web and Mobile Apps Work Better with Cloud Storage
1. Location Transparency: use one storage system & access point across many global apps
2. Self-managing storage: NO LUNs, never recode when systems change
3. REST (HTTP-based) APIs: simplify and speed development
[Diagram: three apps connecting through https://accesspoint.yourcompany.com to EMC Atmos Cloud Storage]

So these are the top three reasons we've identified for why Web and mobile applications work better with cloud storage based on object. Number one is location transparency. You use one storage system and one access point across however many global applications you have. The application simply doesn't care where the data is located; it doesn't have that hierarchical addressing scheme or those tight relationships. The analogy that we always use in object storage is that it is like valet parking your car.
EMC Atmos Cloud Storage A Platform For Next Generation Applications & Cloud Services File tiering/ Medical imaging/ Custom apps Package apps Storage-as-a-service archiving/backup VNA (web/mobile) ATMOS SDK, HTTP/S (REST), S3, CAS, NFS, CIFS https://accesspoint.yourcompany.com New York U.K.You know that when you go into the restaurant you have your ticket, you don‟t know where yourcar is, you don‟t care where your car is, they might even move it a couple of times while you arehaving your meal, but at the end of the night, you hand in your ticket, you get your car back. Andobject storage works exactly the same way. So you don‟t have to worry about the location of thedata. The application doesn‟t care. It has the object ID, it can get to the object. Also, it is self-managing storage. I don‟t have to work and have an IT -– a storage administrator in the ITdepartment provision storage for them.There are no (Luns) there‟s no (Raid) clearance, there‟s no replication schemes to take into accountwhen developing. And also with that location transparency, that means you don‟t have to recodewhen the systems change; you‟ve written to the API, the underlying infrastructure doesn‟t matter tothe application developer. That is really what cloud is all about; I want a utility -– and so, havingself managed the storage without all the complexities really helps developers achieve that. Andlastly of course, is using Web based Web services API, using (rest).This simplifies application development and it speeds application development, which reduced riskto the organization. If I have applications that have less code that is -– that makes testing easier, itmakes QA easier, it speeds up development and that reduces my overall risk and I can getapplication project out the door faster. So let me talk a little bit about -– drill a little bit down on
EMC Atmos cloud storage. And essentially EMC kind of built on the foundation of centera whichis object based storage, but added cloud capabilities and the ability to geographically dispersecontent over multiple locations and Web searches access and cloud like capabilities to it.So Atmos presents itself as a single global system with one global name space. And that globalname space is accessible by this multiple access methods to that single global name space. Sodevelopers can use the Atmos software development kit and Web services standard as well astraditional file access protocols to simply access this big storage pool, a distributed object pool overmultiple locations but it is presented as one logical system. And the primary use cases we‟ve seenpeople start with maybe tiering and archiving to the cloud, but then once they‟ve done that and theysee that it drives down the cost of storing content long term, but also making it available, there isalso a tremendous value in doing customerized Web and mobile applications, makes that easier. Store, Archive And Access Distributed Unstructured Data At Scale APP 3 Single storage cloud Limitless scale Multi-tenancy Metadata-driven policies Storage-as-a-service Multiple access methods and APIs Instant access from any device New York U.K.There is also packaged applications using those protocols, can get access to storage very simply andalso be able to deliver storage as a service, and we have a (fixed use) service providers that offerstorage as a service with EMC Atmos as the back end storage platform. So as we say, EMC Atmosgives you the ability to store, archive and access distributed, unstructured data at scale. And I willgo through these in a bit more detail but essentially it presents itself as a single storage cloud, itscales very easily in largely limitless scale, it features built in multitenacny, and very importantly it
has metadata-driven policy management, so that you can optimize the placement, retention and disposition of objects within the system and do that in an automated way according to policy.

Unique attributes of object and cloud storage
Reduced Complexity • Objects can live anywhere (location transparency) and are not tied to a specific underlying file server or file system • Flat, universal namespace is "application-friendly" and allows global access to stored content from anywhere the distributed application runs • REST (HTTP-based) APIs promote rapid application development • Applications can easily associate custom metadata with stored objects – no need for a separate, synchronized database • Easy to restrict access by placing files in secure sandboxes (multi-tenancy) • Policy driven management controls automated file distribution & access, and provide data resiliency and high availability • Self-managing storage makes it easy to grow capacity or add new sites • No need to provision LUNS or create filesystems, mount points or shares • No need to modify an application that's running out of space
Increased Scalability • Near-limitless scale – just add more hardware when needed • Automatic load balancing as new objects are stored • Scales elastically – apps simply create and delete files as needed

You can deliver storage as a service; if you are either a service provider or an enterprise that wants to offer a self-service storage model to your organization, you can do that. And it offers multiple access methods and APIs and gives you the ability to get instant access to storage from any device. So what are really the unique attributes of object and cloud storage? Again, as I mentioned, there is the location transparency. It's not tied to any specific underlying file server or file system, and you have the flat universal namespace, which is very friendly to applications.

So it allows that kind of global access to all of the content from anywhere distributed applications may run.
In the end, REST-based APIs promote very easy application development. The applications can very easily associate metadata with their objects, so there is no need to build a separate abstraction layer for metadata and then manage that separately. Objects and their metadata are stored together in the system. And again, as I mentioned (inaudible) to all of these, it is self-managing storage. It makes it very easy to grow capacity and to add new sites; you don't have to provision LUNs, and you are not creating additional file systems, mount points or shares.
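To make the "ticket" idea concrete, here is a minimal sketch of a flat object namespace in which each object is stored together with its metadata. It is an illustration only and not the Atmos API: `ToyObjectStore`, `put`, `get` and `find` are hypothetical names, and the store lives in memory rather than behind a REST endpoint.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    # Metadata travels with the object instead of living in a separate database.
    metadata: dict = field(default_factory=dict)


class ToyObjectStore:
    """Hypothetical in-memory stand-in for an object store (not the Atmos API)."""

    def __init__(self):
        self._objects = {}  # flat namespace: object ID -> StoredObject

    def put(self, data, **metadata):
        object_id = uuid.uuid4().hex  # the "valet ticket" handed back to the app
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id):
        # The caller never says where the object lives, only which one it wants.
        return self._objects[object_id]

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            oid for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


store = ToyObjectStore()
ticket = store.put(b"scan bytes", Modality="MRI", Status="UserPaid")
obj = store.get(ticket)
```

The application keeps only `ticket`; there are no paths, mount points or volumes in the calling code, which is the property the speaker is describing.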
Then there is no need to modify an application that may be running out of space. And of course, I mentioned the scalability of it. It really is on a limitless scale; you just add new hardware when needed. It is a node-based architecture: we simply add more nodes and more locations, and it self-configures, so it is very easy to scale. It also automatically load balances as new objects are stored in the system, and you can scale up and down in a very elastic way. So as an IT department I can now operate as a service provider by giving self-service access to different business units, and I can actually manage and monitor what they use for storage resources.

A Single Storage Cloud: Unlimited Applications, Services And Users • Distributed object store –Best fit for unstructured data –Uses Object IDs with metadata in Blob storage –No RAID Groups, LUNs or File systems • Global namespace –Common view independent of location –Abstracts storage from the application –No need to recode apps – ever! • Multi-site active/active –Distributes objects & access across all sites –No dedicated replication or back up required –Instant access to data • https://accesspoint.yourcompany.com • Site 1, Site 2

So as I mentioned, Atmos acts as a single storage cloud. At its root, it is a distributed object store. You have a very large object storage system that can be distributed across multiple locations, but it presents itself as a single global system, managed through one pane of glass and accessed through a single global namespace. That is what makes it such a fit for unstructured data; and again, there are no RAID groups or LUNs, none of the traditional file system mechanisms involved in it. And it features that single global namespace, so you have a common view independent of the location.
Limitless Scale: Eliminate Storage Sprawl And Downtime • Scale out architecture –Node-based for instant scale –Flex out to public clouds –Performance scales linearly with capacity • Operationally efficient –Rebalance nodes to optimize performance –Redundancy and multiple access points –Flexible configurations to set SLAs • Self-configures, self-heals –Recognizes new capacity, sites, applications, and tenants instantly –Automated self-healing • One management view –Web-based –Aggregated alerting • Non-disruptively Add Apps, Users, Or Capacity • https://accesspoint.yourcompany.com • Site 1, Site 2, Site 3

It really abstracts storage from the applications, and that is really key in fields like health care, where interoperability is such a huge thing. Being able to abstract all of your image data from all of these different picture archive and communication systems means I can swap out a PACS or a new application and I don't have to migrate the data. I have basically abstracted that data from the applications, so I've got a single global archive that can act as an archive to multiple applications and types.

So I don't have to recode apps and I don't have to migrate my data. And it is a multi-site active architecture. As I said, in the traditional file system world you have mostly active/passive architectures: you have a primary data center and a secondary data center that would act as a failover in case I have major issues at my primary data center. Atmos works fundamentally differently in that it is multi-site active. Every site can actively serve content, and it distributes objects across all of the sites, so there is no dedicated replication, there is no backup required, and you always have instant access to the data, even surviving a site outage.

That architecture also lends itself very well to scaling. It's a scale-out, node-based architecture.
So as I said, you can simply add new sites and new capacity into the infrastructure on demand, and it will self-configure and be ready to incorporate that into the environment. So the performance scales linearly with capacity, and that makes it very efficient; you can rebalance nodes to optimize
performance, you have redundancy built into it, and multiple access points. You have flexible configurations to set SLAs. And again, as I mentioned, it self-configures and self-heals, and all of this comes with one management view.

Multi-Tenancy: Share Resources Across Tenants, Apps, And Users • Maximize storage utilization –Securely isolate and share across dept, app, users –Eliminate 'over-provisioning' • Simplify management –Set policies, SLAs and access across tenants –Aggregate view of resources and utilization • Improve IT agility –Provide instant access to storage –Empower tenants with self-service access • Department A / Atmos Policy A / Tenant A, Department B / Atmos Policy B / Tenant B, Department n / Atmos Policy n / Tenant n • Site 1, Site 2

So even if you have multiple locations, you have aggregated alerting for all of that, all Web based, and you can monitor the whole environment from a single pane of glass. And even when I am adding applications or users, it is not disruptive to the system whatsoever; it automatically self-configures when you add that capacity. And of course nothing is really a true cloud unless you can support multi-tenancy. So Atmos gives you the ability to share resources across different tenants, whether those are applications, users or locations, so you can really maximize your utilization of storage while you securely isolate the different tenants on the system.

This eliminates a lot of over-provisioning, because tenants can simply subscribe to the amount of capacity that they need. And if they don't need it any more, they can simply release it back into your storage pool. It simplifies management because it is self-service, and you can also set different policies and service levels across different tenants. If you are a service provider, for example, you could offer different storage service levels for different classes of tenants.
If you are an enterprise IT department, it enables you to act as a service provider, improving your agility by being able to provide instant self-service access to storage.
Metadata-Driven Policies: Automate Data Lifecycle Management • Automate data lifecycle management –Placement, retention, disposition and expiration –Use age and trend to drive to tier 2 storage • Improve storage efficiency –Set number, type and location of replicas –Synchronous and asynchronous options –GeoParity erasure coding 65% more efficient • Customize at object, tenant and system levels • Policy A: "Status=UserPaid", multiple copies, multiple sites • Policy B: "Modality=MRI", multiple copies retained 5 years

So another key feature of EMC Atmos is the fact that it includes metadata-driven policies, so you can automate the placement and retention of all of the content in the system. You are basically automating your information lifecycle management. You can set policies for placement, retention, disposition and expiration; you can use age and trend to drive content to a different tier of storage. As we always say, it is like disk to disk to somewhere else, based on a policy that I set. I'll have something in local storage for, say, 30 to 60 days, then I want to move it to my private cloud archive on premise for the next year or two, and then offload it to a third-party service provider for maybe another seven years according to some compliance mandate.

You can set that all by policy, and if you are working with an Atmos-powered service provider, you can have a hybrid model where you store locally for a certain amount of time, then automatically, by policy, push to an internal cloud archive and then push to an external third-party cloud archive on Atmos, and manage that whole process internally; again, as one global system, even when you are working with a third-party cloud provider. So you can customize all these settings at the object, tenant and system levels.
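The policy idea above can be sketched in a few lines. This is only an illustration of matching metadata to a policy and mapping age to a tier; it is not how Atmos expresses policies. The `Policy` class, the copy counts, and the exact day thresholds are assumptions chosen to mirror the slide's "Status=UserPaid" and "Modality=MRI" examples and the speaker's rough timeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    name: str
    match: dict                  # metadata key/value pairs that select this policy
    copies: int                  # assumed values; the slide only says "multiple copies"
    retain_years: Optional[int]  # None where the slide gives no retention period


POLICIES = [
    Policy("A", {"Status": "UserPaid"}, copies=3, retain_years=None),
    Policy("B", {"Modality": "MRI"}, copies=2, retain_years=5),
]
DEFAULT = Policy("default", {}, copies=1, retain_years=None)


def select_policy(metadata):
    """Return the first policy whose match criteria all appear in the metadata."""
    for policy in POLICIES:
        if all(metadata.get(k) == v for k, v in policy.match.items()):
            return policy
    return DEFAULT


def tier_for_age(age_days):
    """Map object age to a tier: about 60 days local, about 2 more years private."""
    if age_days <= 60:
        return "local"
    if age_days <= 60 + 2 * 365:
        return "private-cloud-archive"
    return "external-provider-archive"


mri_policy = select_policy({"Modality": "MRI", "PatientClass": "outpatient"})
```

Because the policy is chosen from the object's own metadata, nothing in the application has to change when an administrator edits the thresholds or adds a rule.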
Manage And Deliver Storage-As-A-Service: Transform From Cost To Value Center • Manage basic storage services –Set pre-defined policies & SLAs –Quota management –Granular metering, historical trending –Chargeback and billing support • Develop your storage service –Atmos Cloud Delivery Platform open source storage-as-a-service portal – instantly –Monetize your solution with web services • Transform IT –Automate basic IT storage tasks –Enable self-service access • 40+ Service Providers deliver their storage service on ATMOS

So the example that we show here is something like an MRI, where I can say I want this many copies stored in this many locations for this amount of time, and even automate the movement of that to different tiers of storage. Given the amount of data out there that people have to manage, it is crucially important to automate this. There is simply not going to be enough IT staff hired over the next decade to keep up with the amount of data and do this manually. So with all of that, Atmos allows you, whether you are a service provider or an IT department, to manage and deliver storage as a service.

And so for IT there is a lot of transparency that comes with this. You can prove your value to the business: you can manage your storage services very easily and very transparently, you can granularly meter what people are using, and you can do chargeback. If you are a service provider, you can integrate that chargeback with your billing applications and, as I said, you can actually deliver a storage service. So an enterprise IT department can transform IT from a cost center to more of a value center, and a service provider can get out the door with a cloud storage service very quickly.

And I even have to correct this line, because it is now over 50 service providers that deliver their storage service on EMC Atmos.
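The quota, metering and chargeback functions mentioned in this part of the talk can be pictured with a small model. This is a sketch under assumptions, not the Atmos Cloud Delivery Platform: `TenantMeter`, its methods, and the flat price per gigabyte are all hypothetical.

```python
from collections import defaultdict

GB = 1024 ** 3


class TenantMeter:
    """Tracks bytes stored per tenant against a quota (hypothetical model)."""

    def __init__(self, quotas):
        self.quotas = dict(quotas)    # tenant -> maximum bytes allowed
        self.used = defaultdict(int)  # tenant -> bytes currently stored

    def store(self, tenant, nbytes):
        if self.used[tenant] + nbytes > self.quotas[tenant]:
            raise ValueError(f"{tenant} would exceed its quota")
        self.used[tenant] += nbytes

    def release(self, tenant, nbytes):
        # Capacity a tenant no longer needs goes back to the shared pool.
        self.used[tenant] = max(0, self.used[tenant] - nbytes)

    def chargeback(self, price_per_gb):
        """Return a per-tenant charge for the capacity currently in use."""
        return {t: round(b / GB * price_per_gb, 2) for t, b in self.used.items()}


meter = TenantMeter({"dept-a": 10 * GB, "dept-b": 5 * GB})
meter.store("dept-a", 4 * GB)
meter.store("dept-b", 1 * GB)
meter.release("dept-a", 2 * GB)
bill = meter.chargeback(price_per_gb=0.10)
```

A real service would meter bandwidth and history as well as capacity, as the slide notes; the sketch keeps only current usage to show the shape of the idea.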
What is also key to EMC Atmos is that it is API-driven storage, and we do have a software development kit that provides a lot of different language bindings and code;
we have a whole community building up around this now. We support multiple access methods: like I said, Web services standards like REST, and traditional access methods such as CIFS and NFS. So Atmos, at the end of the day, can serve as a location-independent and application-agnostic archive.

Application Access Methods: Atmos SDK Provides Language Bindings, Code And Sample Apps • Access methods –Web services: REST/SOAP –Traditional access: CIFS, NFS, CAS • APIs –Atmos REST API –Native S3 API • Broad range of language bindings

You can have multiple applications use Atmos as their archiving target. Whether it is a traditional application using CIFS or NFS or a Web-based application using REST, Atmos can be the archiving target for it. And with that comes very easy, instant user access: you can get basic HTTP access to storage from any device. We have a product called GeoDrive, which allows you to create what is almost a cloud drive on a desktop and push content up into EMC Atmos very simply, or to create a Linux mount point, and there are some browser plug-ins.
Instant User Access • Basic http access –Upload and share files w/expiration –Anonymous URLs to share files • GeoDrive Windows and Linux –Drag, drop, stub, backup, recover, share –Atmos Windows cloud drive – e.g. G: –Atmos Linux mount point • Browser plug-ins –HTML5, AtmosFox, AtmosChrome

So all of this makes it very easy for mobile devices and client end points to get access to the cloud storage. And GeoDrive is a free product, a separate add-on product but free, for users to be able to do that. So let's take a look at some of the deployment options for it. Atmos, at the end of the day, is basically hardware and software that EMC sells for enterprises and service providers to build a private or a public cloud service. An enterprise can build a private cloud storage service, but then they can tap into one of those over 50 Atmos-powered cloud service providers and basically have a hybrid cloud model, where they store certain content privately but then federate out to public clouds.
Deployment Options

And there is some additional add-on software. I mentioned Atmos GeoDrive here on the bottom, but we also have the Atmos Cloud Delivery Platform, which allows either a service provider or an enterprise to build a storage-as-a-service offering and deploy it right out of the box. It is a total turnkey solution. Many service providers will customize and build their own self-service portals and management frameworks, but Atmos provides a way for service providers to get out the door very quickly with a cloud storage service and have all of that right away.
Atmos Add-On Software • Atmos Cloud Delivery Platform –A turnkey solution to deliver and manage storage-as-a-service –Offer self-service access and storage management –Manage and meter utilization and bandwidth per user –Integrate chargeback and billing • Atmos GeoDrive –Instant Windows and Linux access to any Atmos storage cloud –Creates a virtual drive (Windows) or mount point (Linux) –Enables users to upload/download/share files

And we kind of say it's crate to credit card within 90 days. There is some customization that is allowed with that, too. Service providers may use only certain functionality of the Cloud Delivery Platform, or they may build a totally custom solution, or they may just go right out of the box and put their own branding on the Atmos Cloud Delivery Platform. And the value of a cloud storage service is really the applications that are on top of it. So Atmos, through the power of the developer community it has built and its partner program, has already pre-integrated a lot of applications and solutions that work directly with EMC Atmos.
Private Or Public Storage Clouds • Build & manage a private storage cloud • Use one of the 40+ Atmos-powered public cloud services around the globe • Federation

And they fall into a couple of broad categories. At the top, of course, is just being able to write your own custom applications using these new development frameworks. But a big use case is archiving, backup and tiering content to the cloud, basically acting as an archive tier. There are also file sync and share applications with EMC Syncplicity and Oxygen Cloud, as well as medical image archiving, which you can take a step further by having mobile access to archived content.

Let me introduce some examples that showcase what Atmos has done in the real world. You are probably familiar with Vistaprint; they are a company local here in Massachusetts. They do a lot of custom marketing collateral and business cards, and the problem they were having a few years ago was explosive growth in the amount of content that they had to store, with applications and users globally distributed. They really needed to keep their costs low while providing a very high level of service.

But what they also needed to do was provide different levels of service by customer type. That was something they weren't really able to do. They couldn't tier the service for a company that might order business cards once every two years versus a customer that was interacting with their applications on a daily basis, printing different collateral and doing things on a more regular basis. They couldn't offer a tier of service and better monetize the service. So what
they did, to resolve their storage problem and gain the ability to tier the service, was implement EMC Atmos across three different locations, and that allowed them to start sharing their resources and distribute their objects across those multiple sites.

It also helped them accelerate the development of their Web-based application, because it is all browser based. They were then able to build SLAs for that and tier their service according to the user type, and that gave them more of a competitive edge. So being able to automate the service levels by customer type was a big thing for them, but they also just purely saved on storage costs: over $1.5 million a year in storage savings. Using an object-based platform, they could distribute across multiple locations, which is much more efficient. And because they were able to distribute that content across multiple systems without a lot of the replication that goes along with trying to do that with traditional file systems, they were able to save something like $300,000 in bandwidth costs.

Atmos Integrated Solutions • Custom applications • Archive to the cloud: SourceOne • Cloud backup and recovery: NetWorker • Content management & collaboration: DCTM • File tiering: CTA, DiskXtender • Medical imaging archive • Mobile and file sync and share • Server gateway • WAN optimization and fast file transfer

So not only were they more efficient on the storage side, but they are also using their network more efficiently. With that, if there is any other information that people need, I'd recommend that they go to the Atmos page at EMC dot com. There are also multiple Webcasts that they can view, and I would also head to Atmos online dot com, where you can not only access the Atmos blog but also use a developer sandbox that we have available at EMC on Atmos
online dot com, where you can actually interact with the EMC Atmos storage platform. You can work with developer tools, configure the storage space and do some development work there.

Atmos: Custom/Traditional Applications (Vistaprint)
Situation • Efficiently scale and manage 100%+ digital media growth • Applications and users globally distributed • Competitive pressures to retain low cost/high service • Provide different levels of service by customer type
Solution • EMC Atmos cloud storage across 3 distributed locations • Efficiently shared resources and distributed objects across sites • Accelerated development with Atmos Web Services & REST API • Automated SLAs by user type for competitive edge
Business Benefits • Automated service levels by consumer type • $1.4M / year storage savings • $300,000 / year bandwidth savings
EMC Atmos Resources & Tools • Datasheets, White Papers, Videos: www.emc.com/atmos • Webcasts: www.brighttalk.com/channel/7397 • Stay Up To Date: Twitter: www.twitter.com/emcatmos, Blog: www.atmosonline.com

We have offered that for different people to get familiar with the platform. And again, access to the developer community is available on Atmos online dot com as well, where you'll get access to all the different language bindings, code examples and developer forums. And with that, we can open it up for Q&A.
(Don Keefe): We are now going to move on to the Q&A portion of today's Webcast. If you have any questions for our speakers today, you can enter them by clicking on the questions tab, which is on the left-hand side of your screen, and clicking on the submit question button. We'll answer as many questions as time allows. OK, we did have a couple of questions that came in during the Webcast. The first one that we have here: this person would like to know what exactly is a global namespace.

George Hamilton: What exactly is a global namespace? Well, let's see. Think of the world of file systems: very simply, a namespace is an abstraction of a file system resource. Perhaps it is a file server, and in the old way of doing things, multiple file servers would each have their own namespace. So when developers wrote to a particular namespace, they would need to know the location of the data, what server it is on, so they could write to that particular namespace.

A global namespace virtualizes all of the different file system resources that are underneath; so again, it gets rid of the location dependency. This single global namespace abstracts every resource that is underneath it. The developer only needs to write to that single global namespace and doesn't care about the location of the data underneath. That is really the simplest way to say it.
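As a rough illustration of that answer, the sketch below puts one logical namespace in front of several storage locations. It is a toy model with hypothetical names (`GlobalNamespace`, `write`, `read`, `move`) and a deliberately simple placement rule, not a description of how Atmos places data.

```python
class GlobalNamespace:
    """One logical namespace in front of several storage locations (toy model)."""

    def __init__(self, sites):
        self.sites = {name: {} for name in sites}  # site -> {key: data}
        self._location = {}                        # key -> site currently holding it

    def write(self, key, data):
        # Drop any earlier copy so a key only ever lives in one place.
        if key in self._location:
            del self.sites[self._location[key]][key]
        # Placement is an internal decision; here, pick the emptiest site.
        site = min(self.sites, key=lambda s: len(self.sites[s]))
        self.sites[site][key] = data
        self._location[key] = site

    def read(self, key):
        # Callers pass only the key; the namespace resolves the location.
        return self.sites[self._location[key]][key]

    def move(self, key, new_site):
        """Relocate data without changing the key the application uses."""
        old_site = self._location[key]
        self.sites[new_site][key] = self.sites[old_site].pop(key)
        self._location[key] = new_site


ns = GlobalNamespace(["new-york", "uk"])
ns.write("report.pdf", b"v1")
ns.move("report.pdf", "uk")
content = ns.read("report.pdf")
```

The call to `read` is identical before and after `move`, which is the location independence the answer describes.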
(Don Keefe): OK, and the next question that we have here: the person would like to know, can you describe a little bit more about active/active architectures?

George Hamilton: Sure. I'll use a good example: we had a customer, the University of Illinois Health Science System in Chicago, and they have two locations with EMC Atmos running at both. If they were in an active/passive environment, their data center in Chicago would be their primary data center, and perhaps their data center that is located about 20 miles outside of Chicago would be their secondary data center.

So they would have to implement a replication and backup scheme that replicates everything to this second data center, and they would typically use snapshots and other technology to make sure that either data center can serve content in case of a failure. But if you do have some sort of outage that requires a failover, there is usually a process that you have to go through, and it can take a couple of hours before you can actually fail over to the other data center in an active/passive environment, depending on how you're configured. With active/active, what we do is distribute the objects across every node and every location in the infrastructure.

So there isn't a node or a location that is in passive mode; they are all active. Using load balancing, you can take requests from end users and serve the content from whichever data center is closest to the end user. You are getting more usable capacity that way, because both sites can serve content, but they can also survive a site failure. We use geo parity to stripe data, stripe (inaudible), across both locations in the infrastructure, or more than two data centers.

And regardless, even if you suffer a site failure, you can still serve content.
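The transcript does not describe the actual GeoParity layout, so the following is only an analogy: a single XOR parity fragment spread over three hypothetical sites, showing how an object can survive the loss of any one site without keeping a full second copy. The function names and the three-site split are assumptions for illustration.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data):
    """Split data into two equal halves plus one XOR parity fragment."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even length; trimmed again on recovery
    half = len(data) // 2
    first, second = data[:half], data[half:]
    return {"site1": first, "site2": second, "site3": xor_bytes(first, second)}


def recover(fragments, original_length):
    """Rebuild the data from any two of the three fragments."""
    first = fragments.get("site1")
    second = fragments.get("site2")
    parity = fragments.get("site3")
    if first is None:
        first = xor_bytes(second, parity)
    elif second is None:
        second = xor_bytes(first, parity)
    return (first + second)[:original_length]


data = b"object storage"
fragments = split_with_parity(data)
# Simulate losing site1 entirely and rebuilding from the other two sites.
survivors = {site: frag for site, frag in fragments.items() if site != "site1"}
rebuilt = recover(survivors, len(data))
```

In this toy layout the three fragments together occupy 1.5 times the size of the object, compared with 2 times for a full replica at a second site, which is the general kind of saving parity-based schemes aim for.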
So it is just a more efficient use of storage, because you don't have as much redundant storage and you don't have as much of the overhead that you get when you are trying to replicate from a traditional file system. And in the case of (UIL), they told us that they have been able to take down their primary data center in the middle of the day to do maintenance on something; they'll take down a system in the middle of the day and the end users don't even notice.

They said that is a huge help to them, because they don't have to come in on nights and weekends to do the routine maintenance; they can do it during the middle of the business day.

(Don Keefe): OK, and you mentioned geo parity, and that kind of ties into the next question that this person had. They said, I've heard that Atmos has a unique data protection scheme called geo parity; how is that different from RAID?

George Hamilton: Right, good question. Again, RAID at its core is parity. So if RAID is parity, what is geo parity? Well, it's parity with geographic distribution. If you are working with RAID, you are striping data across multiple disks within an array. In order to get additional redundancy you have to replicate, such as in an active/passive fashion. And you are not only replicating the data but
you are replicating parity; you are replicating any other content that is going to help you rebuild data in case you lose disks.

I mean, that's how it is set up. As you scale that, though, you are not only adding disk capacity for the actual storage of data, you are also growing the mechanisms by which you can recover that data. So you keep having to add more overhead as you try to scale out a RAID-based system. But when you have a single global system in an active/active architecture using geo parity, you can do that same sort of striping of objects across every node or location within the infrastructure, and you're not increasing overhead as you scale; it stays the same.

So it is much more efficient. The basic difference is single location versus more than one location. Geo parity allows you to do what RAID does at a disk level, but over a whole distributed infrastructure across multiple locations.

(Don Keefe): OK, and we do have time for one more question today. That question is: you had mentioned the REST API and S3. Do you have an SDK for these?

George Hamilton: Yes. As I mentioned, if you go to Atmos online dot com, you can access our developer network there. You can get actual access to a sandbox of Atmos to work with, and also get access to all of the software development tools, the developer forums and the language bindings; all of that is available online at www.atmosonline.com.

(Don Keefe): OK. As I did mention, we are out of time, but I would like to thank today's speakers, George Crump and George Hamilton, for taking the time to join us today. I would also like to thank today's sponsor, EMC, for making this event possible. And as always, I would like to thank you, the audience, for taking the time to join us today. This is (Don Keefe); have a great day.

END