The Object Evolution - EMC Object-Based Storage for Active Archiving and Application Development
This Technology in Brief, written by Taneja Group, examines the fast-changing world of archiving and development on the web, and how object-based storage for unstructured data provides benefits such as active archiving, global access, fast application development, and much lower cost compared to high computing and data protection costs of NAS.

Published in: Technology


Transcript

TECHNOLOGY IN BRIEF
THE OBJECT EVOLUTION: EMC OBJECT-BASED STORAGE FOR ACTIVE ARCHIVING AND APPLICATION DEVELOPMENT
NOVEMBER 2012

A few years ago, object-based storage made a huge splash on-premise with the promise of meaningful data relationships, information accessibility and strong compliance. It remains an important component for information management based on compliance and single-tenant architectures. However, the evolution of object-based storage has big implications for the cloud and unstructured data: new approaches to active archiving, web/mobile application development and a changing model for cloud storage service providers.

Object storage is optimal for the web. It has a very different architecture from file systems, which are frankly overkill for most cloud storage. On-premise can be a different story; having data close to hand under single-tenant access control is right for some data storage. But on-premise stored data requires that the enterprise maintain a primary data center, a cold data center for DR, replication, continuous data protection, and so on. Given the right set of needs this is a fine trade-off, of course, and we certainly do not counsel people to get rid of their internal data centers and redundant systems. However, cloud-based object architecture offers big benefits for storing unstructured data for active archiving, global access to data, fast application development and much lower cost compared to the high computing and data protection costs of on-premise NAS.

EMC has engineered Atmos to provide these capabilities and many more as a massively scalable, distributed cloud-based system. In this Technology in Brief we will examine the fast-changing world of archiving and development on the web, and how object-based storage is the best way to go for these monumental tasks.
When Object Trumps File

The go-to architecture for unstructured data has traditionally been an application-centric system containing the operating system, the application, and a NAS filer using hierarchical file architecture. This infrastructure works acceptably well in a slow-growth, consistent workload setting, although even then it is far too easy to add complexity along with additional systems and filers. However, business needs have evolved far beyond this sleepy storage model. Unstructured data now comprises a massive portion of large data growth, and hierarchical file systems are difficult to optimize and scale.

For example, file system-based storage requires near-constant provisioning. As storage requests grow (which they inevitably do), IT administrators must manually provision storage to meet the expanded requirements. Meanwhile, large volume and spiky workloads make provisioning both "up" and "down" an expensive and time-consuming proposition. And difficult provisioning is hardly the only problem: siloed data protection with individual backup, replication and archiving applications steadily raises OPEX.

Scaling is an issue as well. Large critical big data applications may warrant scale-out or scale-up file systems (which are challenges in and of themselves). Most do not rate this architecture, and instead reside on poorly scalable systems. The number of these systems grows as applications come online, making it even harder for IT and application owners to administer them and for users to get the value they need from the application. This already difficult scenario gets even worse when NAS storage is used for what is essentially a cloud use case, such as extending existing assets over the cloud.

Copyright The TANEJA Group, Inc. 2012. All Rights Reserved. 87 Elm Street, Suite 900, Hopkinton, MA 01748. T: 508.435.2556 F: 508.435.2557 www.tanejagroup.com

Figure: Traditional NAS infrastructure

In contrast to hierarchical file system-based storage silos, object-based storage opens up a whole new range of dynamic functionality. Object-based storage assigns unique object IDs to access data across all federated locations. This goes a long way towards eliminating traditional, time-consuming storage management tasks like LUN creation and RAID groups. Active archives and applications needing fast global access particularly benefit from global namespaces and location transparency. The flat, universal namespace allows global access to stored content from anywhere the distributed application runs. Applications can also efficiently associate metadata with stored objects without using a dedicated database. Sharing vast storage resources means application administrators do not need to modify application files. Object-based storage usually has elements of file systems in order to handle processes like file archiving, but it is not founded on that architecture and its drawbacks.

Object-based storage originally developed as a type of specialized NAS storage where the hierarchical system was replaced with an object-oriented system that made file storage far more secure and scalable. One of its most popular incarnations is still going strong today: Content-Addressable Storage (CAS). A subset of object-oriented storage, CAS ensures there is only one ID for any object. When the CAS object is retrieved, it can be hashed again and checked against its ID to verify identity. CAS de-dupes at the object level for copy control.
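The CAS behavior described above (one ID per object, re-hashing on retrieval, object-level de-duplication) can be sketched in a few lines of Python. This is a toy in-memory illustration, not Centera's implementation: the class name, the method names, and the choice of SHA-256 as the hash are all assumptions made for the example, since the brief does not say which hash function a given CAS product uses.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory content-addressable store (hypothetical, illustrative only)."""

    def __init__(self):
        self._blobs = {}  # content address -> stored bytes

    @staticmethod
    def address_of(data: bytes) -> str:
        # The object ID is derived purely from the content (SHA-256 is an assumption).
        return hashlib.sha256(data).hexdigest()

    def put(self, data: bytes) -> str:
        address = self.address_of(data)
        # Identical content always maps to the same address, so storing the
        # same bytes twice keeps a single instance (object-level de-dupe).
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hash on retrieval and compare with the ID to verify identity.
        if self.address_of(data) != address:
            raise ValueError("stored object does not match its content address")
        return data

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentAddressedStore()
id_a = store.put(b"quarterly-report.ppt contents")
id_b = store.put(b"quarterly-report.ppt contents")  # same bytes stored again
```

Both calls return the same ID and the store holds one copy. If the stored bytes were later altered, `get` would raise instead of returning data that no longer matches its address.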
TABLE: CONTRASTING FILE SYSTEMS WITH OBJECT STORAGE

Characteristic: Metadata
  File: File systems implement a centralized file layer metadata service that tracks directory structures, permissions, and on-disk locations of files. All file requests must access metadata first for permission and file information.
  Object: Object metadata is stored along with the object data to avoid metadata service bottlenecks. The object ID may also be used to uniquely verify and validate the data being stored.

Characteristic: Namespace
  File: File systems have built-in namespace constraints for files and directories they can store and manage. Hierarchical directory structures can become unwieldy, performing poorly at navigating large numbers of users or files.
  Object: Object storage provides a single flat namespace for objects. Replacing paths and filenames with object identifiers makes the address space practically infinite with very fast performance for users and applications.

Characteristic: Interaction
  File: File systems are designed to offer in-place editing and updating of files using sophisticated, yet highly complex, locking and synchronization mechanisms. These methods make it difficult to distribute or extend file systems across multiple locations.
  Object: Objects are inherently immutable once stored under a unique ID, and can be easily replicated and accessed globally. Programming for object storage leads to simpler, supportable, and more reliable programs.

Characteristic: Cloud Applications
  File: File systems present a real challenge for cloud-based archival management and mobile application delivery. Poor scalability, lagging performance, and complex application development make traditional file systems a poor choice for compelling new cloud usages.
  Object: Object stores are simple, clean and quick to access. Since objects are easily distributed, replicated, and globally accessible in the cloud, they are ideal for active global archives and distributed mobile applications.

Object-based storage, both on-premise and in the cloud, requires certain key capabilities. On-premise object storage has great benefits for local file storage, including multiple application access, massive scaling, high availability and, in some architectures, information governance as well.

  • Multiple application access. Applications simultaneously leverage the same centralized object-based storage infrastructure. This enables local object-based storage to execute application-specific archiving management attributes for a complete chain of information custody.
  • Massive scaling. Massive scaling is problematic with file-based archive solutions. As the file system reaches its maximum capacity, administrators must expand the entire system's operating system, file system and application in order to scale the archive. By contrast, object-based storage can expand in an open fashion into multiple petabytes due to its flat address space.
  • High availability. Object storage often archives data that has heavy retention and government requirements. In this environment, five-nines (99.999%) or higher availability is a necessity. Mirroring and parity help to protect availability; other beneficial features include self-healing, detecting and fixing soft corruptions in the background, and addressing hardware failures before they impact data availability.
  • Information governance. A subset of object-based storage, Content-Addressable Storage (CAS) is purpose-built for long-term defensible retention of fixed files and data. As opposed to other archival storage methods like tape or monolithic "tar" files that bundle data up and/or move it offline, CAS stores data as objects that can be strictly and individually managed for governance and compliance and yet remain actively accessible online.

Best Practices: Object and the Cloud

We strongly support on-premise object storage such as CAS for local space savings, performance and information governance. However, we find that object storage is roaring to life in the cloud, where cloud-based active archiving and application development require highly distributed, single-namespace storage for unstructured content. These critical use cases benefit far more from object-based storage than they do from traditional file systems. Let's look at best-practice architectural features for object-based storage in the cloud.

DATA AND METADATA

When data is stored as an object, a unique object identifier is created out of a single universal global namespace. The object ID is retained by the client application and used to subsequently retrieve that object. Objects can effectively live anywhere in the cloud-wide system without the storage client needing to know about actual data locations, file system structures or LUN details.
This provides a complete location transparency that serves to reduce intentional storage management and inherently supports globally distributed access by web and mobile applications.

Because of the location transparency provided by the object storage layer, objects can be automatically load-balanced across nodes, and replicated within and across sites without disrupting applications or users. Wide data distribution and federation can be managed through systematic policies to meet various service level goals for access, high availability, protection, cost and performance.

The object layer abstraction also provides a great benefit to applications that previously might have had to be intimately storage-aware to avoid running out of space, or had to otherwise actively manage data locations. Because applications written to leverage object storage don't have to embed rules or code specific knowledge of storage infrastructure details, they avoid having to be rewritten or re-architected for "changing" storage assignments as users spread, features expand, and data sets grow.

MULTI-TENANCY

Secure multi-tenancy is a key requirement of cloud object storage, which should support two levels of multi-tenancy: tenants and sub-tenants. Tenants are top-level entities that each have their own access points, security controls and master storage policies. Tenants share nothing with other tenants and are fully isolated. Every node gets assigned to a specific tenant; tenants do not share nodes, and therefore each tenant has its own dedicated access points and storage. Within a large company, a tenant could be set up for independently managed divisions or subsidiaries. In a service provider implementation, the tenant might be mapped to a broad storage service offering.
Sub-tenants are then created within each tenant with security controls and defined management policies assigned by the tenant. Each sub-tenancy defines a distinct storage environment with isolated management for its own users, object namespace, and defined shares. A sub-tenant within a company might correspond to a department, while a storage provider's sub-tenant might track to a specific client account.

This highly functional multi-tenancy capability makes it easy to create private sandboxes or implement a global content delivery scheme. With some planning, this scheme could enable large corporations to aggregate "big data" distributed across the enterprise.

ACCESS FROM ANYWHERE

In a cloud object storage service with a flat global namespace, an object can be accessed through any site (although for performance, policies might strive to replicate objects to sites closer to where they will be read). In addition, object storage for the cloud must present a broad range of access methods including both web services and traditional file services.

REST (and SOAP) web services are key APIs. REST is the most common cloud storage access method for browser and custom mobile applications. REST, an architectural style typically used over HTTP, was designed to optimize web-style remote access to "resources", and is an ideal match for object storage, where each object can be easily treated as a REST resource.

Figure: Typical cloud-based object storage deployment

POLICY DRIVEN MANAGEMENT

A key benefit of object storage is the ability to use metadata to drive automatic data management policies. Policies should support service levels, and should be triggered when data objects are created, when objects hit certain ages, or upon metadata updates.
Policies can control data protection operations including the number, type and target locations for replicas; inherent storage features for striping, compression and de-duplication; retention locks and automatic deletion; and shifting objects into different policies over time.
The policy mechanism should be highly flexible, targeting policies to any group of objects based on both system and user-defined metadata. Policies can be used to build service levels by defining the amount of replication, implement archive rules for compliance, and optimize capacity and performance as items age.

Primary Object Use Cases in the Cloud

Cloud-based archiving, particularly medical and file archiving, forms the primary use case for object-based storage. Web application development is surging forward, and Archive-as-a-Service and its providers round out the fastest-growing use cases.

PRIMARY USE CASE: ACTIVE ARCHIVES

Archived information is playing a more strategic role in workflows and business processes. On-premise archiving is essentially static and used to reduce storage costs, improve operational efficiency, retention and compliance, and enable the business to use archived data to make better business decisions. Cloud-based archiving retains elements of these features but adds new dynamic ones: instant access from any device, archive as a service, and federating to private or public cloud. Atmos provides both the static and dynamic features that massive active archives require.

  • Federate to public or private clouds. Federation enables companies to treat on-premise and cloud object storage as a single efficient infrastructure. Companies may pool distributed storage assets including data, applications and policies to take full advantage of the cloud's massive scalability and global access features. Federation also lowers cost and risk: application workloads run on cloud resources with a low execution cost, and if a cloud-based storage system goes down the distributed workload remains protected. Federation extends internal policies to cloud-based storage environments by applying existing policies and settings to cloud-based storage.
  • Use metadata to drive business and storage decisions. We expect the use of metadata to expand quickly to directly feed business exploitation processes, as well as support more automatic and intelligent storage management decisions. A singly managed distributed system that maintains directly accessible object metadata yields rich support for business decisions. Object-based storage also enables IT to automate information lifecycle management across the entire distributed data store, not just by storage silo. Policies should be flexible enough to be set at the object, tenant or system levels, to automate archive decisions, and to set and manage retention, expiration, and disposition.
  • Multi-tenancy for secure shared storage. Multiple applications can safely co-exist as separate tenants. Isolation by tenant protects security while enabling the sharing of system-wide resources and capacity. Multi-tenancy is also efficient since it is subscribed to a highly scalable pool of storage, which can flexibly up-scale and down-scale on demand.
  • Massive scalability. Unstructured data storage is growing so fast that traditional storage systems are straining purchase, maintenance and management resources to the brink. Distributed object-based architecture yields near-limitless scale. Object storage also allows for automatic load balancing whenever new objects are stored, which protects high performance across the entire distributed system.
  • Multi-site active/active. Multi-site active/active architecture is an important component of object-based storage, especially in the cloud. Cloud object storage systems span multiple sites and provide for multi-site direct access to objects through both synchronous and asynchronous replication. This model replicates between multiple storage nodes and sites, which not only increases distributed availability and content distribution, but also supports disaster recovery.
  • 7. Technology in Brief Archive-as-a-service. The most agile and flexible way for IT to deliver archive services is with the cloud model of self-service portals. This model manages and meters utilization and bandwidth and supports third-party chargeback. Within an enterprise this flexibility and instant storage relieves users of the temptation of using commercial cloud services simply because they can get the storage they need fast – even though security might not be in place. This approach also enables ISVs and MSPs to extend archive requirements and offerings. Reduce manual tasks and provisioning across multiple archives. Cloud-based archives must be easy to set-up and for reliability and consistency must not require long or deep manual configuration. They should also automate underlying complexities including security, audit, retention, performance, and capacity growth. Atmos provides these features and more, relieving the cloud administrator of enormous burdens. Distributed systems may be managed as a single entity with policies to automate hundreds of management and data protection tasks. And perhaps the most important of all, object-based systems like Atmos offer massive scalability of capacity and performance thanks to their unique architecture.FAST-GROWING USE CASE: WEB AND MOBILE APPLICATION DEVELOPMENTWeb and mobile applications development using unstructured data also has driving needs thatobject-based cloud storage meets. Web application development requires quick access to storageresources, test/dev environments capable of storing multiple copies of large data sets, and theability to test web applications in real-time online environments. These requirements areunderstandably hard to achieve in traditional using file-based storage systems.Applications written to leverage object storage won’t need to be rewritten or even taken offline asthe object storage seamlessly (or elastically) expands over time. 
Atmos provides the key capabilities that web application development requires, including location transparency, self-managing storage and REST APIs.

  • Enable instant access to data from any device. Web and mobile applications are inherently geographically distributed, yet file systems are usually limited in both effective access points (location) and the number of files that they can manage. Object-based storage abstracts its storage from physical locations, providing a secure access point in place of device-specific mount points. Web services APIs and file-based access allow approved users to easily access their archives from computers and a broad array of mobile devices. Integrated web services over REST and SOAP are key to this instant access. Other support components are file-based access (CIFS / NFS / IFS / CAS), and expanded access via ISV applications.
  • Self-managing storage. In traditional development, applications have often been hard-coded to specific data stores through pointers to identified LUNs or file system navigation paths. In contrast, object storage provides a clean mapping from application to data through a simple REST API with an immutable unique object ID for the stored object. This goes a long way towards eliminating traditional, time-consuming storage management tasks like LUN creation and RAID groups. Cloud owners may choose to extend self-management options to customers, making it simple for users to grow storage capacity on demand.
  • Broad API support. Cloud object storage is basically shared storage accessed through web-based services. Atmos' architecture supports rapid web application development with a broad API set including REST and S3. The REST API leverages HTTP operations on objects that are directly addressed, which reduces code complexity and provides the kind of easy, automatically distributed, protected, persistent storage the developer needs. In addition to the REST API, EMC Atmos also natively supports the Amazon S3 API. This provides customers with the ability to simply point S3 applications to Atmos and seamlessly migrate their applications to any of the more than 40 Atmos-powered public clouds around the globe.
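The "clean mapping from application to data" through an opaque object ID can be illustrated with a small in-memory sketch. It is not the Atmos REST or S3 API: `FlatObjectStore`, its methods, the node names and the metadata keys are all hypothetical, and a real service would expose comparable operations over HTTP. The point is only that the client keeps an ID and some metadata, while placement stays an internal detail of the store.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)
    node: str = ""  # physical placement, never exposed to clients


class FlatObjectStore:
    """Hypothetical flat-namespace store: clients see only IDs and metadata."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._objects = {}  # object ID -> StoredObject

    def create(self, data: bytes, **metadata) -> str:
        object_id = uuid.uuid4().hex  # opaque ID, no path or LUN encoded in it
        # Naive round-robin placement stands in for real load balancing.
        node = self._nodes[len(self._objects) % len(self._nodes)]
        self._objects[object_id] = StoredObject(data, dict(metadata), node)
        return object_id

    def read(self, object_id: str) -> bytes:
        return self._objects[object_id].data

    def find(self, **criteria) -> list:
        # Metadata lives with the object, so no separate tracking database.
        return [oid for oid, obj in self._objects.items()
                if all(obj.metadata.get(k) == v for k, v in criteria.items())]

    def rebalance(self, object_id: str, new_node: str) -> None:
        # Moving an object changes its placement but not its ID.
        self._objects[object_id].node = new_node


store = FlatObjectStore(["site1-node1", "site2-node1"])
oid = store.create(b"scan-bytes", kind="xray", patient="p-17")
store.rebalance(oid, "site2-node1")  # the application is unaffected
```

After the rebalance, `store.read(oid)` still returns the same bytes and `store.find(kind="xray")` still returns the same ID, which is the location transparency the section describes.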
EMC and Object-Based Storage

EMC first introduced Centera CAS for archiving in 2002. Centera offers five-nines data availability: its redundant array of independent nodes (RAIN) is interconnected via cube switches, protecting data across independent nodes in a cube. Mirroring and parity provide additional protection and availability.

Centera's CAS architecture keeps the retained data from being compromised or deleted before the end of its retention period. Centera assigns unique hash-code identifiers specific to each unique object including content elements, metadata, and data/metadata relationships. This inextricably links content elements with their metadata, which are stored within a flat address space, with no need for a separate database. This architecture ensures authenticity of the archived objects. Centera abstracts the unique objects from their generating applications and operating systems, which enables Centera to flexibly act as the single, highly optimized data store for previously siloed archives.

Centera retains single instances of archived objects. In the case of multiple users of the same file (such as a PowerPoint file sent over a distribution list), Centera retains metadata with information about each user's interaction with the file, but points to the single instance of the object. Cutting down on data copies dramatically reduces the quantity of archive storage.

Centera searches using metadata, rather than opening up the content objects on application-specific storage. This results in much faster and more efficient searches without using application cycles. This is possible because content and metadata stored on Centera are application, file and operating system independent, and Centera offers a search engine right in its repository.

Centera's content-based addressing integrates directly with application environments via APIs, with no need for kernel-level dependencies. This means that multiple applications can simultaneously use Centera, and that specific archiving management attributes (such as data aging and data protection) can be executed per application. These capabilities create a complete chain of custody once the data leaves the primary application to be archived on Centera. Media independence also leverages Centera's application support. Centera objects are independent of specific storage media and protocols, which means that the storage system can migrate to new storage media over time without disturbing the integrity of the archived objects. For long-term disk-based archiving, this represents significant risk mitigation and investment protection.

Centera architecture is highly scalable and self-managing. Traditional file systems scale based on the amount of stored data versus remaining available address space, which may not be much. As the file system reaches its maximum capacity, administrators must expand the entire file system including operating system, file system, and application in order to scale the archive. In contrast, Centera expands to petabyte-scale capacities due to its flat address space. It also leverages its architecture to distribute management controls across the entire archive infrastructure. For example, if a Centera disk or node fails, the archive cluster knows how to self-heal without manual intervention. This distributed management structure extends to cover the deployment, scaling, recovery and protection of all the archival objects being stored by Centera.

Centera optimizes archiving, information governance and compliance. Users may choose from 300 native, integrated archiving applications to manage archival needs for email, files, medical imaging, content management, video, voice, and more on the single Centera archiving platform. In addition, Centera offers Compliance Edition Plus for compliance and eDiscovery, and Governance Edition for data retention management.
Centera Compliance Edition Plus captures and preserves original content, protecting data and proving chain of custody for legal eDiscovery and litigation. Retention classes assign a logical reference to each electronic record object; policies enforce data retention and safe disposition.

Centera Governance Edition enforces internal policies for data retention and disposition. Policies may be organizational or application-specific, which improves corporate accountability, reduces the cost of eDiscovery and compliance, and proves the integrity of governance controls.

To the Cloud: Atmos Architecture

EMC's Atmos supports the same CAS API as Centera for seamless migration, and brings object storage into the cloud with massive scalability and geographic federation supported with multi-tenancy, cloud provisioning and global access features. While Atmos is readily leveraged to extend active global archives, it also offers an exceptional platform for web and mobile application development. Atmos even enables new opportunities for global "big" data aggregation and distribution.

Atmos is at heart a software storage system for building private and public cloud storage. Atmos implementations are available from EMC either already integrated into pre-packaged physical building blocks or as a virtual machine solution for VMware vSphere that can leverage other EMC or third-party storage resources. Additionally, there is a rich ecosystem of service providers providing Atmos as cloud Storage-as-a-Service directly. Any and all of these options can be federated together as needed within and across a given organization.

EMC uses REST and SOAP web services, and has also implemented file services on top of Atmos to serve underlying objects through the lens of either an NFS or CIFS file server. When NFS or CIFS shares are defined, they are assigned to specific Atmos nodes (or dedicated pairs for HA) and utilize the Atmos node's inherent Linux capabilities (leveraging an Installable File System with the FUSE extension). Layering a file system over Atmos imposes some constraints regarding universal access, but also enables both traditional and transitional applications and file system type usage.

EMC Atmos Windows and Linux users can also leverage the EMC GeoDrive add-on that installs on a single user workstation or server to provide remote virtual NFS/CIFS style access (over REST) to Atmos object storage. GeoDrive supports local caching of files for offline use and eventual synchronization on reconnection. One of the major benefits of GeoDrive is enabling a user to access large amounts of protected storage from anywhere. It can also be used for the disaster recovery of files pushed or mirrored into Atmos.

Atmos technically maintains a given piece of data as an object with associated metadata that includes the object ID, system and user-defined metadata fields and the internal object layout information (and parent/child information for objects saved through a file system "namespace" interface). Applications and users can store arbitrary metadata with each object that can be leveraged by group management policies. Policies can be created at the tenant level as a design scheme to provide various service levels of performance, access, and data protection based on some awareness of the multi-site architecture of the cloud implementation. They are then assigned to sub-tenants, who need not be aware of the underlying implementation, to apply as target service levels to their objects. For example, the power to explicitly enforce compression of image files (e.g. JPEGs) after a number of days would present a significant capacity optimization for a web-based application dealing with millions of images.

In addition to supporting compliance and retention policies, metadata can be used to drive automated file distribution, access control and data protection activities, optimizing for the appropriate level of data resiliency, performance and availability. For most applications, thoughtful use of user metadata can remove any need to implement a separate management tracking database for stored objects.
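As a rough illustration of the metadata-driven policy idea, including the example of compressing image files after a number of days, the following Python sketch selects objects by user metadata and age and returns the actions a policy engine would schedule. It is a minimal sketch under stated assumptions, not Atmos policy syntax: the `AgePolicy` and `ObjectInfo` types, the `content_type` metadata key, the 30-day threshold and the `"compress"` action label are all invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ObjectInfo:
    object_id: str
    created: datetime
    metadata: dict  # system and user-defined metadata stored with the object


@dataclass
class AgePolicy:
    name: str
    match: dict          # metadata key/values an object must carry
    min_age: timedelta   # policy fires once the object is at least this old
    action: str          # label for the action to schedule

    def applies_to(self, obj: ObjectInfo, now: datetime) -> bool:
        old_enough = now - obj.created >= self.min_age
        matches = all(obj.metadata.get(k) == v for k, v in self.match.items())
        return old_enough and matches


def plan_actions(objects, policies, now):
    """Return (object_id, action) pairs for every policy that applies."""
    return [(obj.object_id, policy.action)
            for obj in objects
            for policy in policies
            if policy.applies_to(obj, now)]


now = datetime(2012, 11, 30)
objects = [
    ObjectInfo("img-1", datetime(2012, 10, 1), {"content_type": "image/jpeg"}),
    ObjectInfo("img-2", datetime(2012, 11, 25), {"content_type": "image/jpeg"}),
    ObjectInfo("doc-1", datetime(2012, 1, 1), {"content_type": "application/pdf"}),
]
compress_old_images = AgePolicy(
    name="compress-jpeg-after-30-days",
    match={"content_type": "image/jpeg"},
    min_age=timedelta(days=30),
    action="compress",
)
plan = plan_actions(objects, [compress_old_images], now)
```

Only `img-1` is selected here: `img-2` matches the metadata but is too young, and `doc-1` is old enough but is not an image. Because the selection runs on metadata held with the objects, the application itself needs no tracking table for this.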
  • 10. Technology in Brief
Replication is controlled by automated policies which can mirror data objects at many points in an object’s lifecycle, both within and across multiple sites. Within a data center site, replication might for example be set to happen synchronously upon ingestion, while replication between sites might be set asynchronously and launched with an arbitrary delay to allow for data settling. Replications can be targeted to specific locations, or abstractly sent to “other” sites as the system decides.

For performance and availability, replicas are all active for read access (objects are inherently immutable so there is no issue with having to manage distributed locking mechanisms). Because Atmos is “multi-site active/active”, any site can fulfill new object write requests when the local primary site is unavailable.

In addition to full replication, EMC also provides an erasure coding option called GeoParity. Instead of keeping two or more full 100% copies, “9/12” erasure coding enables storing an “expanded” object containing only 33% additional encoded “redundant” data broken up into 12 segments. By using erasure coding, the original data can be reconstructed dynamically from any 9 of the 12 segments. These segments are distributed so that the object can survive (and even be accessed during) multiple failures. For greater protection there is also a “10/16” coding with a 60% capacity overhead. Erasure coding does impact access performance, especially at ingestion, but provides strong fault tolerance while consuming much less capacity than full replicas. Of course, policies can be written to convert replicated objects to erasure coded schemes as they age.

With object stores there is generally no need for low-level RAID or disk level protection, and Atmos is no exception.
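The capacity figures quoted above for GeoParity can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical and nothing here is an Atmos API. It assumes that an “m/n” scheme stores n equal-size segments, any m of which suffice to rebuild the object.

```python
from fractions import Fraction


def capacity_overhead(data_segments, total_segments):
    """Extra capacity stored, as a fraction of the original object size."""
    return Fraction(total_segments - data_segments, data_segments)


def tolerated_losses(data_segments, total_segments):
    """Number of segments that can be lost while the object stays recoverable."""
    return total_segments - data_segments


def replication_overhead(copies):
    """Extra capacity for full replication: each copy beyond the first costs 100%."""
    return Fraction(copies - 1)


if __name__ == "__main__":
    for m, n in [(9, 12), (10, 16)]:
        print(f"{m}/{n}: overhead {float(capacity_overhead(m, n)):.0%}, "
              f"survives loss of {tolerated_losses(m, n)} segments")
    print(f"2 full copies: overhead {float(replication_overhead(2)):.0%}")
```

Under that assumption, 9/12 works out to one third (about 33%) extra capacity with up to 3 segments lost, and 10/16 to 60% extra capacity with up to 6 segments lost, compared with 100% extra capacity for a single additional full replica.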
Upon hardware failures, replications and/or GeoParity across nodes (RAIN) combined with built-in node auto-healing features suffice to provide the full data protection as determined by the service level “policies” implemented for each type of data object. Atmos can withstand the loss of any disk, node, rack, or even site.

Atmos Pre-built Hardware Configurations

An EMC Atmos pre-configured hardware “appliance” consists of a rack/cabinet containing from 4 to 16 Atmos nodes in various configurations and disk capacities. Flexible configurations enable smooth scalability, and allow for mixes of capacity and performance in and across Atmos sites. An Atmos storage node consists of a 1GbE server front-end running the Atmos storage services connected to one or more SAS attached disk array enclosures (DAE), each containing 15 7200RPM disks of 1 to 3TB each. Every node runs all object storage services (the first two nodes in each site also run the site metadata locator service that indexes which node contains which objects), supporting tremendous horizontal system scalability.

EMC has also introduced their new Atmos G3 series for new levels of density and energy efficiency. G3-Dense-480 is the first in the Atmos G3 series and consists of 4, 6, or 8 nodes with 480 disks in 40U, and 3TB drives.

TABLE: ALIGNING TOP CLOUD USE CASES WITH EMC ATMOS

Use case: Medical Archiving
Challenge: Over 800 million medical imaging procedures a year require huge storage scalability; collaboration and compliance increase complexity.
Benefits: Vendor Neutral Archive (VNA) on Atmos: integrates with EMR/EHR and improves PACS for better patient care and collaboration, improves data lifecycle management, reduces IT costs, and preserves HIPAA compliance.

  • 11. Technology in Brief
Use case: File Archiving
Challenge: Corporate file sharing is popular with employees but syncing and sharing are hard to manage. Employees will frequently share files anyway over mobile devices, leaving corporations accountable for risky behavior.
Benefits: With EMC Sync & Share, users can securely share Atmos files across mobile devices, Linux and Windows. GeoDrive creates a Dropbox-like service that is secure and manageable, powered by Atmos’ fast performance. Atmos policies monitor changes to data and provide access control, benefitting regulated verticals like finance.

Use case: Archive as a Service
Challenge: Both the enterprise and storage service providers struggle to provide IT services to their respective customers. Provisioning, maintenance, and security are all difficult issues in traditional storage offerings.
Benefits: The Atmos Cloud Delivery Platform enables corporations and service providers to meter capacity, bandwidth, and usage across tenants. Provisioning is automated by tenant, and Atmos allows tenants to safely self-manage and access their own storage.

Use case: Managed Service Providers
Challenge: Many MSPs suffer from narrow profit margins because of the expense of delivering storage to customers. Managing multiple tenants, manual provisioning and maintaining service level agreements all cut into revenue and make it too expensive to add new storage services.
Benefits: Atmos lets MSPs efficiently offer storage as a service and better monetize new service offerings. MSPs can monitor capacity and usage for chargeback, reduce provisioning costs, and replace multiple tenant management systems with a single system. Dynamic scaling, high availability and security cost-effectively meet service level requirements.

Use case: Content-Rich Web Applications
Challenge: Traditional storage is a poor environment for Web application development, which needs highly scalable capacity for multiple large data sets, a secure environment for test/dev and application testing in real-time environments.
Benefits: Atmos provides location transparency for global applications and a highly mobile user base. The single namespace means that application developers never need to recode pathnames and locations, and do not need to code for limited storage environments. Self-management options make it easy for customers to provision their own storage, and REST APIs reduce application complexity.

Taneja Group Opinion

When on-premise archive solutions smoothly integrate with federated storage, then public and private clouds provide extensive scalability and global availability. Yet we see too many end-users treating the cloud as just another storage tier for low value retained data. This is a huge waste of cloud possibilities but we understand why it happens: cloud platforms with poor performance and delivery mechanisms can make cloud-based storage more trouble than it’s worth.

But when we talk about EMC Atmos we are not talking about a low-cost storage tier, far from it. We are describing the heart of business innovation based on highly secure and highly accessible global data stores. EMC’s long expertise with object-based storage has kept Centera relevant and has extended dynamic data management to the cloud with Atmos. The Atmos-fueled cloud replaces hierarchical file storage while allowing the secure flow of information between the data center, the distributed cloud, and global access points. Customers profit from greatly improved application and data delivery, and the deep business value inherent in their valuable data.
  • 12. Technology in Brief
When a company is dealing with geographic reach and large growing volumes of rich content, then they should look to object-based storage in the cloud. We fully support EMC in its push to scale capacity, performance, availability and management far beyond what traditional file systems are capable of, and more massively than ever before.

NOTICE: The information and product recommendations made by Taneja Group are based upon public information and sources and may also include personal opinions both of Taneja Group and others, all of which we believe to be accurate and reliable. However, as market conditions change and not within our control, the information and recommendations are made without warranty of any kind. All product names used and mentioned herein are the trademarks of their respective owners. Taneja Group, Inc. assumes no responsibility or liability for any damages whatsoever (including incidental, consequential or otherwise), caused by your use of, or reliance upon, the information and recommendations presented herein, nor for any inadvertent errors that may appear in this document.

Copyright The TANEJA Group, Inc. 2012. All Rights Reserved. 87 Elm Street, Suite 900, Hopkinton, MA 01748. T: 508.435.2556. F: 508.435.2557. www.tanejagroup.com