Content Lifecycle Management - Best Practices for Governance, Archiving, Compliance & Mining Unstructured Data - Caringo and Alfresco Software

See the full webinar here: http://www.alfresco.com/about/events/ondemand

This webinar discusses a full content lifecycle management solution that addresses all lifecycle needs, from content creation to archiving and retention.

Creation of unstructured or file-based data is growing faster than any other data type in organizations.

What is needed is a system that is easy to manage, scales to support the amount of file data being stored and preserves it for the long term.


Industries subject to government regulation of data retention and integrity need solutions that simplify compliance without significant overhead.


CAStor is a Content Addressable Storage solution from Caringo that integrates with Alfresco Enterprise Edition. CAStor's unique software approach creates high-performance, massively scalable clustered storage on standard x86 server hardware.

This provides customers with affordable content storage that can start at one terabyte and scale seamlessly into petabytes as their business grows.

Published in: Technology, News & Politics


  1. Caringo & Alfresco Complete Content Lifecycle Solution. Access, Store, Distribute. © 2009 Caringo, Inc.
  2. Managing File-Based Data as Content
     - Storing file-based data is as much an information management problem as it is an issue with the storage technology
     - The point where business and IT needs converge
     - Business need: protecting and preserving intellectual property and business-critical records for future benefit
     - IT need: implementing a cost-effective infrastructure that ensures the availability and integrity of file-based data
  3. Realities of File-Based Data
     - Unstructured data
       - Over 95% is "unstructured" [1]
     - Massive file growth
       - Up to 120% per year [2]
     - Low reuse of files [3]
       - 90% never accessed after creation
       - 65% of the files that are accessed are accessed only once [3]
     - Aging files occupying expensive storage
       - Software needed to migrate files to secondary storage
       - Added cost and complexity
     - Must meet compliance mandates
       - Secondary storage tier required
     [Chart: 90% of files never accessed again, 10% accessed; 65% of accessed files accessed only once]
     Sources: [1] IDC, The Expanding Digital Universe. [2] IDC, The Economic Impact of File Virtualization. [3] UC Santa Cruz, Measurement and Analysis of Large-Scale Network File System Workloads.
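To put the "up to 120% per year" growth figure from slide 3 in perspective, the sketch below compounds it over a few years. The 10 TB starting capacity and the five-year horizon are assumptions chosen for illustration; they do not come from the slides.

```python
def projected_capacity_tb(start_tb: float, annual_growth: float, years: int) -> float:
    """Compound a starting capacity at a fixed annual growth rate.

    annual_growth is a fraction, so 1.20 means 120% growth per year.
    """
    return start_tb * (1.0 + annual_growth) ** years


if __name__ == "__main__":
    # Hypothetical 10 TB file store growing at the slide's upper bound.
    for year in range(6):
        print(year, round(projected_capacity_tb(10.0, 1.20, year), 1))
```

At that rate the store more than doubles every year, which is the pressure the following slides attribute to conventional file systems.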
  4. File Storage Challenges
     - Today's storage requirements are different
       - Millions and billions of files on thousands of large disk drives
     - File systems simply cannot stretch any farther
       - The weight of layers of complexity and virtualization makes them brittle
       - They hit maximums on file size and number of files and servers
       - They encounter folder and drive letter problems
     - Newer file systems are high-maintenance
       - Even with layers of virtualization, underlying file systems must still be managed, migrated, backed up and maintained
       - Requires highly skilled administrators
     - Volume of file data is a major information management problem
       - Folder/sub-folder/file name becomes cryptic at scale (millions/billions)
       - File systems provide no informational context for files
  5. Next Gen: Internet Scale and Object-Based
     - Majority of capacity in commercial sector born as file-based, rich digital content
     - 5 key infrastructure requirements (Enterprise Strategy Group)
       - Infinite scale: in real time, dynamically, no human intervention
       - No boundaries: expand beyond walls of IT department
       - Operationally efficient: leverage commodity components, policy-based automation
       - Self-management: auto re-balance and optimize, no human intervention
       - Self-healing: withstand failures, automatically adjust/heal itself
     - Object-based storage (IDC)
       - 4 tests/criteria for technology
         - Self-referencing: unique address for each file/object
         - Described by metadata: beyond standard file system
         - Location independence
         - Dynamic presentation: not fixed to a traditional tree format
         - Intelligent replication/distribution
  6. Convergence: ECM & Content Storage
     - Effectively manage content from creation through storage and expiration
       - Alfresco2Caringo interface available at Alfresco Forge
       - Developed by XeniT
         - Alfresco and Caringo partner
     - Alfresco ECM manages business process & workflow
     - Caringo stores and protects business-critical content
     - Ensure content integrity and preservation
       - Preserve context of content for the long term
       - Accessible and available well into the future
  7. Covering the Complete Storage Workflow
     - Comprehensive solution to access, store and distribute
       - HTTP access for cloud storage and Web 2.0
       - Complete business solutions
       - Continuous data availability
       - Long-term data protection
       - Intelligent data replication for content distribution and disaster recovery
     [Diagram labels: Content File Server (CFS), Integrated Solutions, Native CAStor]
  8. CAStor Software Key Features
     - Runs on affordable and standard x86 server hardware
       - Delivers flexibility and choice
     - Massively scalable storage cluster
       - Start small and scale to billions of files or objects
       - As you grow from TBs to PBs, access bandwidth also grows
     - Increase capacity seamlessly
       - No disruption in operations or data availability. No migration!
     - Manages and repairs itself automatically and faster than RAID
     - Local and Wide Area Replication for DR and backup
     - Data protection for regulatory compliance and internal governance
       - WORM, integrity checking, authenticity, object-level retention, LifePoints
     - Rich metadata support
       - Attach and persist descriptive metadata with objects
       - Content in context
     [Diagram: CAStor Cluster, nodes 1 through n, GigE]
  9. Early File Management Challenge
     - 8dot3 in DOS days
       - Eight characters + extension
         - Example: C:\Directory\document.doc
       - Organizational challenge for even hundreds of files
         - Significant position coding schemes
         - Include fully qualified path and name on documents
         - Law firms still do this today
       - System metadata only, basic
         - Non-descriptive, not useful in organizing files
  10. Incremental Advancement
     - Long file names introduced in Windows in the mid-90s
       - Promise of better identifying files for organization and finding
     - 8dot3 turned into this:
       - My Documents\This is my document.doc
       - and…
       - My Documents\This is my document v2.doc
     - Folder/sub-folder hierarchy is cumbersome
     - File counts now in the millions and beyond
       - Remains difficult to manage, especially over time
       - Millions will turn into billions
     - File names still lack informational value
  11. CAStor Content Storage Software: Object-Based
     - CAStor ideally suited for file-based data storage
     - Supports rich metadata tags
       - System-generated metadata
       - Custom metadata
       - Descriptive information lives with the file
     [Diagram: an object is Metadata plus File Data (101000101010100111010101100010110100…110010), identified by a UUID content address]
     Example metadata, as returned in an HTTP response:
       HTTP/1.1 200 OK
       Date: Thu, 26 Jun 2008 21:26:34 GMT
       Server: CAStor Cluster/2.2
       CAStor-Application-Name: FinalCutPro
       CAStor-Create-Date: 2008-06-26 21:26:14.687000
       Castor-System-Cluster: Internet Demo Cluster
       Castor-System-Created: Thu, 26 Jun 2008 21:26:20 GMT
       Content-Disposition: inline; filename=Sports %Segment%206-26-08.mxf
       Content-Length: 8619354
       Content-type: application/mxf
       lifepoint: [Thu, 03 Jul 2008 21:26:14 GMT] reps=2, deletable=True
       lifepoint: [] delete
       Replica-Count: 2
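Because the metadata on slide 11 travels as ordinary HTTP headers, a client can read it with a standard header parser. The following is a minimal sketch using only the Python standard library; the function name `parse_object_metadata` is hypothetical, the header text is an abbreviated copy of the slide's example (status line omitted), and nothing here reflects an official Caringo client API.

```python
from email.parser import Parser
from typing import Dict, List

# Abbreviated copy of the headers shown on the slide (status line left out).
RAW_HEADERS = """\
Server: CAStor Cluster/2.2
CAStor-Application-Name: FinalCutPro
Content-Length: 8619354
Content-type: application/mxf
lifepoint: [Thu, 03 Jul 2008 21:26:14 GMT] reps=2, deletable=True
lifepoint: [] delete
Replica-Count: 2
"""


def parse_object_metadata(raw: str) -> Dict[str, List[str]]:
    """Group header values by lower-cased name, keeping repeated headers."""
    message = Parser().parsestr(raw, headersonly=True)
    metadata: Dict[str, List[str]] = {}
    for name, value in message.items():
        metadata.setdefault(name.lower(), []).append(value)
    return metadata


if __name__ == "__main__":
    meta = parse_object_metadata(RAW_HEADERS)
    print(meta["content-type"])
    print(meta["lifepoint"])
```

Values are kept as lists because the example repeats the `lifepoint` header; collapsing headers into a plain dictionary would silently drop one of the two entries.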
  12. CAStor Content Objects
     - Supports all types of digital content
     - Metadata values stored are specific to each individual type
     - Vast 128-bit address space
       - Never run out of UUIDs
       - Billions of objects
     - Define metadata values to drive replication and distribution
     [Diagram: four objects, each with its own metadata and content address: UUID1 (Video), UUID2 (Image), UUID3 (Audio), UUID4 (Doc)]
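The "never run out of UUIDs" claim on slide 12 follows from the size of a 128-bit address space, which a few lines of arithmetic make concrete. In this sketch Python's `uuid.uuid4` is only a stand-in for a 128-bit identifier (the slides do not say how CAStor generates its UUIDs), and the one-trillion object count is an assumed figure.

```python
import uuid

ADDRESS_BITS = 128
ADDRESS_SPACE = 2 ** ADDRESS_BITS  # number of distinct 128-bit values


def new_object_id() -> str:
    """Return a random 128-bit identifier as 32 hex characters (illustrative)."""
    return uuid.uuid4().hex


def addresses_per_object(total_objects: int) -> int:
    """How many unused addresses remain, on average, for each stored object."""
    return ADDRESS_SPACE // total_objects


if __name__ == "__main__":
    print(new_object_id())
    # Assumed cluster size: one trillion objects.
    print(addresses_per_object(10 ** 12))
```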
  13. Metadata Enables Intelligence
     [Diagram: object metadata passes through a Filter and Rules Engine]
       HTTP/1.1 200 OK
       Date: Thu, 26 Jun 2008 21:26:34 GMT
       Server: CAStor Cluster/2.2
       CAStor-Application-Name: SimpleCASg
       CAStor-Create-Date: 2008-06-26 21:26:14.687000
       Castor-System-Cluster: Internet Demo Cluster
       Castor-System-Created: Thu, 26 Jun 2008 21:26:20 GMT
       Content-Disposition: inline; filename=Car%20Chase%206-26-08.mxf
       Content-Length: 86193452
       Content-type: application/mxf
       lifepoint: [Thu, 03 Jul 2008 21:26:14 GMT] reps=2, deletable=True
       lifepoint: [] delete
       Replica-Count: 2
     - Metadata filtered for specific value(s)
     - Rule(s) fire when condition met
     - Example: if Content-type = MXF, then replicate to DR facility
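The filter-and-rule flow on slide 13 ("If Content-type = MXF then Replicate to DR Facility") can be sketched as a list of predicates over an object's metadata. Everything named here (`Rule`, `matching_actions`, the `replicate:dr-facility` action label) is hypothetical and only illustrates the idea; it is not the actual rules syntax of the CAStor Content Router.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metadata = Dict[str, str]


@dataclass
class Rule:
    name: str
    condition: Callable[[Metadata], bool]  # filter on metadata values
    action: str                            # hypothetical action label


def matching_actions(metadata: Metadata, rules: List[Rule]) -> List[str]:
    """Return the action of every rule whose condition the metadata meets."""
    return [rule.action for rule in rules if rule.condition(metadata)]


# The slide's example rule: MXF content is replicated to the DR facility.
RULES = [
    Rule(
        name="mxf-to-dr",
        condition=lambda m: m.get("content-type", "").lower() == "application/mxf",
        action="replicate:dr-facility",
    ),
]

if __name__ == "__main__":
    print(matching_actions({"content-type": "application/mxf"}, RULES))
    print(matching_actions({"content-type": "image/png"}, RULES))
```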
  14. Intelligent Content Replication and Distribution
     - CAStor Content Router (CR)
     - Policy-based replication to geographically separate sites
     - Policies driven by administrator-defined rules for specific metadata
     - Multiple replication and distribution topologies supported
       - 1:1, 1:M, M:1, M:M
       - Customize to meet specific needs
     - Replicate some or all files
     - Fully automated to reduce management effort for file replication
     [Diagram label: Video]
  15. Content Relationships in Storage
     - Relate elements of a specific project in an anchor stream
       - Simple list of UUIDs
       - Video, key image, audio, script
     - Add elements through business process/workflow
     - Persist relationships over the long term
     [Diagram: a mutable anchor stream, with its own metadata and UUID, lists UUID1 through UUID n; the listed objects are UUID1 (Video), UUID2 (Image), UUID3 (Audio), UUID4 (Doc), each with its own metadata and content address]
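Slide 15 describes an anchor stream as a simple, mutable list of UUIDs that ties a project's elements together. A minimal in-memory sketch of that idea follows; the `AnchorStream` class, its fields and the example project name are assumptions made for illustration, not a description of how CAStor stores anchor streams.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class AnchorStream:
    """A mutable list of object UUIDs belonging to one project."""
    project: str
    stream_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    members: List[str] = field(default_factory=list)

    def add(self, object_uuid: str) -> None:
        """Append an element once; repeated adds of the same UUID are ignored."""
        if object_uuid not in self.members:
            self.members.append(object_uuid)


if __name__ == "__main__":
    # Hypothetical project with a video and its key image.
    stream = AnchorStream(project="evening-news-segment")
    video_id, image_id = uuid.uuid4().hex, uuid.uuid4().hex
    stream.add(video_id)
    stream.add(image_id)
    stream.add(video_id)  # already present, so the list is unchanged
    print(stream.members)
```

Only the anchor stream changes as a workflow adds elements; the referenced objects keep their own UUIDs, which is what lets the relationship persist over time.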
  16. Caringo: Unified Content Infrastructure
     - Investment protection
       - Runs on standard x86 server hardware
       - Add new-generation server hardware at any time without disruption
     - Cost-effective scaling
       - Add capacity without interruption or need to provision storage
       - Scale from terabytes to petabytes in a single cluster
     - Operational efficiency
       - Self-managing and self-healing cluster minimizes administrative intervention
     - High-performance object storage
       - Easily address performance needs for small and/or large file workloads
     - Data protection & preservation
       - Archive unstructured data for the long term and address regulatory compliance
     [Diagram labels: Content File Server (CFS), Integrated Solutions, Native CAStor]
  17. A Winning Combination for Managing the Content Lifecycle
     - Experience the solution today
     - Get 4TB of CAStor software free
     - Go to http://www.caringo.com/downloadCAStor.html
  18. Thank You! Access, Store, Distribute
