Cloud File System Gateway & Cloud Data Management Interface (CDMI)



Author:                        Imran Khan, Solutions Architect, Calsoft Inc.
Presenter:                     Parag Kulkarni, VP Engineering, Calsoft Inc.

2012 Storage Developer Conference. © Calsoft Inc. All Rights Reserved.
Agenda

    Cloud Storage Industry Challenges
    Brief about CDMI
    Cloud File System (FS)
       Cloud FS Architecture
       Cloud FS Modules
       Cloud FS Solution
    Conclusion
    Q&A Session




Abstract
          Seamless extension of NAS to Cloud Storage using Cloud File System
Today NAS stores data on local disks and/or SAN disks. Most enterprises have sufficient file
storage capacity to run their day-to-day operations provided some older data is moved to
secondary storage. But since most data resides on primary storage (or secondary storage
within enterprise boundaries) it becomes necessary to extend storage capacity for NAS.

With cloud storage becoming more secure, accessible, easy to use and cost effective, it can
be considered as secondary storage for enterprise NAS (Hierarchical Storage
Management). We can even use cloud storage as primary storage by using enterprise storage
devices for caching to improve cloud data access throughput.

Adding CDMI based interfaces to cloud file system enables us to integrate with any cloud
storage provider and store file based data to cloud storage easily. Cloud File System
presented and implemented by Calsoft integrates with many cloud storage providers using
CDMI. This helps enterprises store file based data to cloud storage and provides throughput
similar to local NAS by using efficient caching techniques.
Learning Objectives
Challenges
  • Storing ever-growing data and optimally managing storage capacity
  • Hierarchical Storage Management across enterprise storage and cloud storage
  • Measuring and monitoring user guarantees and SLAs, and mapping them over multiple clouds
  • Optimizing storage capacity between on-premise and cloud storage pools
  • Migrating between cloud storage platforms

Solution
  • Adoption of cloud storage
  • CDMI: a move to build an open standard for storing data in the cloud
  • No impact on existing users and applications using NAS
  • Abstract policy engine to monitor and map the SLAs


Cloud Storage Industry Challenges

  • Access bandwidth, delay and disruption of service
  • Common interface to multiple clouds
  • Data security
  • Data transfer policy
  • Auxiliary features

Cloud Data Management Interface

CDMI
  • A protocol for self-provisioning, administering and accessing cloud storage.
  • It defines RESTful HTTP operations for assessing the capabilities of a cloud
    storage system and for exporting data via other protocols such as CIFS and NFS.

CDMI Benefits
  • Manages containers, domains and security access
  • Ease of monitoring and billing
  • Works for storage that is functionally accessible by legacy or proprietary protocols
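
As a rough illustration of the RESTful operations mentioned above, the sketch below only builds (and does not send) two CDMI-style requests: one that asks a server for its capabilities and one that stores a small text value as a data object. The host name, container name, helper function names and the chosen specification version are assumptions made for this example; a real provider may require a different version and authentication.

```python
import json

# Assumed specification version for this sketch; use what the provider supports.
CDMI_VERSION = "1.0.2"


def capabilities_request(base_url):
    """Describe a GET that asks a CDMI server for its root capabilities."""
    return {
        "method": "GET",
        "url": base_url.rstrip("/") + "/cdmi_capabilities/",
        "headers": {
            "Accept": "application/cdmi-capability",
            "X-CDMI-Specification-Version": CDMI_VERSION,
        },
        "body": None,
    }


def create_object_request(base_url, container, name, text):
    """Describe a PUT that stores a small text value as a CDMI data object."""
    body = json.dumps({"mimetype": "text/plain", "value": text})
    return {
        "method": "PUT",
        "url": f"{base_url.rstrip('/')}/{container}/{name}",
        "headers": {
            "Accept": "application/cdmi-object",
            "Content-Type": "application/cdmi-object",
            "X-CDMI-Specification-Version": CDMI_VERSION,
        },
        "body": body,
    }
```

The returned dictionaries can be handed to any HTTP client; keeping request construction separate from transport makes it easy to inspect what would go over the wire.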

Calsoft’s Cloud File System
       Leveraging CDMI to provide a common interface to interact with multiple clouds

 Cloud Access
     • File system interface to the cloud.
     • The file system caches data from the cloud to provide quicker access.
     • The file system interface is provided to clients using NFS, CIFS, etc.

 Cloud Request
     • Converts file system requests into cloud requests.
     • Converts the data objects back to a common file model.

 Multiple Cloud Framework
     • Enables interaction with multiple clouds while abstracting out many operations.
     • Dynamically changes support for various cloud vendors.
     • Provides a set of policies to control access patterns.
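
The caching idea under Cloud Access can be sketched as a read-through cache: serve a file from local storage on a hit, and go to the cloud only on a miss. This is a minimal illustration, not Calsoft's implementation; the class name, the `fetch_from_cloud` callable and the in-memory dictionary standing in for the local cache file system are all assumptions.

```python
class ReadThroughCache:
    """Serve reads from a local cache, fetching from the cloud on a miss."""

    def __init__(self, fetch_from_cloud):
        # Hypothetical callable taking a path and returning bytes.
        self._fetch = fetch_from_cloud
        # A dict stands in for the local cache file system in this sketch.
        self._local = {}
        self.hits = 0
        self.misses = 0

    def read(self, path):
        """Return the file contents, populating the cache on a miss."""
        if path in self._local:
            self.hits += 1
            return self._local[path]
        self.misses += 1
        data = self._fetch(path)
        self._local[path] = data
        return data

    def invalidate(self, path):
        """Drop a cached entry, for example after the cloud copy changes."""
        self._local.pop(path, None)
```

Only the first read of a path pays the cloud round trip; later reads are answered locally until the entry is invalidated.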

Cloud File System Architecture
 [Figure: Cloud File System architecture. Users 1 to n connect to a CIFS/NFS
 server. Behind it, the Cloud File System contains the Cloud Data Access Layer,
 user management, policy management, etc., on top of LVM, disk driver, RAID,
 etc., and local / SAN disks. A 3rd party cloud storage plug-in block (Cloud 1,
 Cloud 2) sits between the file system and the clouds. Connectors are labelled
 CDMI (to CDMI compliant cloud storage) and SOAP / REST / WebDAV (to Cloud 1
 and Cloud 2, CDMI non-compliant cloud storage).]
Cloud File System Modules

 [Figure: Cloud File System modules. User space: NFS/CIFS; CFS user space
 (command translation, local FS wrapper); cloud interface and policy engine;
 cloud plugins (S3, other). Kernel: NFS/CIFS; Cloud FS (CFS) with Layer 1 and
 Layer 2 functionality; Local Cache FS (LCFS) over ext3, reiser, other.
 Callouts: "Cloud interface and the policy engine", "CFS user space and local
 FS wrapper", "Command translation", "Cloud FS: from cache or not?"]

User Space Vs. Kernel Space
• Most user-space file systems (built
  on FUSE) are designed that way for
  ease of development and maintenance.
  Example: s3fs for Amazon S3.

• That ease does not mean that
  performance is guaranteed.

• A user-space FS makes one, and
  sometimes more than one, extra copy
  of the data.

• When using a cache, in case of a hit,
  FUSE will still need a context switch
  and data copy.

• Linus Torvalds: ‘I think that arguing
  that something _can_ be done with
  fuse, and thus _should_ be done
  with fuse is just ridiculous.’

• Populating the local cache can be
  done with just a command, so why pass
  a buffer the way FUSE does?

• Layered functionality inside the FS
  (for example, splitting) is easy to
  implement and could prove useful.
Policy Engine

   •    Pricing models that most clouds use –
          • Storage based              - $/GB
          • Request based              - $/1000 requests.
          • Data transfer based        - $/GB
   •    Other QoS parameters that determine choice of cloud
          • Easy provisioning
          • Multi-tenancy
          • Security
          • Reliability
   •    These parameters, especially pricing, are tracked by the service provider. There is no easy way for
        the user to track them.
   •    Also, there is no standard or specification that defines these parameters.
   •    The policy engine module is meant as an efficient solution:
          • It tries to define these parameters across multiple clouds
          • It monitors and keeps track of these parameters
          • It provides a rule-based framework to control access to these clouds based on the QoS
              they provide.
   •    In future, these QoS parameters could be standardized
        and made accessible via APIs, enabling users to program against them.
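
The three pricing models listed above can be combined into a single cost estimate per cloud, which a rule-based engine could then compare. The sketch below is only an illustration: the `Pricing` class, the function names, the cloud names and all prices are made-up assumptions, not real provider rates or part of the presented system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    storage_per_gb: float      # $/GB stored
    per_1000_requests: float   # $/1000 requests
    transfer_per_gb: float     # $/GB transferred


def monthly_cost(pricing, stored_gb, requests, transferred_gb):
    """Sum the storage, request and data transfer charges for one cloud."""
    return (pricing.storage_per_gb * stored_gb
            + pricing.per_1000_requests * requests / 1000
            + pricing.transfer_per_gb * transferred_gb)


def cheapest(clouds, stored_gb, requests, transferred_gb):
    """Return the name of the cloud with the lowest estimated cost."""
    return min(
        clouds,
        key=lambda name: monthly_cost(
            clouds[name], stored_gb, requests, transferred_gb),
    )


# Hypothetical price sheets, for illustration only.
CLOUDS = {
    "cloud1": Pricing(storage_per_gb=0.10, per_1000_requests=0.01,
                      transfer_per_gb=0.12),
    "cloud2": Pricing(storage_per_gb=0.08, per_1000_requests=0.05,
                      transfer_per_gb=0.20),
}
```

With these made-up numbers, a storage-heavy workload favours `cloud2` while a transfer-heavy workload favours `cloud1`, which is the kind of trade-off a policy rule would encode.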

Cloud File System Solution

     Industry Challenges and the Calsoft Solution

   • Access bandwidth, delay and disruption of service
       The policy engine in the cloud interface module can be used to distribute or
       replicate data across multiple clouds. Loss of service from one cloud will
       then not hamper access to the data.

   • Common interface to multiple clouds
       The plugins that interface with different clouds, supporting different
       communication protocols, can be written independently and loaded at run time.

   • Security
       The policy engine can also select different security algorithms for different
       clouds, which are applied to the data while it is sent over the wire. This is
       more efficient since it is out of band for a cache hit scenario.

   • Data transfer policy
       The policy engine is user controlled and XML based. The rules can be as
       simple or as comprehensive as needed.

   • Auxiliary features
       The cloud interface and plugins can do bookkeeping that can be used to verify
       the amount of data transferred and to compare the cost of that data transfer
       against the billed amount.
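
To make the "user controlled and XML based" policy idea concrete, here is a small sketch that loads placement rules from XML and decides which cloud a file goes to and whether it is replicated. The XML schema (element and attribute names), the cloud names and the functions are all hypothetical; the slides do not describe the actual rule format.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule file: the first matching rule wins, else the default.
POLICY_XML = """
<policy>
  <rule match-suffix=".log" cloud="cloud2" replicate="false"/>
  <rule match-suffix=".db" cloud="cloud1" replicate="true"/>
  <default cloud="cloud1" replicate="false"/>
</policy>
"""


def load_rules(xml_text):
    """Parse the policy into a list of rules plus a default placement."""
    root = ET.fromstring(xml_text)
    rules = [
        (r.get("match-suffix"), r.get("cloud"), r.get("replicate") == "true")
        for r in root.findall("rule")
    ]
    d = root.find("default")
    default = (d.get("cloud"), d.get("replicate") == "true")
    return rules, default


def place(path, rules, default):
    """Return (cloud, replicate) for a file path using first-match rules."""
    for suffix, cloud, replicate in rules:
        if path.endswith(suffix):
            return cloud, replicate
    return default
```

A rule set like this can stay as simple as a single default or grow to cover many file classes, without changing the code that applies it.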
Conclusion

 Cloud File System is an idea that takes into consideration
 current developments in the cloud world related to Data storage as a Service
 (DaaS).

 It anticipates how the infrastructure around cloud services and
 management is changing. The model is intended to improve
 performance and to enable seamless transitions across CDMI
 compliant and non-compliant clouds for large enterprises with
 very little hassle.




Presenter Biography




 Parag Kulkarni – VP Engineering, Calsoft Inc.
  • A veteran of the storage industry
  • More than 19 years of experience in architecting and developing products
  • Key strength lies in quickly understanding product requirements and translating them into
    architectural and engineering specs for implementation
  • Led the engineering team at Calsoft
  • Led the development of the Database Editions product at Veritas (Symantec)
  • A key contributing member at leading storage companies like Informix (IBM)
  • Master of Technology in Computer Science from IIT Roorkee
  • Degree in Industrial Management from University of Indore, India




Author Biography




 Imran Khan – Solutions Architect, Calsoft Inc.
  • A veteran of the storage industry
  • More than 8 years of experience in architecting and developing products
  • Has dealt with products ranging from backup and replication, SAN simulators and multipathing to
    SMI-S, filesystems, journaling and link aggregation protocols
  • Key strength is the ability to take a holistic view across stacks of different functionality and
    their interaction
  • Bachelor's degree in Computer Engineering from University of Pune, India




Thank You



                                     Questions & Answers


                                                    Contact info
                                                   Parag Kulkarni
                                      VP Engineering, Calsoft Inc.
                                          Email: parag.kulkarni@calsoftinc.com
                                                Phone: +1 (408) 834 7086

