Mark Cusack "Data Intensive Computing on the Amazon Cloud."

    Mark Cusack "Data Intensive Computing on the Amazon Cloud." - Presentation Transcript

    1. Structured Data Archiving: Key Issues
      • 9th July 2009 – CloudCamp, London
      • Mark Cusack, Principal Architect, RainStor
    2. Ideal Use Case: Application Retirement
      • [Diagram: usage over the application life cycle. Labels: Dev & Test; Implementation; Mission Critical Production System (on-premise); Cloud Bursting (e.g. reporting & peak processing); End-of-Life; Retirement]
    3. Data Volume Issues
      • How to transfer in terabytes of data?
      • Structured data can be massively compressed before uploading to a storage cloud, e.g. Amazon S3:
      Network Connection   Upload time (1 TB)   Compressed (40:1) upload time
      DSL                  166 days             4 days
      10 Mbps              13 days              8 hours
      1 Gbps               Less than a day      Less than an hour
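
The upload times above follow from simple arithmetic: payload size in bits, divided by the compression ratio, divided by the usable link rate. The sketch below is a rough illustration only. The function name `upload_seconds` and the `efficiency` parameter are hypothetical, and the slide does not state the link rates or protocol overhead it assumed, so this ideal-case calculation will not reproduce its figures exactly (it gives somewhat shorter times for the uncompressed 10 Mbps case, for example).

```python
SECONDS_PER_DAY = 86_400
TERABYTE = 10**12  # bytes, decimal definition (an assumption)


def upload_seconds(size_bytes, link_bits_per_second,
                   compression_ratio=1.0, efficiency=1.0):
    """Ideal transfer time in seconds.

    compression_ratio: 40.0 means the data shrinks 40:1 before upload.
    efficiency: fraction of the nominal link rate actually achieved
                (hypothetical knob; 1.0 means no overhead at all).
    """
    payload_bits = size_bytes * 8 / compression_ratio
    return payload_bits / (link_bits_per_second * efficiency)


# Compare raw and 40:1-compressed uploads of 1 TB on two of the links.
for label, rate in [("10 Mbps", 10e6), ("1 Gbps", 1e9)]:
    raw = upload_seconds(TERABYTE, rate)
    packed = upload_seconds(TERABYTE, rate, compression_ratio=40.0)
    print(f"{label}: {raw / SECONDS_PER_DAY:.2f} days raw, "
          f"{packed / 3600:.2f} hours compressed")
```

Lowering `efficiency` below 1.0 is one way to model real-world throughput when checking estimates like these against measured transfers.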
    4. Data Security Issues
      • How to maintain data privacy and integrity?
      • Theoretical solution: homomorphic encryption!?
      • Reality: encrypt data in transit (network pathways) and at rest (data rest-points)
      • Blind cloud storage
        • Key should be generated by the application owner
        • Data should be encrypted on-premise, prior to transfer to the cloud
      • Tamper-proofing and auditing
        • Keep digests of database files off-cloud
        • Practical in the case of application retirement (read-only)
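
The tamper-proofing idea on this slide (keep digests of the database files off-cloud, then compare later) can be sketched as follows. This is a minimal illustration, not RainStor's implementation. The function names `sha256_of_file`, `build_manifest` and `find_tampered` are hypothetical, and SHA-256 is an assumed choice of digest. It relies on the archive being read-only, as in the application retirement case, so any change in a digest signals tampering or corruption.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archive files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Record a digest per file; store this mapping off-cloud."""
    return {str(path): sha256_of_file(path) for path in paths}


def find_tampered(manifest):
    """Return the files whose current digest no longer matches the manifest."""
    return sorted(
        name for name, expected in manifest.items()
        if sha256_of_file(name) != expected
    )
```

In practice the manifest would be built on-premise before upload and checked against copies retrieved from the cloud; here both steps read local paths for simplicity.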
    5. Data Availability Issues
      • How to ensure that the data is always available?
      • Make multiple copies of the data
        • Data compression makes this cost-effective
      • Employ emerging cloud interoperability standards
        • De facto standards: Eucalyptus, Amazon Web Services
        • SNIA eXtensible Access Method (XAM): as an alternative cloud storage interoperability standard
    6. Data Query Issues
      • How to query the data without compromising performance, security and accessibility?
      • Use a compute cloud to query the data
        • E.g. run retired application reports on EC2 against S3 data
      • Query directly against compressed data
        • Problem becomes CPU-bound rather than IO-bound
      • Pass encryption key to the cloud on a per-query basis
      • Provide an ODBC/JDBC interface to compute instance
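
As a rough illustration of querying compressed data, the sketch below filters rows from a gzip-compressed CSV blob while streaming it, without first writing a decompressed copy. This is a generic stand-in: the slides do not describe how RainStor queries its compressed format, and gzip, CSV, and the names `make_archive` and `query_archive` are all assumptions made for the example. It does show the trade-off named above: fewer bytes are read from storage, at the price of CPU time spent decompressing.

```python
import csv
import gzip
import io


def make_archive(header, rows):
    """Serialise rows to CSV and gzip them, standing in for an archived table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return gzip.compress(buffer.getvalue().encode("utf-8"))


def query_archive(blob, predicate):
    """Yield matching records, decompressing the blob as a stream."""
    with gzip.open(io.BytesIO(blob), mode="rt",
                   encoding="utf-8", newline="") as text:
        for record in csv.DictReader(text):
            if predicate(record):
                yield record


# Example: a report over archived orders, selecting large amounts only.
archive = make_archive(
    ("id", "customer", "amount"),
    [("1", "alice", "10"), ("2", "bob", "250"), ("3", "carol", "75")],
)
for row in query_archive(archive, lambda r: int(r["amount"]) > 100):
    print(row["id"], row["customer"], row["amount"])
```

With S3 as the store, the blob would be fetched by a compute instance (for example on EC2) and scanned there, which matches the arrangement described on this slide.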
    7. More Information and Contact Details
      • www.rainstor.com
      • twitter.com/rainstor
      • twitter.com/markcusack
      • [email_address]

    Mark Cusack's lightning talk at CloudCamp London #4

    © All Rights Reserved
