Trusted Cloud Storage Tech Talk
 

    Presentation Transcript

    • Secure Cloud Storage and Computing Using Reconfigurable Hardware
      Victor Costan (龍望), Hsin-Jung Yang (楊昕蓉), Srini Devadas, Nickolai Zeldovich
    • Why Security Matters
    • Cloud Computing: Dreams and Reality
      • The Cloud: Ideal Picture
      • The Cloud: Reality
    • Cloud Storage: Attack Vectors
      • Hypervisor Bugs
      • State Manipulation
      • Hardware Attacks
    • Replay Attacks are Harmful
    • Spot the Differences
    • Spot the Differences: Job
    • Spot the Differences: Name, Relationship Status
    • Why It Matters
      • We rely on fresh data to make decisions
        – Google searches
        – Facebook profiles
        – Twitter, LinkedIn
      • Outdated data has a big impact on users
        – Wrong profile information: confusion, embarrassment
        – Old search results: bad business decisions, embarrassment
        – Old document versions: costly business decisions, regulatory issues
    • System Design
    • Design: Cloud Storage API
      • Block Device
        – Fixed block size (1 MB)
        – Write(block number, block)
        – Read(block number) → block
      • Easy to reason about the security
      • File systems operate on top of this abstraction
      [Diagram: disk divided into 1MB blocks B1, B2, B3, B4]
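The block-device interface on this slide can be sketched in a few lines of Python. This is only an illustration: the class name `BlockDevice`, the in-memory dictionary, and the choice to return zeros for unwritten blocks are assumptions, not part of the talk.

```python
from typing import Dict

BLOCK_SIZE = 1 << 20  # 1 MB, the fixed block size named on the slide


class BlockDevice:
    """In-memory stand-in for the cloud block device (hypothetical class)."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self._blocks: Dict[int, bytes] = {}

    def _check(self, block_number: int) -> None:
        if not 0 <= block_number < self.num_blocks:
            raise IndexError("block number out of range")

    def write(self, block_number: int, block: bytes) -> None:
        """Write(block number, block): store exactly one full block."""
        self._check(block_number)
        if len(block) != BLOCK_SIZE:
            raise ValueError("block must be exactly BLOCK_SIZE bytes")
        self._blocks[block_number] = bytes(block)

    def read(self, block_number: int) -> bytes:
        """Read(block number) -> block."""
        self._check(block_number)
        # Assumption of this sketch: unwritten blocks read back as zeros.
        return self._blocks.get(block_number, bytes(BLOCK_SIZE))


dev = BlockDevice(num_blocks=4)
dev.write(0, b"\x01" * BLOCK_SIZE)
first_block = dev.read(0)
```

Keeping the interface to two fixed-size operations is what makes the security argument tractable: every read or write touches exactly one leaf of the hash tree.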
    • Design: System Architecture
      [Diagram] Client, Internet (untrusted), Network Card (untrusted), CPU (untrusted), RAM (untrusted), Disk (untrusted), System Bus (untrusted), FPGA / ASIC with Secure NVRAM Chip (trusted)
    • Design: Trusted Storage on Untrusted Disks
      • 160-bit hash in trusted memory authenticates 1TB disk
      • Root hash matches iff all blocks match
      • 20 levels
      [Diagram, 4-block example]
        Root hash:                    h7 = h(h5 || h6)
        Nodes hash their children:    h5 = h(h1 || h2), h6 = h(h3 || h4)
        Leaves hash their blocks:     h1 = h(B1), h2 = h(B2), h3 = h(B3), h4 = h(B4)
        Disk divided into 1MB blocks: B1 B2 B3 B4
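The hash tree on this slide can be reproduced with a short Python sketch. SHA-1 is an assumption here (the slide only says the hash is 160 bits wide), and `root_hash` is a hypothetical helper name.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    """160-bit hash. SHA-1 is assumed; the slides only state the width."""
    return hashlib.sha1(data).digest()


def root_hash(blocks: List[bytes]) -> bytes:
    """Root of a binary hash tree over a power-of-two number of blocks."""
    if not blocks or len(blocks) & (len(blocks) - 1):
        raise ValueError("need a power-of-two number of blocks")
    level = [h(b) for b in blocks]  # leaves hash their blocks
    while len(level) > 1:           # inner nodes hash their two children
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


blocks = [b"B1", b"B2", b"B3", b"B4"]
h7 = root_hash(blocks)

# Changing any single block changes the root.
tampered_root = root_hash([b"B1", b"B2", b"stale B3", b"B4"])

# A 1 TB disk split into 1 MB blocks has 2**20 leaves, hence 20 levels.
levels = (2**40 // 2**20).bit_length() - 1
```

Only `h7` has to live in trusted memory; everything else can sit on the untrusted disk or in untrusted RAM.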
    • Design: Hash Tree Caching
      • The FPGA caches hash tree nodes
      • The untrusted OS is free to choose the caching policy, for maximum performance

      Node number  Hash                  Verified  Left child  Right child
      1            fabe3c05d8ba995af93e  Y         Y           N
      2            e6fc9bc13d624ace2394  Y         Y           Y
      4            53a81fc2dcc53e4da819  Y         N           N
      5            b2ce548dfa2f91d83ec6  Y         N           N

      [Diagram: hash tree with nodes 1; 2, 3; 4, 5, 6, 7]
    • Design: Hash Tree Cache
      • Server stores entire hash tree in RAM
      • FPGA has a cache that stores a subset of nodes
      • Server tells FPGA what nodes to store (cache management commands)

      Node  Hash   Verified
      1     fabe…  Y
      2     e6fc…  Y
      4     53a8…  Y
      5     b2ce…  Y

      [Diagram: hash tree with nodes 1; 2, 3; 4, 5, 6, 7]
    • Design: Hash Tree Cache - Load
      • Server tells the FPGA to load a node into a cache entry
      • The cache entry is unverified right after a load

      Before (cached nodes 1, 2, 4)     After (cached nodes 1, 2, 4, 5)
      Node  Hash   Verified             Node  Hash   Verified
      1     fabe…  Y                    1     fabe…  Y
      2     e6fc…  Y                    2     e6fc…  Y
      4     53a8…  N                    4     53a8…  N
                                        5     b2ce…  N
    • Design: Hash Tree Cache - Verify
      • Server tells the FPGA to use a node to verify its children
      • FPGA checks that parent’s hash matches children hashes

      Before (cached nodes 1, 2, 4, 5)  After (cached nodes 1, 2, 4, 5)
      Node  Hash   Verified             Node  Hash   Verified
      1     fabe…  Y                    1     fabe…  Y
      2     e6fc…  Y                    2     e6fc…  Y
      4     53a8…  N                    4     53a8…  Y
      5     b2ce…  N                    5     b2ce…  Y
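The Load and Verify commands from the last two slides can be modeled roughly as follows. This is a sketch under assumptions: SHA-1 as the hash, heap-style node numbering (children of node n are 2n and 2n+1, as the tree diagrams suggest), and the hypothetical names `HashTreeCache`, `Entry`, `load`, and `verify`. Eviction is left out.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict


def h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()  # assumed 160-bit hash


@dataclass
class Entry:
    node_hash: bytes
    verified: bool = False


class HashTreeCache:
    """Toy model of the FPGA's node cache (hypothetical class)."""

    def __init__(self, trusted_root_hash: bytes) -> None:
        # The root (node 1) comes from trusted memory, so it starts verified.
        self.entries: Dict[int, Entry] = {1: Entry(trusted_root_hash, True)}

    def load(self, node: int, node_hash: bytes) -> None:
        """Load: the entry is unverified right after a load."""
        if node in self.entries:
            raise ValueError(f"node {node} is already cached")
        self.entries[node] = Entry(node_hash)

    def verify(self, parent: int) -> None:
        """Verify: use a verified parent to check both of its children."""
        p = self.entries[parent]
        left = self.entries[2 * parent]
        right = self.entries[2 * parent + 1]
        if not p.verified:
            raise ValueError("parent must be verified first")
        if h(left.node_hash + right.node_hash) != p.node_hash:
            raise ValueError("children do not match the parent's hash")
        left.verified = right.verified = True


# The untrusted server holds the whole tree; here, a 4-leaf example.
tree = {4 + i: h(b"block %d" % i) for i in range(4)}
tree[2] = h(tree[4] + tree[5])
tree[3] = h(tree[6] + tree[7])
tree[1] = h(tree[2] + tree[3])

cache = HashTreeCache(tree[1])
for n in (2, 3):
    cache.load(n, tree[n])
cache.verify(1)           # nodes 2 and 3 become verified
for n in (4, 5):
    cache.load(n, tree[n])
cache.verify(2)           # nodes 4 and 5 become verified
```

The server decides which nodes to load and when, but it cannot make a stale hash pass `verify`, because the check always chains back to the trusted root.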
    • Design: Hash Tree Cache - Efficiency
      • Checking leaf 33 requires 10 node loads for a cold cache on this toy example (38 loads on the real FPGA tree)
      • Remember the root is always loaded in the cache
      [Diagram: tree nodes 1, 2, 3, 4, 5, 8, 9, 16, 17, 32, 33]
    • Design: Hash Tree Cache - Efficiency
      • Checking leaf 38 requires only 4 node loads, because 9 is already in the cache and verified
      • Server can predict client requests and manage cache for high performance
      [Diagram: tree nodes 1, 2, 3, 4, 5, 8, 9, 16, 17, 18, 19, 32, 33, 38, 39]
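The load counts on the two efficiency slides can be checked with a small sketch. It assumes heap-style numbering (the children of node n are 2n and 2n+1, the sibling of n is n XOR 1), which matches the node numbers in the diagrams. The function name `loads_for_leaf` is hypothetical.

```python
from typing import List, Set


def loads_for_leaf(leaf: int, verified: Set[int]) -> List[int]:
    """Nodes to load so that `leaf` can be verified.

    Walks from the leaf toward the root and stops at the first node that is
    already cached and verified. At each step both the node and its sibling
    are needed, because a parent verifies its two children together.
    """
    to_load: List[int] = []
    node = leaf
    while node not in verified:
        if node == 1:
            raise ValueError("the root must already be in the cache")
        for candidate in (node, node ^ 1):  # the node and its sibling
            if candidate not in verified:
                to_load.append(candidate)
        node //= 2                          # move up to the parent
    return to_load


cold_cache = {1}                            # only the root is present
first = loads_for_leaf(33, cold_cache)      # 33, 32, 16, 17, 8, 9, 4, 5, 2, 3
warm_cache = cold_cache | set(first)
second = loads_for_leaf(38, warm_cache)     # 38, 39, 19, 18
```

The second lookup stops at node 9, which is why a server that anticipates requests can keep the number of loads per access small.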
    • Results
    • Results: System Architecture
      [Diagram] Client, Internet (untrusted), Network Card (untrusted), CPU (untrusted), RAM (untrusted), Disk (untrusted), System Bus (untrusted), FPGA / ASIC with Secure NVRAM Chip (trusted)
    • Results: Server Prototype
    • Results: Normal Operation
    • Results: FPGA Board, Normal Operation
    • Results: Attack Does Not Impact User
    • Results: FPGA Board, Under Attack
    • Results: Performance Block Diagram
      1. Read / Write 1MB Data Block to Disk (Limit: Disk I/O Speed)
      2. Hash 1MB Data Block (Limit: Hash Engine Speed; Limit: FPGA Data Bus)
      3. Load & Verify Hash Tree Nodes (Limit: Hash Engine Speed; Limit: Dependencies)
      4. Update Hash Tree, Writes Only (Limit: Hash Engine Speed; Limit: Dependencies)
      5. HMAC (Sign) Result (Limit: Hash Engine Speed)
    • Results: Prototype Performance (est.)
      [Performance block diagram shown alongside]

      Disk I/O         Throughput
      7,200 RPM HDD    70 MB/s
      10,000 RPM HDD   100 MB/s
      15,000 RPM HDD   130 MB/s
      SSD              250 MB/s

      1 MB = 1 block
    • Results: Prototype Performance (est.)
      [Performance block diagram shown alongside]

      Operation             Throughput
      Block Hash            800 MB/s
      Pipelined Block Hash  3,200 MB/s

      Transport             Throughput
      PCI Express x16       4,096 MB/s
      SATA II               384 MB/s
      PCI Express x1        250 MB/s
      Ethernet              125 MB/s

      1 MB = 1 block
    • Results: Prototype Performance (est.)
      [Performance block diagram shown alongside]

      Operation                  Throughput
      Tree Node Hash             1.25 M/s
      Pipelined Tree Node Hash   5.0 M/s
      Tree Operations            62.5 k/s
      Optimized Tree Operations  2.5 M/s

      Transport                  Throughput
      PCI Express x16            4,096 MB/s
      SATA II                    384 MB/s
      PCI Express x1             250 MB/s
      Ethernet                   125 MB/s

      1 MB = 1 block
    • Results: Prototype Performance (est.)
      [Performance block diagram shown alongside]

      Operation                 Throughput
      Tree Node Hash            1.25 M/s
      Pipelined Tree Node Hash  5.0 M/s
      Tree Operations           62.5 k/s

      Transport                 Throughput
      PCI Express x16           4,096 MB/s
      SATA II                   384 MB/s
      PCI Express x1            250 MB/s
      Ethernet                  125 MB/s

      1 MB = 1 block
    • Results: Prototype Performance (est.)
      [Performance block diagram shown alongside]

      Operation        Throughput
      Node HMAC        1.25 M/s

      Transport        Throughput
      PCI Express x16  4,096 MB/s
      SATA II          384 MB/s
      PCI Express x1   250 MB/s
      Ethernet         125 MB/s

      1 MB = 1 block
    • Results: Performance Block Diagram
      • Steps are performed in parallel (pipelined), because they are in different system components
      • However, the slowest step is the bottleneck for the entire system
      • Each step can be made faster by adding more hardware (e.g. more disks), assuming cache policies can scale up
      [Performance block diagram shown alongside]
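The bottleneck argument is simple arithmetic: a pipeline runs at the speed of its slowest stage. The sketch below uses three of the throughput figures quoted on the performance slides; which transport the prototype actually uses, and the assumption that disks scale linearly, are illustrative choices rather than statements from the talk.

```python
from typing import Dict, Tuple


def bottleneck(stages: Dict[str, float]) -> Tuple[str, float]:
    """Return the slowest stage and its throughput in MB/s."""
    name = min(stages, key=stages.get)
    return name, stages[name]


# Figures from the slides (MB/s); the stage selection is an assumption.
stages = {
    "disk (7,200 RPM HDD)": 70.0,
    "block hash": 800.0,
    "PCI Express x16": 4096.0,
}
single_disk = bottleneck(stages)

# Adding disks moves the bottleneck, assuming throughput scales linearly.
with_12_disks = dict(stages)
with_12_disks["disk (7,200 RPM HDD)"] = 12 * 70.0
twelve_disks = bottleneck(with_12_disks)
```

With one disk the disk is the limit; with twelve (840 MB/s in aggregate) the unpipelined block hash at 800 MB/s becomes the limit instead.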
    • Results: Ping-Pong Workload
      • Typical collaboration scenario
      • Real-Life
        – Google Docs
        – Facebook Messages
        – Dropbox
      • Straight-up LRU shines here
      [Plot: block accessed (0 to 10) vs. time (0 to 20)]
    • Results: Photo Gallery Workload
      • Modeled after data on photo applications
      • Real-Life
        – Facebook’s #1 Feature
        – Google Picasa
        – Flixster
      • Special policy inspired by Facebook Haystack classifies photos, loads cache predictively
      [Plot: block accessed (0 to 10) vs. time (0 to 20)]
    • Results: Map-Reduce Workload
      • Index-generating Map-Reduce
      • Real-Life
        – Google PageRank
        – Facebook friend graph (EdgeRank)
      • Special policy that takes advantage of Map-Reduce access pattern
      [Plot: block accessed (0 to 30) vs. time (0 to 10)]
    • Results: Cache Hit Rates
      • Applications: 2 users collaborating on a file (ping-pong), photo gallery browsing, Map-Reduce job
      • Cache policies: Speculative Least-Recently Used (Spec LRU), Facebook Haystack’s policy optimized for caching (Haystack), policy optimized for Map-Reduce access patterns (MR-Aware)
      • Conclusion: no policy works well on all applications, so app server must drive policy
      [Chart: cache hit rate (0.5 to 1) per application for Spec LRU, Haystack, and MR-Aware]
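To make the "no policy works well on all applications" point concrete, here is a plain LRU hit-rate simulator. It is not the speculative LRU, Haystack, or MR-Aware policy from the talk, and both access traces are made up for illustration.

```python
from collections import OrderedDict
from typing import Iterable


def lru_hit_rate(trace: Iterable[int], capacity: int) -> float:
    """Fraction of accesses served from a plain LRU cache of `capacity`."""
    cache: "OrderedDict[int, bool]" = OrderedDict()
    hits = total = 0
    for key in trace:
        total += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used
            cache[key] = True
    return hits / total if total else 0.0


# Hypothetical traces: two collaborators bouncing between two blocks,
# versus a one-pass sequential scan over 20 blocks.
ping_pong = [3, 7] * 10
scan = list(range(20))

ping_pong_rate = lru_hit_rate(ping_pong, capacity=2)  # 18 hits out of 20
scan_rate = lru_hit_rate(scan, capacity=2)            # no block repeats
```

LRU does well on the alternating trace and gets nothing from the scan, which is the kind of gap that motivates letting the application server choose the policy.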
    • Results: Protocol Overhead
      • Client – Server bandwidth overhead: 0.002%
        – Operation: 1 HMAC (20 bytes) per 1MB = 0.002%
        – Handshake: extra secret exchange piggybacks on SSL: 5%
      • Latency overhead (1 client): 4%
        – Without security: 8.2ms / request
        – With security: 8.5ms / request
        – Latency overhead = the latency of a very fast Internet hop
      • No throughput overhead (N clients)
        – With or without security: 100MB/s
        – Need 40 HDDs to saturate PCI-E x16, 52 HDDs to saturate FPGA
      MIT COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE LABORATORY
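The percentages on this slide follow from the quoted figures, as the short calculation below shows. The 100 MB/s per-disk figure used for the PCI-E estimate is an assumption taken from the throughput quoted on the same slide; the 52-HDD figure for the FPGA is not derived here.

```python
HMAC_BYTES = 20          # one HMAC per operation
BLOCK_BYTES = 1 << 20    # 1 MB block

# Bandwidth overhead of one HMAC per 1 MB block, in percent.
bandwidth_overhead_pct = 100 * HMAC_BYTES / BLOCK_BYTES

# Latency overhead for a single client, in percent.
latency_without_ms = 8.2
latency_with_ms = 8.5
latency_overhead_pct = (
    100 * (latency_with_ms - latency_without_ms) / latency_without_ms
)

# Disks needed to fill a PCI Express x16 link, assuming 100 MB/s per disk.
PCIE_X16_MB_PER_S = 4096
hdds_for_pcie_x16 = PCIE_X16_MB_PER_S // 100
```

The results come out at roughly 0.0019%, 3.7%, and 40 disks, which round to the 0.002%, 4%, and 40 HDDs stated on the slide.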
    • Results: Protocol Overhead
      • Protocol is simple enough to implement on browser side
        – Chrome
        – Firefox
        – Internet Explorer 10
      • Easy integration in existing Web applications
      • End-to-end security
    • Thank You! Questions?
    • Other Applications
      • FPGA can be used to load user-specified circuits and perform arbitrary computation with security guarantees
      • Applications: encrypted image search, financial calculations
      • Potential applications in highly regulated industries, e.g. medical record keeping and processing, secure financial services
    • Secure Computation: Overview
      [Diagram] Cloud Task, Trusted Machine
        Untrusted computation: VM image → CPU cores
        Trusted computation: Circuit spec → FPGA LUTs
      • Most code is untrusted, executes in a VM
      • Trusted code is broken up into kernels which become circuits deployed onto an FPGA
      • If efficiency is not an issue, deploy a processor on the FPGA, execute software securely
      MIT COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE LABORATORY 6/9/2011
    • Secure Computation: Challenge
      • Multi-tenancy is the key to the cloud’s cost effectiveness
      • FPGA can host different applications running in parallel
      • Challenge: isolation between applications, just like a hypervisor
      [Diagram] VM Hypervisor hosting Client 1 VM, Client 2 VM, Client 3 VM; PCI Express; FPGA controller hosting Client 1 Application, Client 2 Application, Client 3 Application
    • Design: FPGA Boot Sequence
      1. Message: random nonce
      2. Message: PKcard + Manufacturer Certificate
         Action: check certificate against e-fuses; check PKcard against certificate
      3. Message: PUFsyndrome + Sign_PKcard(PUFsyndrome)
         Action: compute SKfpga from PUFsyndrome
      4. Message: Root Hash + Sign_PKcard(nonce || Root Hash)
         Action: verify signature
      5. Message: Enc_SKfpga(SKcard) + MAC_SKfpga(nonce || SKcard)
         Action: verify MAC
    • Design: Client Trust Model
      • Each FPGA – NVRAM pair has an Endorsement Key (EK)
      • Manufacturer certifies the public EK
      • Client uses the public EK to encrypt an HMAC key, which becomes its shared secret with the trusted hardware
      [Diagram] Manufacturer signs the Endorsement Certificate for PubEK; the client verifies the certificate, generates an HMAC key, and encrypts it with PubEK; the encrypted HMAC key is decrypted with PrivEK to recover the HMAC key
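Once the client and the trusted hardware share an HMAC key, the client can check each signed result on its own. The sketch below shows only that symmetric step with Python's standard `hmac` module. The message layout (nonce, block number, block hash), the use of HMAC-SHA1, and the function names are assumptions; the public-key encryption of the HMAC key under PubEK is omitted because it needs an asymmetric crypto library.

```python
import hashlib
import hmac
import os


def sign_read(hmac_key: bytes, nonce: bytes, block_number: int,
              block: bytes) -> bytes:
    """What the trusted hardware might compute for a read (hypothetical)."""
    message = (
        nonce
        + block_number.to_bytes(8, "big")
        + hashlib.sha1(block).digest()
    )
    return hmac.new(hmac_key, message, hashlib.sha1).digest()


def client_verify(hmac_key: bytes, nonce: bytes, block_number: int,
                  block: bytes, tag: bytes) -> bool:
    """Client-side check that the returned block is the one that was signed."""
    expected = sign_read(hmac_key, nonce, block_number, block)
    return hmac.compare_digest(expected, tag)


hmac_key = os.urandom(20)   # the shared secret generated by the client
nonce = os.urandom(16)      # fresh per request (assumed, to block replays)
block = b"current contents of block 7"

tag = sign_read(hmac_key, nonce, 7, block)
fresh_ok = client_verify(hmac_key, nonce, 7, block, tag)
stale_ok = client_verify(hmac_key, nonce, 7, b"older contents", tag)
```

The tag is 20 bytes, matching the "1 HMAC (20 bytes) per 1MB" figure on the protocol overhead slide.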
    • Design: Hash Tree Security
      1. Impossible to come up with a block B1’ such that B1 ≠ B1’ but h(B1) = h(B1’)
      2. Impossible to come up with a node hash h1’ such that h1 ≠ h1’ but h(h1||h2) = h(h1’||h2)
      Therefore, the root hash authenticates the entire contents of the tree.
    • Design: FPGA Boot Sequence Security
      • Server OS transfers messages between FPGA and Trusted Memory → untrusted channel
      • FPGA authenticates Trusted Memory using Manufacturer Certificate, whose public key is burned into FPGA’s e-fuses
      • Trusted Memory authenticates FPGA using its Physically Unclonable Function (PUF)
      • At manufacturing time, FPGA is paired with memory chip
      • FPGA can be paired with new memory chip if necessary
    • Design: Hash Tree Cache Security
      • Server OS responsible for loading and verifying tree nodes
      • Parent node hash verifies children nodes
      • Reading a block requires the block’s leaf to be verified
      • Writing a block requires the path from the block’s leaf to the root to be loaded and verified
      • A node can be loaded in at most one cache line, to prevent replay attacks using stale node hashes
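The write rule above exists because a write changes every hash on the path from the leaf to the root. The sketch below shows that path update on a full in-memory tree. It assumes SHA-1 and heap-style numbering, uses hypothetical helper names, and skips the verification of the path that the FPGA would perform before accepting the update.

```python
import hashlib
from typing import Dict, List


def h(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()  # assumed 160-bit hash


def build_tree(blocks: List[bytes]) -> Dict[int, bytes]:
    """Full hash tree for a power-of-two number of blocks; node 1 is the root."""
    n = len(blocks)
    tree = {n + i: h(block) for i, block in enumerate(blocks)}
    for node in range(n - 1, 0, -1):
        tree[node] = h(tree[2 * node] + tree[2 * node + 1])
    return tree


def write_block(tree: Dict[int, bytes], leaf: int, new_block: bytes) -> bytes:
    """Update one leaf and recompute hashes up to the root; returns the new root."""
    tree[leaf] = h(new_block)
    node = leaf // 2
    while node >= 1:
        tree[node] = h(tree[2 * node] + tree[2 * node + 1])
        node //= 2
    return tree[1]


blocks = [b"B1", b"B2", b"B3", b"B4"]
tree = build_tree(blocks)
old_root = tree[1]
new_root = write_block(tree, leaf=5, new_block=b"B2 v2")  # leaf 5 holds B2
```

Only the nodes on one root-to-leaf path are recomputed, so a write costs a number of hashes proportional to the tree depth rather than to the disk size.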