Using timed-release cryptography to mitigate the preservation risk of embargo periods

Slides for:

Rabia Haq, Michael L. Nelson: Using timed-release cryptography to mitigate the preservation risk of embargo periods. 2009 ACM/IEEE Joint Conference on Digital Libraries (JCDL), pp. 183-192.


  1. Using Timed-Release Cryptography to Mitigate The Preservation Risk of Embargo Periods
     Rabia Haq, Michael L. Nelson
     Old Dominion University, Norfolk VA
     www.cs.odu.edu/~{rhaq,mln}
     2009 ACM/IEEE Joint Conference on Digital Libraries, Austin TX, June 15-19
  2. Overview
     • Embargo Periods
       – associated preservation risk interval
     • Time-Locked Puzzle / Time Release Cryptography
     • System Evaluation using mod_oai (resource harvesting using OAI-PMH)
       – Optimization Using Chunked Encryption
     • Future Considerations
     • Conclusion
  3. Journal Access Models – Romeo Colors*
     • Red: Traditional subscription-based Access
       – Purchase-own model
     • Yellow: Embargoed Access
       – Hybrid of traditional and open access
     • Green: Self-authored Open Access
       – e.g., arXiv.org, institutional repositories
     • Gold: Free and Open Access Journals
       – e.g., PLoS Journals, www.doaj.org
     * “Old” Romeo Colors, now green/blue/yellow/white; see: http://www.sherpa.ac.uk/romeoinfo.html
  4. Embargoed Access
     • Paid access (red) for some time interval, then the content becomes open (gold)
       – current issue(s) cost $
       – previous issues are free
     • We’ll assume: gold >= green > yellow > red
     • Note: inverse of typical online newspaper model, where current content is free and archived content costs $.
  5. Who Uses Embargoes?
     • 24% of PubMed Central (PMC) titles embargoed
     • The New England Journal of Medicine
       – embargoed for 6 months
     • EMBO Journal
       – embargoed for 12 months
  6. Preservation Risk Interval: (A Hypothetical, Non-Topical Example)
     • Journal of UT Football Non-Conference Scheduling is embargoed for 6 months
       – sample article “Why scheduling Florida Atlantic, UTEP, Rice & Arkansas is not a national championship schedule”
       – previous volumes (e.g., 2008, 2007) are freely available
       – issues 1-6 of current volume are currently for subscribers only
       – 6 month “sliding window”: when issue 7 comes out on July 1, issue 1 becomes freely available
     • Now imagine Mack Brown issues a cease & desist order to JUTFNS on June 30
       – what happens to volume 2009, issues 1-6? Will they ever be available to non-subscribers?
  7. Current Solutions
     • LOCKSS (CLOCKSS): www.lockss.org
       – local, cooperating caches between subscribers (i.e., libraries)
       – http://www.clockss.org/clockss/Triggered_Content
     • Portico: www.portico.org
       – trusted third party archive (i.e., neither library nor publisher)
       – http://www.portico.org/news/trigger.html
  8. Can We Use Lazy Preservation?
     • We’ve already shown that by using IA, search engine caches, etc. we can reconstruct public web sites after they’ve been lost (McCown, 2007)
     • For embargoed content, we could expose encrypted content that is embargoed
       – but how can we prevent bad guys™ from using zombie farms to break the encryption before the embargo period is up?
  9. Timed-Release Cryptology
     • Time-Lock Puzzle (TLP) Creation
       – Data decryption non-parallelizable
       – Serial computation required to break puzzle
       – Data locked for predetermined time-period
         • not self-unlocking -- still requires computation to unlock
     • Used in MIT/LCS35 Time Lock Puzzle
       – http://people.csail.mit.edu/rivest/lcs35-puzzle-description.txt
       – idea: you could have started in 1999 (with your 1999 computer) and worked for 35 years… OR you can wait until 2033, buy a new computer and work for 1 year
       – rewards procrastination!
  10. “Regular” RSA
      Picking values:
      • n = p*q
      • φ(n) = (p-1)(q-1), then “throw away” p & q
      • pick e coprime to φ(n)
      • pick d s.t. d*e ≡ 1 (mod φ(n))
      public key = (n,e)
      private key = (n,d)
      Encryption: c = m^e (mod n)
      Decryption: m = c^d (mod n)
      “Brute Force” Attacks:
      • need to factor n (which is easier than trying all values of d)
      • simple soln: try all primes from 2 .. √n
      helping the attacker:
      - adding k computers reduces the time to break to 1/k of the original
      - they might get lucky and get it on their first shot!
      more info:
      http://www.cl.cam.ac.uk/users/rnc1/brute.html
      http://axion.physics.ubc.ca/pgp-attack.html
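A minimal sketch of the "Regular" RSA slide, using toy primes so the brute-force attack finishes instantly. The function names, the choice e = 3, and the message value are illustrative assumptions, not taken from the talk; real keys use primes hundreds of digits long.

```python
from math import gcd, isqrt

def make_keys(p, q, e):
    """Build a toy RSA key pair from primes p and q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to phi(n)")
    d = pow(e, -1, phi)  # modular inverse, Python 3.8+
    return (n, e), (n, d)

def encrypt(message, public_key):
    n, e = public_key
    return pow(message, e, n)  # c = m^e (mod n)

def decrypt(ciphertext, private_key):
    n, d = private_key
    return pow(ciphertext, d, n)  # m = c^d (mod n)

def factor_by_trial_division(n):
    """The 'simple soln' attack: try every candidate divisor up to sqrt(n)."""
    for candidate in range(2, isqrt(n) + 1):
        if n % candidate == 0:
            return candidate, n // candidate
    raise ValueError("n has no non-trivial factors")

# Toy values (assumed for illustration): p=11, q=23, e=3, m=42.
public_key, private_key = make_keys(11, 23, 3)
message = 42
ciphertext = encrypt(message, public_key)
recovered = decrypt(ciphertext, private_key)
factors = factor_by_trial_division(public_key[0])
```

The candidate range in the attack can be split across k machines, which is why this kind of search parallelizes well and motivates the time-lock construction.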
  11. Time Lock Puzzle
      Picking values:
      1. n = p*q
      2. φ(n) = (p-1)(q-1), then “throw away” p & q
      3. t = T*S
      4. pick some random key k, cm = RC5(k,m)
      5. pick random a, 1 < a < n
      6. w = a^(2^t) (mod n)
      7. ck = k ⊕ w
      puzzle = (n,a,t,ck,cm)
      actually, in our version we skip step 4 and define step 7 as: z = m ⊕ w
      puzzle = (n,t,z)
      Attacking:
      • repeated squaring of a is faster than factoring n -- also not known to be parallelizable! (Rivest, 1996)
      • demo: n=253 (i.e., p=11, q=23), t=10, a=2, w =
        2^(2^1)  = 2^2   = 4   (mod 253)
        2^(2^2)  = 4^2   = 16  (mod 253)
        2^(2^3)  = 16^2  = 3   (mod 253)
        2^(2^4)  = 3^2   = 9   (mod 253)
        2^(2^5)  = 9^2   = 81  (mod 253)
        2^(2^6)  = 81^2  = 236 (mod 253)
        2^(2^7)  = 236^2 = 36  (mod 253)
        2^(2^8)  = 36^2  = 31  (mod 253)
        2^(2^9)  = 31^2  = 202 (mod 253)
        2^(2^10) = 202^2 = 71  (mod 253)
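The demo on the Time Lock Puzzle slide can be reproduced with a short loop. This is a sketch of the attacker's side only: each squaring depends on the previous result, which is why the work is serial. The function name is an illustrative assumption.

```python
def solve_by_squaring(a, t, n):
    """Compute a^(2^t) mod n by t sequential squarings, recording each step."""
    w = a % n
    trace = []
    for _ in range(t):
        w = (w * w) % n  # step i yields a^(2^i) mod n
        trace.append(w)
    return w, trace

# Values from the slide's demo: n = 253 (p=11, q=23), t = 10, a = 2.
w, trace = solve_by_squaring(2, 10, 253)
for step, value in enumerate(trace, start=1):
    print(f"2^(2^{step}) = {value} (mod 253)")
```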
  12. Implementation: mod_oai, CRATE
      • Both based on (Smith, 2008)
      • mod_oai
        – an Apache module providing OAI-PMH functionality for an entire web site, not just, for example, records in an institutional repository
      • CRATE
        – a model for encoding resource + associated metadata
        – implemented using MPEG-21 DIDL complex object format
  13. mod_oai mechanics
      Integrate OAI-PMH functionality into the web server itself…
      1. Use mod_oai
         • an Apache 2.0 module
         • automatically answers OAI-PMH requests for an http server
         • written in C
         • respects values in .htaccess, httpd.conf
      2. Install mod_oai on http://www.foo.edu/
      3. Define baseURL: http://www.foo.edu/modoai
      → Result: web harvesting with OAI-PMH semantics (e.g., from, until, sets)
      http://www.foo.edu/modoai?verb=ListRecords&metadataPrefix=oai_didl&from=2004-09-15&set=mime:video:mpeg
      In words: from site foo, using OAI-PMH, give me all resources and their preservation metadata, dating from 9/15/2004 through today, that are MIME type video-MPEG.
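As a rough illustration of the harvesting request on the mod_oai mechanics slide, the sketch below assembles a ListRecords URL from the standard OAI-PMH arguments (verb, metadataPrefix, from, until, set). The helper name is hypothetical and the base URL is the slide's placeholder site; nothing here contacts a server.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix, from_date=None,
                     until_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date
    if until_date is not None:
        params["until"] = until_date
    if set_spec is not None:
        params["set"] = set_spec
    # Keep ':' unescaped so the set name reads as it does on the slide.
    return base_url + "?" + urlencode(params, safe=":")

url = list_records_url(
    "http://www.foo.edu/modoai",
    "oai_didl",
    from_date="2004-09-15",
    set_spec="mime:video:mpeg",
)
print(url)
```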
  14. OAI-PMH Data Model
  15. MPEG-21 DIDL Resource Structure
  16. An Active Repository
  17. A Dying Repository
      Records e1, f2, g3 are recoverable; record h is lost.
  18. Dynamic Time-Locked Record Embargo within mod_oai
      • Identification
        – Calculation of remaining record embargo period
      • Encryption
        – Calculating record time-lock puzzle complexity
        – Time-Lock Puzzle creation
      • Encapsulation
        – exploiting flexibility of MPEG-21 DIDL format to encapsulate encrypted resources and related information
  19. Identification
      update OAI-PMH datestamp as time lock becomes weaker
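One way to picture the weakening lock is as a schedule of re-encryptions, each covering a smaller share of the embargo. The sketch below assumes equal-sized steps and that version k of N still carries (N - k)/N of the embargo; that rule is an assumption, chosen because it matches the Encapsulation slide's example (version 7 of 12, about 3650 hours, with a 365-day embargo). The function name is hypothetical.

```python
def remaining_lock_hours(embargo_days, steps, version):
    """Hours of lock left for a given version, assuming equal decrements.

    Assumption: version k of `steps` retains (steps - k) / steps of the
    full embargo. The slides do not state this rule explicitly.
    """
    if not 0 <= version <= steps:
        raise ValueError("version must be between 0 and steps")
    return embargo_days * 24 * (steps - version) / steps

# Parameters from the talk: a 365-day embargo re-encrypted in 12 steps.
schedule = [remaining_lock_hours(365, 12, k) for k in range(13)]
for version, hours in enumerate(schedule):
    print(f"version {version:2d} of 12: about {hours:.0f} hours of lock left")
```

Each time a weaker version is published, the record's OAI-PMH datestamp would change, so a harvester polling with `from` picks up the new version.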
  20. Encryption
      • Modification of LCS35 Time Capsule Crypto-Puzzle to use time lock on entire resource (not just the key)
        – as per code provided at: http://people.csail.mit.edu/rivest/lcs35-puzzle-description.txt
      • Input: timeUnit (controls puzzle complexity)
      • Compute:
        u = 2^t mod ((p-1)(q-1))
        w = 2^u mod n
        z = resource ⊕ w
      • Output: n, t, z
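The asymmetry between locking and unlocking can be sketched as follows. The creator knows p and q, so it reduces the exponent modulo (p-1)(q-1) and needs only two modular exponentiations; the solver sees only n and t and must square t times. This is a toy illustration with the demo primes from the Time Lock Puzzle slide, not the authors' implementation: the function names and the byte-length bookkeeping are assumptions, and with such a small n the XOR only masks the low bits of the resource.

```python
def lock(resource, p, q, t):
    """Creator side: use the factorization of n to compute w cheaply."""
    n = p * q
    u = pow(2, t, (p - 1) * (q - 1))  # u = 2^t mod phi(n)
    w = pow(2, u, n)                  # w = 2^u mod n = 2^(2^t) mod n
    z = int.from_bytes(resource, "big") ^ w
    return n, t, z, len(resource)     # length kept so leading zero bytes survive

def unlock(n, t, z, length):
    """Solver side: without p and q, w takes t sequential squarings."""
    w = 2
    for _ in range(t):
        w = (w * w) % n
    return (z ^ w).to_bytes(length, "big")

puzzle = lock(b"embargoed article", 11, 23, 10)
recovered = unlock(*puzzle)
shortcut_w = pow(2, pow(2, 10, 10 * 22), 253)
```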
  21. Encapsulation
      This version of the record is 7 of 12 separate encryptions, each of which is successively easier to break. It will take approximately 3650 hours of computation to break this time-lock. The next update will be available on 2008-01-16T20:56:15Z.
      Crypto-Puzzle for LCS35 Time Capsule.
      Puzzle parameters (all in decimal):
      n = 398399
      t = 264600000
      z = 313239174518025552773909388461801735302388...
          893375562056859914777144518879488573607906...
          742437030171894184996228671834511813009803...
          (many lines deleted for space)
      To solve the puzzle, first compute w = 2^(2^t) (mod n). Then exclusive-or the result with z. (Right-justify the two strings first.) The result is the secret message (8 bits per character).
  22. Selecting Appropriate Values of t
      • Time required to break puzzle dependent on processor speed
      • Given our projected short embargo period (6-24 months), we made a simplifying assumption that processor speed under Moore’s law increases linearly (not exponentially)
        – idea: in the next few months, you’re more likely to see something like: 2 GHz → 2.2 GHz, not 2 GHz → 4 GHz
      • recall: t = number of squarings
        – t = T*S
        – S = 3000 squarings/second, T = 1800 seconds * tU
        – tU = f(machine speed) * embargolength
        – t = 3000*1800*tU
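The relation t = 3000 * 1800 * tU is simple enough to check directly. In this sketch the constant and function names are illustrative; the only values taken from the slides are S = 3000 squarings/second, the 1800-second time unit, and the puzzle parameter t = 264600000 shown on the Encapsulation slide.

```python
SQUARINGS_PER_SECOND = 3000   # S on the slide
SECONDS_PER_TIME_UNIT = 1800  # one time unit (tU) is nominally 30 minutes

def squarings_for(time_units):
    """t = S * T, where T = 1800 seconds * tU."""
    return SQUARINGS_PER_SECOND * SECONDS_PER_TIME_UNIT * time_units

def time_units_for(t):
    """Invert the relation to recover tU from a published t."""
    return t / (SQUARINGS_PER_SECOND * SECONDS_PER_TIME_UNIT)

t = squarings_for(49)
time_units = time_units_for(264600000)
```

Working backwards, the t in the encapsulated puzzle corresponds to tU = 49 under these constants.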
  23. Effect of Computation Speed on embargolength
      • We broke time lock puzzles on four classes of machines (in GHz):
        – 1.8 (5 nodes)
        – 1.6 (26 nodes)
        – 1 (1 node)
        – 0.75 (1 node)
  24. Picking t With Empirical Data
      Using a 1 GHz machine as baseline, projecting for a 2.5 GHz machine, and locking for 2 years (63115200 seconds):
      • tU = 63115200 * 2.5 / 1727.61 ≈ 91333
      • t = 3000 * 1800 * 91333
  25. Experimental Evaluation
      • Embargolength = 365 days
      • Embargodecrement = 12
      • Test website
        – 525 files
        – 17.3 MB data
        – 63% text files
        – Average file size = 33 KB
  26. Harvesting Time: Locked & Unlocked
  27. O(n^2) time to create time-lock puzzle
  28. Solution: Break Files Into “Chunks”
      • Lock-time grows quadratically with file size
      • Idea: break file into series of small chunks
        – still O(n^2), but with a much more favorable constant
      • Lock-time on a 1.8 GHz machine
        time_to_lock(200 KB) = 13 sec
        time_to_lock(100 KB) = 3 sec
        200 KB = 100 KB + 100 KB → 3 sec + 3 sec = 6 sec
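A small sketch of why chunking helps, under the assumption that lock time scales with the square of the input size. The cost model is calibrated only to the slide's 100 KB = 3 sec measurement and is a rough approximation (it predicts 12 sec for 200 KB where 13 sec was measured). The function names and the sample data are illustrative.

```python
def split_into_chunks(data, chunk_size=10_000):
    """Split a resource into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

def estimated_lock_seconds(size_kb, ref_kb=100, ref_seconds=3.0):
    """Assumed quadratic cost model, scaled from one reference measurement."""
    return ref_seconds * (size_kb / ref_kb) ** 2

# Locking 200 KB in one piece versus as two 100 KB chunks.
whole = estimated_lock_seconds(200)
chunked = 2 * estimated_lock_seconds(100)

# Splitting and reassembling a sample 65,000-byte resource in order.
resource = bytes(range(256)) * 250 + b"\x00" * 1000
chunks = split_into_chunks(resource)
reassembled = b"".join(chunks)
```

The chunks must be unlocked and concatenated in their original order, which is why each encapsulated chunk carries its position.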
  29. 10 KB Chunked Encryption in mod_oai
  30. MPEG-21 DIDL document With Chunks
      This record has been split into 10000-byte chunks for faster processing. This is part 1 of 7 chunks, with unlocked chunks to be reassembled in the specified order.
  31. Future Considerations
      • Chunk size performance dependency
      • Other optimization methods:
        – Parallel time-locking of resources
        – Data pre-locking
        – only time-lock encryption key, use other encryption methods on the original resource (as per original Rivest (1996), not as per http://people.csail.mit.edu/rivest/lcs35-puzzle-description.txt)
  32. Conclusions
      • Suggest the use of time lock puzzles for dissemination of embargoed records
        – complement to other methods such as LOCKSS, Portico, etc.
      • Implemented and evaluated time lock puzzles in the mod_oai & CRATE environment
      • Full paper:
        – http://doi.acm.org/10.1145/1555400.1555430
        – http://www.cs.odu.edu/~mln/pubs/jcdl09/jcdl09-time-lock.pdf