Towards a Theory of Data Entanglement

Towards a Theory of Data Entanglement
James Aspnes, Joan Feigenbaum, Aleksandr Yampolskiy, and Sheng Zhong (Yale University)

Outline
- Motivation
- Dagster and Tangler
- Our model
- Notions of entanglement
- Possibility and impossibility results
- Conclusion

Goal: Protect Remotely Stored Data from the Server
Question: Suppose you store your data on a remote server. How do you ensure that the server does not corrupt it?
Answer: Entangle your data with some VIPs' data, so that corrupting your data implies corrupting theirs.

Previous Work: Dagster [SW01]
A new document is entangled with c randomly chosen blocks from the pool of blocks, using XOR and encryption.
Analysis: Deleting a typical document causes the loss of O(c) documents.
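The XOR step can be illustrated with a minimal sketch. It is a simplification, not Dagster's actual implementation: encryption is omitted, all blocks are assumed to have the same length, and the names `entangle` and `recover` are hypothetical.

```python
import secrets

def xor_blocks(*blocks):
    """XOR equal-length byte strings together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for pos, byte in enumerate(block):
            out[pos] ^= byte
    return bytes(out)

def entangle(pool, document, c):
    """XOR the document with c randomly chosen pool blocks and append the result.

    Returns the indices of the chosen blocks and the index of the new block.
    Assumption: the encryption step is left out of this sketch.
    """
    indices = secrets.SystemRandom().sample(range(len(pool)), c)
    pool.append(xor_blocks(document, *(pool[i] for i in indices)))
    return indices, len(pool) - 1

def recover(pool, indices, own_index):
    """Undo the XOR; this only works if all c referenced blocks are intact."""
    return xor_blocks(pool[own_index], *(pool[i] for i in indices))
```

Damaging any one of the c referenced blocks makes `recover` return the wrong bytes, which is the sense in which the new document depends on older ones.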

Previous Work: Tangler [WM01]
Interpolate a degree-2 polynomial F() through (0, New Document) and 2 randomly chosen blocks from the pool of n blocks. The new blocks are (x_1, F(x_1)) and (x_2, F(x_2)).
Analysis: Deleting a typical document causes the loss of O((log n) / n) documents.
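The interpolation step can be sketched as follows. This is an illustration under stated assumptions: arithmetic is done in a prime field with the hypothetical modulus `P = 2**61 - 1` (chosen here for simplicity, not taken from Tangler), documents and blocks are treated as single field elements, and the function names are made up for this example.

```python
P = 2**61 - 1  # assumed prime modulus, for illustration only

def lagrange_eval(points, x, p=P):
    """Evaluate at x the unique polynomial through the given (x_i, y_i) points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

def entangle(document, pool_points, new_xs, p=P):
    """Fit a degree-2 polynomial F through (0, document) and two pool points,
    then return the new blocks (x, F(x)) for each x in new_xs."""
    points = [(0, document)] + list(pool_points)
    return [(x, lagrange_eval(points, x, p)) for x in new_xs]

def recover(three_points, p=P):
    """Any three points on F determine it, so F(0) gives back the document."""
    return lagrange_eval(three_points, 0, p)
```

In this sketch any three of the four points (two old, two new) recover the document, so it survives the loss of one block but not of two.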

Our Model: Basic Framework
- Initialization: Keys are distributed to participants.
- Entanglement: Users' data are combined into a common store.
- Tampering: Adversary tampers with the store before it is stored on the server.
(Figure labels: initializer I; keys k_1, k_2, ..., k_n and k_E; encoding E; documents d_1, d_2, ..., d_n; tamperer; storage server.)

Our Model: Basic Framework (cont.)
- Recovery: Users attempt to recover their data. If R_i returns the original document d_i, we say that user i recovers her data.
(Figure labels: keys k_1, k_2, ..., k_n; storage server.)

Our Model: Classification
Question: What can the adversary do to the data store?
Answer: He can...
- tamper with the store;
- tamper with the store and distribute a new recovery algorithm to all users (upgrade attack);
- encrypt the store and distribute his recovery algorithm only to a few select buddies (superencryption attack).

Our Model: Classification (cont.)
Classification based on recovery algorithm:
- Standard recovery algorithm
- Public recovery algorithm
- Private recovery algorithm

Our Model: Classification (cont.)
Classification based on corrupting algorithm:
- Destructive adversary that reduces the entropy of the data store.
- Arbitrary adversary.
Altogether, we have 6 (= 3 × 2) adversary classes.

Our Definitions
Fix an encoding scheme, an adversary, and recovery algorithms R_i.
The recovery vector summarizes which documents are recovered.

Our Definitions (cont.)
Data dependency: d_i depends on d_j if, with high probability, d_i is recovered only if d_j is recovered.
(Figure: among documents d_1, d_2, d_3, d_4, d_1 depends on d_2.)

Our Definitions (cont.)
All-or-nothing integrity (AONI): every document depends on every other document.
(Figure: documents d_1, d_2, d_3, d_4.)
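These definitions can be illustrated empirically. The sketch below is an assumption-laden reading of them: a recovery vector is represented as a tuple with one 0/1 entry per document, "with high probability" is approximated by a hypothetical threshold `eps` over sampled recovery vectors, and the function names are invented for this example.

```python
def depends_on(samples, i, j, eps=0.01):
    """Estimate whether d_i depends on d_j from sampled recovery vectors.

    A sample violates the dependency when d_i is recovered but d_j is not.
    """
    violations = sum(1 for r in samples if r[i] and not r[j])
    return violations / len(samples) <= eps

def all_or_nothing(samples, eps=0.01):
    """AONI holds (empirically) if every document depends on every other one."""
    n = len(samples[0])
    return all(
        depends_on(samples, i, j, eps)
        for i in range(n)
        for j in range(n)
        if i != j
    )
```

For instance, samples in which either all four documents or none are recovered pass `all_or_nothing`, while samples in which only d_1 and d_2 are recovered do not.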

Our Definitions (cont.)
Symmetric recovery: the adversary cannot bias which documents are recovered.

Possibility of AONI in Standard-Recovery Model
All users use the standard recovery algorithm: for all i, R_i = R.
When combining data, mark the data store using an unforgeable Message Authentication Code (MAC).
The standard recovery algorithm checks the MAC:
- If the MAC is valid, recover data.
- If the MAC is invalid, refuse to recover data.
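The MAC check can be sketched with Python's standard `hmac` module. The choice of HMAC-SHA256, the layout of the sealed store (data followed by a 32-byte tag), and the names `seal_store` and `standard_recover` are assumptions made for this illustration, not details given in the slides.

```python
import hashlib
import hmac

TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes

def seal_store(key, store):
    """Append a MAC tag to the combined data store."""
    tag = hmac.new(key, store, hashlib.sha256).digest()
    return store + tag

def standard_recover(key, sealed):
    """Return the store if the MAC verifies, otherwise refuse (return None)."""
    store, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(key, store, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # tampering detected: no user recovers anything
    return store
```

Because every user runs the same check on the same store, either all users pass it or none do, which gives the all-or-nothing behavior.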

Impossibility of AONI in Public and Private-Recovery Models
If any users use the adversary's recovery algorithm (for some i, R_i ≠ R), AONI cannot be achieved.
- The adversary modifies the data store so that the old recovery algorithm does not work.
- He then distributes a new recovery algorithm that flips a coin to decide whether or not to recover the data.

Impossibility of AONI in Public and Private-Recovery Models (cont.)
- With high probability, not all coin flips will have the same result.
- With high probability, some data are recovered while others are not.
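The "with high probability" claim can be made concrete under the assumption, not stated in the slides, that each of n users flips an independent fair coin. All flips agree with probability 2 · (1/2)^n, so a mixed outcome has probability 1 - 2^(1-n). The helper name below is hypothetical.

```python
from fractions import Fraction

def prob_mixed_outcome(n):
    """Probability that n independent fair coin flips do not all agree."""
    all_agree = Fraction(2, 2**n)  # all heads or all tails
    return 1 - all_agree
```

Under this assumption the probability of a mixed outcome approaches 1 exponentially fast as the number of users grows.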

Possibility of Symmetric Recovery in Public-Recovery Model
All users use the adversary's recovery algorithm (for all i, R_i is the algorithm supplied by the adversary).
We can prevent targeted destruction of documents, under two conditions:
- Documents d_1, ..., d_n must be i.i.d.
- The encoding scheme must be symmetric.

Possibility of AONI for Destructive Adversaries
We can achieve AONI in all recovery models if the tamperer destroys entropy.
- When combining data, interpolate a polynomial using the points (k_i, d_i). Store = polynomial.
- AONI is achieved if sufficient entropy is removed.
- Many stores are mapped to a single corrupted store. Hence, with high probability, one cannot recover every data item.
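The polynomial store can be sketched as follows. The sketch assumes a prime field with the hypothetical modulus `P = 2**61 - 1`, documents and keys encoded as single field elements, and a store consisting of the coefficient list; none of these details come from the slides, and the function names are invented. It shows only how the documents are entangled, not the entropy argument itself.

```python
P = 2**61 - 1  # assumed prime modulus, for illustration only

def mul_by_linear(poly, root, p=P):
    """Multiply a polynomial (coefficients low to high) by (x - root)."""
    out = [0] * (len(poly) + 1)
    for k, c in enumerate(poly):
        out[k] = (out[k] - root * c) % p
        out[k + 1] = (out[k + 1] + c) % p
    return out

def entangle(points, p=P):
    """Return coefficients of the polynomial through all (k_i, d_i) points."""
    coeffs = [0] * len(points)
    for i, (ki, di) in enumerate(points):
        basis, denom = [1], 1
        for j, (kj, _) in enumerate(points):
            if j != i:
                basis = mul_by_linear(basis, kj, p)
                denom = denom * (ki - kj) % p
        scale = di * pow(denom, -1, p) % p
        for k, c in enumerate(basis):
            coeffs[k] = (coeffs[k] + scale * c) % p
    return coeffs

def recover(coeffs, key, p=P):
    """User i evaluates the stored polynomial at k_i (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * key + c) % p
    return acc
```

In this sketch, altering even a single coefficient changes the polynomial's value at every nonzero key, so no user gets her document back unchanged.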

Summary of Results
                    Destructive Tamperer    Arbitrary Tamperer
Standard Recovery   all-or-nothing          all-or-nothing
Public Recovery     all-or-nothing          symmetric recovery
Private Recovery    all-or-nothing          (none)

Future Work
- We have considered a single-round model. Allowing multiple rounds of storage/retrieval will be more realistic.
- What if data entanglement is combined with other techniques like replication? Will that help to defend data against untrusted server(s)?
