
- 1. © 2018 | idalab GmbH | Potsdamer Straße 68 | 10785 Berlin | idalab.de | Agency for Data Science: machine learning & AI, mathematical modelling, data strategy. Prof. Dr. Delaram Kahrobaei: The data-privacy dilemma: How fully homomorphic encryption could bring healthcare into the digital era. idalab seminar #12 | July 13th 2018
- 2. The data-privacy dilemma: How fully homomorphic encryption could bring healthcare into the digital era Delaram Kahrobaei idalab, Berlin, July 13, 2018
- 3. FHE Definition A Fully Homomorphic Encryption (FHE) scheme is an encryption scheme which is both additively and multiplicatively homomorphic: for an encryption function E and plaintexts a, b, E(a + b) = E(a) + E(b) and E(ab) = E(a)E(b). It follows that for any polynomial function F, evaluating F on ciphertexts and decrypting the result is equivalent to evaluating F on the plaintexts x_i: E(F(x_1, x_2, …, x_n)) = F(E(x_1), E(x_2), …, E(x_n))
- 4. A wealth of medical data is inaccessible to researchers and clinicians due to privacy restrictions such as HIPAA.
- 5. Introduction Medical institutions and companies that own medical information systems wish to keep their models private when used by outside parties.
- 6. Introduction Clinicians would benefit from access to predictive models for diagnosis, such as classification of tumors as malignant or benign, without compromising patients’ privacy.
- 7. Goals Secure, accurate and efficient machine learning and function evaluation over encrypted data.
- 8. Encrypted Machine Learning Cryptography and machine learning at first seem like opposites: the former seeks to hide information, while the latter is used to discover information. What if we combined these two fields and used machine learning to discover information about private data? This is the idea behind the relatively new field of encrypted machine learning.
- 9. A New FHE Scheme Private key encryption is ideal for medical applications. Due to HIPAA constraints and the private nature of medical data, it makes sense that only the data owner should have any ability to decrypt the information.
- 10. Possibilities Training a model on encrypted data; evaluating functions on private data. Preserving both privacy (the Research Center cannot determine anything about the private patient database D) and intellectual property (the Hospital cannot determine anything about the clinical decision support model).
- 11. Communication Flow
- 12. Third Party Search
- 13. Past FHE Schemes The first fully homomorphic encryption scheme was published by Craig Gentry in 2009. While various improvements have been introduced since, fully homomorphic encryption remains expensive in terms of both time and storage. These schemes contain an inherent amount of noise which grows with each operation; as a result, there is a cap on how many operations can be performed on encrypted data before the results become meaningless.
- 14. FHE Scheme D. Kahrobaei, H. Lam, V. Shpilrain, Method and apparatus for fully homomorphic encryption, private search, and private information retrieval, U.S. 9,942,03. Plaintext space: Z_p, embedded into a ring R which is a direct sum of several copies of Z_p. Ciphertext space: S, another direct sum of several copies of Z_p, with an ideal I such that S/I = R. Encryption: for u ∈ R and E(0) a random element of I, E(u) = u + E(0). Decryption: a map ρ : S → R with ρ(I) = 0.
- 15. Z_p --α--> R --E--> S --ρ--> R --β--> Z_p Correctness of Evaluation: For j_1, j_2, j_3 ∈ I and u, v ∈ R, the scheme presented above is both additively and multiplicatively homomorphic: E(u) + E(v) = u + j_1 + v + j_2 = u + v + j_3 = E(u + v) E(u)E(v) = (u + j_1)(v + j_2) = uv + u j_2 + j_1 v + j_1 j_2 = uv + j_3 = E(uv) Security: Secure against ciphertext-only attack (COA): an attacker who retrieves part of an encrypted database has only a negligible probability of correctly decrypting any portion of the database.
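To make the algebra concrete, here is a deliberately minimal sketch of the idea in Python. All parameters are toy choices, not the scheme's: take S = Z_p × Z_p with componentwise operations, let the private ideal I be the second coordinate, and let ρ simply drop it. The real scheme hides the ideal behind a permuted orthogonal basis; this sketch only illustrates why E(u) = u + E(0) respects addition and multiplication.

```python
import random

p = 101  # toy prime modulus; a real instantiation uses far larger parameters

def encrypt(u):
    # E(u) = u + E(0): embed u as (u, 0), then add a random ideal element (0, j)
    return (u % p, random.randrange(p))

def add(c1, c2):
    # componentwise addition in S = Z_p x Z_p
    return ((c1[0] + c2[0]) % p, (c1[1] + c2[1]) % p)

def mul(c1, c2):
    # componentwise multiplication in S
    return ((c1[0] * c2[0]) % p, (c1[1] * c2[1]) % p)

def decrypt(c):
    # rho : S -> R = S/I kills the ideal (the second coordinate)
    return c[0]

a, b = 17, 29
assert decrypt(add(encrypt(a), encrypt(b))) == (a + b) % p  # E(u)+E(v) = E(u+v)
assert decrypt(mul(encrypt(a), encrypt(b))) == (a * b) % p  # E(u)E(v) = E(uv)
```

Because the ideal component is freshly random on every encryption, two encryptions of the same plaintext look different, which matters for the private-search protocol later.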
- 16. Protocol: Key generation I Elements of a private ring R are encrypted. The ring R = S_n is the factor ring of the public ring S_r by a private ideal I; elements of a real-life database are embedded in R = S_n. Alice (the owner of a private database) starts with a presentation of the ring S_n: S_n = ⟨ x_1, …, x_n | p·1 = 0, x_i^2 = x_i (for all i), x_i x_j = x_j x_i (for all i, j) ⟩.
- 17. Protocol: Key generation II. Alice generates a private encryption/decryption key as follows. She starts by expanding the generating set {x_i} by adding several new generators x_{n+1}, …, x_r. She then selects an ideal I of S_r generated (as an ideal of S_r) by elements of the form (x_m − w_m(x_1, …, x_{m−1})) for m = n+1, …, r, where w_m = w_m(x_1, …, x_{m−1}) is a random idempotent element of S_{m−1}.
- 18. Protocol: Key generation III. Alice converts the basis {x_i} to the orthogonal basis {e_i}, i.e., she represents each x_i as a linear combination of the e_i. It may happen that all generators of the ideal I selected at the previous step have the same coordinates (in this orthogonal basis) equal to 0. These coordinates are then discarded by Alice, i.e., the public ring will have dimension somewhat smaller than 2^r. After that, Alice selects a random permutation π on the set of remaining e_i and publishes a presentation P: P = ⟨ e_1, …, e_s | p·1 = 0, e_i^2 = e_i (for all i), e_i e_j = 0 for i ≠ j ⟩.
- 19. Protocol: Encryption. Encryption of a plaintext u ∈ S_n is E(u) = u + E(0), where E(0) is a random element of the ideal I of the ring S_r, i.e., an element of the form Σ_{j=n+1}^{r} (x_j − w_j(x_1, …, x_{j−1})) · h_j(x_1, …, x_r), where the h_j are random elements of S_r, i.e., sums of monomials in x_1, …, x_r with random coefficients. The whole expression E(u) = u + E(0) is then converted to a linear combination of the e_i, where {e_i} is the published orthogonal basis.
- 20. Protocol: Decryption. To decrypt g = g(e_1, …, e_s), Alice first converts g to the “standard” basis {x_i}. After that, she replaces x_j by w_j(x_1, …, x_{j−1}), starting with j = r and going down to j = n + 1.
- 21. Private search Suppose Alice (the data owner) wants to find out whether or not E(x) is in the (encrypted) database D. Since encryptions are randomized, the database keeper Carl (e.g. the cloud) cannot simply match E(x) to elements of the database. Instead, Carl does the following. For each element E(y) of the database, he computes E(x) − E(y), which is equal to E(x − y) because the encryption function E respects addition. Carl then computes the following and sends it to Alice: P(x) = Π_{E(y)∈D} (E(x) − E(y)) = Π_{E(y)∈D} E(x − y) Since the encryption function E respects multiplication too, the element P(x) is equal to E(Π_{E(y)∈D} (x − y)). Alice then decrypts this element to recover Π_{E(y)∈D} (x − y). If all plaintexts y are elements of a field, this product is equal to 0 if and only if x = y for at least one y such that E(y) is stored in the database D. So Alice learns whether or not E(x) is in the database D.
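The private-search protocol can be sketched with a toy two-component scheme (S = Z_p × Z_p, private ideal = the second coordinate; all parameters are hypothetical, and the real ciphertext ring is far larger and hidden behind a permuted basis):

```python
import random

p = 101  # toy prime, so nonzero plaintexts are invertible (Z_p is a field)

def encrypt(u):
    # toy ciphertext: (plaintext, random ideal component)
    return (u % p, random.randrange(p))

def decrypt(c):
    return c[0]

def sub(c1, c2):
    return ((c1[0] - c2[0]) % p, (c1[1] - c2[1]) % p)

def mul(c1, c2):
    return ((c1[0] * c2[0]) % p, (c1[1] * c2[1]) % p)

def private_search(enc_x, enc_db):
    # Carl computes P(x) = prod_{E(y) in D} (E(x) - E(y)) without ever decrypting
    prod = encrypt(1)
    for enc_y in enc_db:
        prod = mul(prod, sub(enc_x, enc_y))
    return prod

db = [12, 45, 7, 99]
enc_db = [encrypt(y) for y in db]

# Alice decrypts the product: it is 0 iff x matches some y in the database
assert decrypt(private_search(encrypt(45), enc_db)) == 0
assert decrypt(private_search(encrypt(13), enc_db)) != 0
```

Note that this relies on the plaintexts living in a field: a product over Z_p is zero exactly when one of the factors x − y is zero.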
- 22. Experiment First Simulation Linear time-series analysis using the Recursive Least Squares (RLS) function on both encrypted and plaintext data. (RLS is an adaptive filter algorithm that recursively finds the coefficients that minimize a weighted linear least squares cost function relating to the input signals.)
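As a reference for what the simulation computes, here is a generic plaintext RLS implementation (not the authors' code; the forgetting factor, initialization and the toy recovery check are illustrative choices):

```python
import numpy as np

def rls(x, d, N, lam=0.99, delta=1e3):
    """Recursive least squares: adapt N filter coefficients w so that
    the filter output w . [x[n], ..., x[n-N+1]] tracks the desired d[n]."""
    w = np.zeros(N)
    P = delta * np.eye(N)                  # inverse correlation matrix estimate
    errors = []
    for n in range(N - 1, len(x)):
        u = x[n - N + 1:n + 1][::-1]       # most recent N samples, newest first
        e = d[n] - w @ u                   # a priori error
        k = (P @ u) / (lam + u @ P @ u)    # gain vector
        w = w + k * e
        P = (P - np.outer(k, u @ P)) / lam
        errors.append(e)
    return w, np.array(errors)

# toy check: recover a known FIR filter h(k) from its own noiseless output
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
h = np.array([1.0, 0.5, 1/3])              # h(k) = 1/(k+1), as in the synthetic data
d = np.convolve(x, h)[:len(x)]             # d[n] = sum_k h(k) x(n-k)
w, err = rls(x, d, N=3)
```

On noiseless data the recovered coefficients w converge to h up to the (rapidly decaying) effect of the initial regularization delta.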
- 23. Experiment [Diagram] The Hospital holds the database D = {x_i[n] | i = 1 … k}; the Center runs the RLS algorithm F. The Hospital sends E(D) to the Center, which returns F(E(D)) and the error.
- 24. Data Database 1: synthetic signal y(n), generated according to a known distribution from an input signal x(n), with h(k) = 1/(k+1): y(n) = Σ_{k=0}^{N} h(k) x(n − k) Database 2: Santa Fe Time Series Competition, heart rate data.
- 25. Results Table: RLS applied to plaintext and encrypted data for both data sets.
Data | N | Plaintext Error | Ciphertext Error | Difference | Time
DB 1 | 3 | 3.26 × 10^−4 | 3.29 × 10^−4 | 3 × 10^−6 | 0.004 ms
DB 1 | 9 | 2.95 × 10^−4 | 3.02 × 10^−4 | 7 × 10^−6 | 0.009 ms
DB 2 | 3 | 1.264 | 1.269 | 0.005 | 0.005 ms
DB 2 | 9 | 1.124 | 1.129 | 0.005 | 0.01 ms
- 26. Analysis Discussion: The error does not scale with the window size N. The difference in error is negligible (rounding error): the encryption scheme allows for correct RLS training, so basic machine learning models can be trained on ciphertexts. The runtime is relatively low: the scheme is practically efficient.
- 27. Experiment Second Simulation Calculating a known multivariate polynomial function, for example a risk assessment score, on patient data. (Risk assessment is the determination of a quantitative or qualitative estimate of risk related to a well-defined situation and a recognized threat, also called a hazard. Quantitative risk assessment requires calculation of two components of risk R: the magnitude of the potential loss L, and the probability p that the loss will occur.)
- 28. Experiment [Diagram] The Hospital holds the database D = {x_i | i = 1 … k}; the Center runs the risk assessment function F. The Hospital sends E(D) to the Center, which returns F(E(D)).
- 29. Data Diabetic Data: includes 10 years of clinical care data for diabetic patients. Each record has over 50 features. The predetermined function outputs a re-admission prediction. Parkinson Data: includes a 6-month telemonitoring trial for symptom progression monitoring of Parkinson’s Disease patients, with features x_i. The known function calculates the pitch period entropy as PPE = x_0 + Σ_{k=1}^{18} a_k x_k
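Since PPE is linear in the features, it can be evaluated homomorphically using only ciphertext additions and plaintext scalar multiplications. A sketch with a toy two-component scheme (plaintext in the first coordinate, ideal in the second; the coefficients here are hypothetical integers, whereas real PPE coefficients are real-valued and would need integer scaling):

```python
import random

p = 7919  # toy prime modulus; real parameters would be far larger

def encrypt(u):
    return (u % p, random.randrange(p))   # (plaintext, random ideal component)

def decrypt(c):
    return c[0]

def add(c1, c2):
    return ((c1[0] + c2[0]) % p, (c1[1] + c2[1]) % p)

def scalar(a, c):
    # multiplying by a known plaintext scalar preserves the ideal component
    return ((a * c[0]) % p, (a * c[1]) % p)

def ppe_encrypted(enc_x, coeffs):
    # PPE = x_0 + sum_{k=1}^{18} a_k x_k, evaluated term by term on ciphertexts
    acc = enc_x[0]
    for a_k, c in zip(coeffs, enc_x[1:]):
        acc = add(acc, scalar(a_k, c))
    return acc

coeffs = [k % 5 + 1 for k in range(18)]             # 18 hypothetical coefficients a_k
record = [random.randrange(50) for _ in range(19)]  # one patient's features x_0..x_18
enc_record = [encrypt(v) for v in record]

expected = (record[0] + sum(a * v for a, v in zip(coeffs, record[1:]))) % p
assert decrypt(ppe_encrypted(enc_record, coeffs)) == expected
```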
- 30. Results Table: Size of databases
Data | MB | Patients
Diabetes | 19 | 101,767
Parkinson’s | 1 | 5,876
Table: Simulation results performed on the two datasets
Data | Enc. Patient | Enc. DB | Dec. Patient | Dec. DB | Fn Eval.
Diabetes | 24.45 ms | 2488 s | 181.52 ms | 18472.7 s | 0.01 ms
Parkinson’s | 11.94 ms | 70.1 s | 97.98 ms | 575.7 s | 0.006 ms
- 31. Analysis Discussion: There is no error in function evaluation: the assessment of plaintext data is the same as that of ciphertext data, and function evaluation can be performed relatively quickly. Comparisons: Ducas and Micciancio perform a single bootstrapped NAND in 0.69 s. Aslett et al. perform a single scalar addition in 0.003 s and a scalar multiplication in 0.084 s on high-performance computers.
- 32. Conclusions FHE permits the use of machine learning algorithms, as outlined in the example scenarios: time-series analysis with minimal error and efficient function evaluation. It is completely feasible to consider using this encryption function for highly sensitive, private and federally regulated medical data.
- 33. Fully Homomorphic Machine Learning: Encrypted medical data Collaboration with University of Michigan, Computational Medicine and Bioinformatics Department A. Gribov, K. Horan, J. Gryak, D. Kahrobaei, V. Shpilrain, R. Souroush, K. Najarian, Medical diagnostic based on encrypted medical data, 1–12 (2018).
- 34. Experiment: Naive Bayes Classification A. Wood, V. Shpilrain, A. Mostashari, K. Najarian, D. Kahrobaei, Private Naive Bayes Classification of Personal Biomedical Data: Application in Cancer Data Analysis, Submitted, 1–8 (2018). To show the utility of this new scheme, we carried out an implementation of Naive Bayes classification over encrypted medical data.
- 35. Naive Bayes The Naive Bayes classifier is based on Bayes’ Theorem. Given the possible classes ℓ to which a sample X could be assigned, Bayes’ Theorem states that Pr(G = ℓ | X) = Pr(X | G = ℓ) Pr(G = ℓ) / Pr(X).
- 36. Naive Bayes The “naive” assumption in Naive Bayes is the independence of each attribute. Specifically, for a class G = ℓ and a feature space of dimension p with X = (X_1, …, X_p), the Naive Bayes model assumes that Pr(X | G = ℓ) = Π_{k=1}^{p} Pr(X_k | G = ℓ) (1)
- 37. Naive Bayes Because Pr(X) is a constant term for a fixed X, only multiplication and comparison are necessary in order to classify a sample. In the binary case, simply compute the unnormalized scores Pr(G = 0 | X) ∝ Pr(X | G = 0) · Pr(G = 0) and Pr(G = 1 | X) ∝ Pr(X | G = 1) · Pr(G = 1). The larger of the two determines the assigned class.
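The binary decision rule above can be sketched in a few lines of plaintext Python (the priors and conditional probability tables are made-up illustrative values):

```python
def naive_bayes_binary(x, prior, cond):
    """Classify sample x = (x_1, ..., x_p) by comparing the unnormalized
    scores Pr(X | G = c) * Pr(G = c) for c in {0, 1}; Pr(X) cancels out.
    cond[c][k] maps the value of feature k to Pr(X_k = value | G = c)."""
    scores = []
    for c in (0, 1):
        score = prior[c]
        for k, value in enumerate(x):
            score *= cond[c][k].get(value, 1e-9)  # tiny floor for unseen values
        scores.append(score)
    return 0 if scores[0] >= scores[1] else 1

# made-up toy model: two categorical features
prior = {0: 0.6, 1: 0.4}
cond = {
    0: [{'low': 0.8, 'high': 0.2}, {'neg': 0.9, 'pos': 0.1}],
    1: [{'low': 0.3, 'high': 0.7}, {'neg': 0.2, 'pos': 0.8}],
}
assert naive_bayes_binary(['high', 'pos'], prior, cond) == 1
assert naive_bayes_binary(['low', 'neg'], prior, cond) == 0
```

The private version replaces the plaintext products with homomorphic operations and the final comparison with the private argmax protocol described next.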
- 38. Private Naive Bayes Assume the Data Owner, Alice, wishes to classify a sample X using a learned model owned by Bob. A private protocol should achieve the following: Bob should learn no unnecessary information about the input provided by Alice. Alice should learn nothing but the predicted class index of X.
- 39. Private Naive Bayes To carry this out we follow the idea of Bost et al. (Machine Learning Classification over Encrypted Data, 2015). Bob prepares tables based upon the information needed for Naive Bayes classification.
- 40. Private Naive Bayes First, Bob prepares a table P, represented as a column vector of length r, where P_i = Pr(G = G_i), the prior probability of class G_i. Next, Bob prepares a table T, an r × p matrix, where entry T_ij represents Pr(X_j | G = G_i).
- 41. Private Naive Bayes Bob encrypts each term in the tables P and T and sends these matrices to Alice. Because the entries are encrypted, Alice cannot deduce any information about the contents – however, she can compute the encrypted class probabilities Enc(Pr(G = G_i | X)) using the information she received from Bob.
- 42. Private Naive Bayes Now that Alice has all of the encrypted class probabilities given her sample X, she needs to determine which of these probabilities is the largest. Specifically, she needs to compute the argmax of the vector of class probabilities {Enc(Pr(G = G_i | X))}_{i∈I}. (Note that in the binary case, I = {0, 1}.)
- 43. Private Argmax If Alice sends Bob {Enc(Pr(G = G_i | X))}_{i∈I}, he can decrypt the values and learn all of her class probabilities. We need a method to compute the argmax without revealing any of Alice’s information.
- 44. Private Argmax Denote P_i = Pr(G_i | X), denote the encryption function by Enc, and the set of all Enc(P_i) by E. Let F represent a family of monotone functions which commute with encryption. In our implementation we simply use the scalar functions f(x) = kx with k ∈ Z, |k| < 5.
- 45. Private Argmax While |E| > 1: 1. Alice computes π(E), a random permutation π applied to E. 2. Alice randomly chooses f ∈ F and computes f(Enc(P_{π(1)})) and f(Enc(P_{π(2)})). 3. Next, Alice uses the additive homomorphic property of Enc and the commutativity of f with encryption to evaluate E = f(Enc(P_{π(1)})) − f(Enc(P_{π(2)})) = Enc(f(P_{π(1)} − P_{π(2)})) 4. Alice sends E to Bob.
- 46. Private Argmax 5. Bob decrypts E and recovers f(P_{π(1)} − P_{π(2)}). If this value is negative, Bob sends the bit b = 0 to Alice; otherwise he sends b = 1. 6. If b = 0, Alice removes E_{π(1)} from π(E); otherwise she removes E_{π(2)}. 7. Alice reverses the permutation to recover E (minus the removed element) and repeats.
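The loop in steps 1–7 can be simulated in plain Python (a sketch only: encryption is omitted, so the masking is modeled directly on the plaintext probabilities; in the real protocol Bob only ever sees the sign of a masked difference of a randomly chosen pair):

```python
import random

def private_argmax(probs):
    """Simulate the Alice/Bob loop: each round Alice picks a random pair of
    remaining candidates, masks their difference with a random positive
    scalar f(x) = kx, and Bob reports only the sign bit b."""
    candidates = list(range(len(probs)))
    while len(candidates) > 1:
        i, j = random.sample(candidates, 2)   # Alice's random permutation picks a pair
        k = random.randrange(1, 5)            # random monotone scalar, |k| < 5
        masked = k * (probs[i] - probs[j])    # Enc(f(P_i - P_j)) in the real protocol
        b = 0 if masked < 0 else 1            # Bob decrypts and returns one bit
        candidates.remove(i if b == 0 else j) # drop the smaller of the pair
    return candidates[0]

probs = [0.05, 0.62, 0.33]
assert private_argmax(probs) == 1             # index of the largest class probability
```

Each round eliminates one candidate, so the protocol terminates after |E| − 1 comparisons, and the random pairing plus the random mask hide both the ordering and the magnitudes from Bob.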
- 47. Experimental Results The implementation was carried out in C++ and utilized the publicly available Wisconsin Breast Cancer data set from the UCI Machine Learning Repository.
            | S = 100             | S = 400
            | Time (s) | Accuracy | Time (s) | Accuracy
Encrypted   | 0.179333 | 0.9622   | 0.572690 | 0.9752
Unencrypted | 0.00001  | 0.9691   | 0.00001  | 0.9858
The sizes of the training datasets tested, S, are 100 and 400. The sizes of the corresponding testing datasets were 583 and 283, respectively.
- 48. Thank You!