Presentation given at the KDE seminar (University of Tsukuba) about the paper "Efficient Similarity Search over Encrypted Data"*.
This presentation is based on the uploader's understanding of the paper and may contain inaccurate interpretations.
A summary of the paper is available at: https://mshcruz.wordpress.com/2016/09/02/summary-efficient-similarity-search-over-encrypted-data/
*Kuzu et al.: "Efficient Similarity Search over Encrypted Data". ICDE 2012.
1. Efficient Similarity Search over Encrypted Data
Mehmet Kuzu, Mohammad Saiful Islam, and Murat Kantarcioglu
IEEE 28th International Conference on Data Engineering
ICDE 2012
Washington, DC - USA - 2012
KDE Seminar
April 25, 2016
Mateus Cruz
4. Introduction Preliminaries Proposal Experiments Conclusion
SCENARIO
Outsourcing storage and processing
Privacy concerns
Encrypt data before uploading
Loss of utility
Searchable encryption schemes
Only exact query matching
PROPOSAL
Similarity search over encrypted data
Using locality sensitive hashing (LSH)
Contributions
Secure LSH index
Fault tolerant keyword search
Separation of leaked information
Schemes
Basic single server
Two round multi-server
Paillier index based one round multi-server
DEFINITIONS
D = {D1, D2, . . . , Dn}
Collection of sensitive data items
Fi = {fi1, . . . , fiz}
Feature set of item Di
Ci is the encrypted form of Di
I is a secure index
Used to find items having a specific feature
SIMILARITY SEARCHABLE ENC. (1/2)
Keygen(ψ): Outputs a key K (K ∈ {0, 1}^ψ)
Enc(K, Di): Encrypts Di into Ci
Probabilistic encryption
Dec(K, Ci): Decrypts Ci into Di
SIMILARITY SEARCHABLE ENC. (2/2)
BuildIndex(K, D)
Extract feature set from D
Outputs the index I
Trapdoor(K, f)
Generates a trapdoor T for a feature f ∈ F
Trapdoor: value that allows search over probabilistic encryption
Search(I, T)
Search on I for the trapdoor T
Outputs an encrypted result collection C
FUZZY SEARCH
dist : F × F → R
Gives the distance between two features
α and β (α < β)
Thresholds for the similarity metric
FuzzySearch(I, T)
Search on I for the trapdoor T
Outputs an encrypted result collection C
With high probability
– Cj ∈ C if ∃fi ∈ Fj (dist(fi, f) ≤ α)
– Cj ∉ C if ∀fi ∈ Fj (dist(fi, f) ≥ β)
LOCALITY SENSITIVE HASHING
Approximation algorithm
Near neighbor search
High dimensional spaces
Maps objects into several buckets
Similar objects are likely to share a bucket
(r1, r2, p1, p2)-sensitive family
If dist(x, y) ≤ r1, Pr[h(x) = h(y)] ≥ p1
If dist(x, y) ≥ r2, Pr[h(x) = h(y)] ≤ p2
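As a rough illustration of the (r1, r2, p1, p2)-sensitive property, the sketch below uses MinHash over character bigrams, a standard LSH family for Jaccard distance. The bigram representation, the SHA-256 based hash functions, and the names bigrams, minhash and collision_rate are assumptions made for this example and are not taken from the slides.

```python
import hashlib


def bigrams(word):
    """Character bigrams of a word, used here as its set representation."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def minhash(items, seed):
    """One MinHash function h_seed: the smallest seeded hash over the set."""
    return min(
        hashlib.sha256(f"{seed}:{item}".encode()).hexdigest()
        for item in items
    )


def collision_rate(x, y, num_hashes=200):
    """Fraction of hash functions h with h(x) == h(y).

    For MinHash this estimates the Jaccard similarity of x and y,
    so close sets collide often and distant sets collide rarely.
    """
    hits = sum(minhash(x, s) == minhash(y, s) for s in range(num_hashes))
    return hits / num_hashes


if __name__ == "__main__":
    a = bigrams("encryption")
    b = bigrams("encription")   # one typo: small distance to a
    c = bigrams("database")     # no bigram in common with a
    print(collision_rate(a, b))
    print(collision_rate(a, c))
```

A word with a single typo still collides with the original on most hash functions, while a word with no bigram in common never does, which is the behaviour the two probability bounds describe.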
SECURE INDEX (1/2)
1 Feature extraction
Maps Di to Fi = {fi1, . . . , fiz}
2 Metric space translation
Translate features to vectors
SECURE INDEX (2/2)
3 Bucket index construction
Map vectors to buckets by applying LSH
A bucket is a bit vector with one bit per data item
– Its size is the total number of data items
Each feature maps to λ buckets
– λ: number of hash functions used
4 Bucket index encryption
Encrypt bucket identifiers and contents
– [Enc_Kid(Bj), Enc_Kpayload(V_Bj)] ∈ I
Add some fake records into the index
– Hide the number of features in the dataset
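The bucket index construction (step 3) can be sketched as follows, before any encryption is applied. This is a minimal sketch under assumptions: features are words of at least two characters, the LSH functions g_k are seeded MinHash functions over character bigrams, and the names NUM_HASHES, bucket_id and build_bucket_index are hypothetical. Encryption of bucket identifiers and contents (step 4) and the fake records are left out.

```python
import hashlib

NUM_HASHES = 4  # lambda: number of hash functions used (value chosen for the example)


def bigrams(word):
    """Character bigrams of a word (assumes at least two characters)."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def bucket_id(k, feature):
    """g_k(feature): toy LSH bucket identifier from the k-th MinHash function."""
    smallest = min(
        hashlib.sha256(f"{k}:{b}".encode()).hexdigest()
        for b in bigrams(feature)
    )
    return f"{k}:{smallest[:16]}"


def build_bucket_index(items):
    """Build the plaintext bucket index.

    items is a list of feature lists; item j has identifier j.
    Each bucket holds a bit vector with one bit per data item, and
    bit j is set when some feature of item j hashes into that bucket.
    """
    index = {}
    for item_id, features in enumerate(items):
        for feature in features:
            for k in range(NUM_HASHES):
                vector = index.setdefault(bucket_id(k, feature), [0] * len(items))
                vector[item_id] = 1
    return index


if __name__ == "__main__":
    collection = [["encryption"], ["encryption", "search"], ["database"]]
    for bucket, bits in sorted(build_bucket_index(collection).items()):
        print(bucket, bits)
```

Items 0 and 1 both contain "encryption", so they share every bucket that this feature maps to.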
BASIC SECURE SEARCH SCHEME
1 Key generation
Creates private keys
2 Index construction
Creates index for collection D
3 Data encryption
Encrypts and uploads data items, and shares keys and hash functions with users
4 Trapdoor construction
5 Search
6 Data decryption
TRAPDOOR CONSTRUCTION
A user wants items that contain fi
Apply LSH to construct the plain query
g1(fi), . . . , gλ(fi), gi ∈ g
Encrypts each component of the query
Using the key received earlier
Tfi = (Enc(g1(fi)), . . . , Enc(gλ(fi)))
Sends Tfi to the server
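The trapdoor construction can be sketched as below. As an assumption for this example, Enc is replaced by HMAC-SHA256 used as a deterministic keyed function, so that equal bucket identifiers always produce equal trapdoor components; the slides do not say which primitive is used. The LSH functions g_k are the same hypothetical seeded MinHash functions over bigrams, and the names bucket_id and trapdoor are illustrative.

```python
import hashlib
import hmac


def bigrams(word):
    """Character bigrams of a word (assumes at least two characters)."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def bucket_id(k, feature):
    """g_k(feature): toy LSH bucket identifier from the k-th MinHash function."""
    smallest = min(
        hashlib.sha256(f"{k}:{b}".encode()).hexdigest()
        for b in bigrams(feature)
    )
    return f"{k}:{smallest[:16]}"


def trapdoor(key, feature, num_hashes=4):
    """T_f = (Enc(g_1(f)), ..., Enc(g_lambda(f))).

    key is the secret (bytes) shared by the data owner; HMAC stands in
    for the encryption of each query component.
    """
    return tuple(
        hmac.new(key, bucket_id(k, feature).encode(), hashlib.sha256).hexdigest()
        for k in range(num_hashes)
    )


if __name__ == "__main__":
    shared_key = b"key shared by the data owner"
    for component in trapdoor(shared_key, "encryption"):
        print(component)
```

Without the key, the server cannot recompute the components for a feature of its choice, but it can still match them against the encrypted bucket identifiers stored in the index.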
SEARCH (1/2)
The server searches on the index using Tfi
Each component of Tfi
Returns corresponding bit vectors
The user...
Decrypts the bit vectors
Ranks the data identifiers
SEARCH (2/2)
Rank using score(id(Dj))
Number of common buckets between Dj and fi
More buckets in common means higher rank
Users send desired identifiers to the server
The server returns the encrypted items
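The ranking step on the user side can be sketched as follows, assuming the bit vectors returned by the server have already been decrypted. The function name rank and the tie-breaking by identifier are choices made for this example.

```python
def rank(bit_vectors, top=None):
    """Rank data identifiers by the number of buckets shared with the query.

    bit_vectors holds one decrypted bit vector per matched trapdoor
    component; position j of each vector refers to data item j.
    Returns (identifier, score) pairs, highest score first.
    """
    scores = [sum(bits) for bits in zip(*bit_vectors)]
    candidates = [j for j, score in enumerate(scores) if score > 0]
    candidates.sort(key=lambda j: (-scores[j], j))
    return [(j, scores[j]) for j in candidates][:top]


if __name__ == "__main__":
    vectors = [
        [1, 0, 1],
        [1, 0, 0],
        [1, 1, 0],
    ]
    print(rank(vectors))         # [(0, 3), (1, 1), (2, 1)]
    print(rank(vectors, top=1))  # [(0, 3)]
```

Item 0 appears in all three returned buckets, so it is ranked first; the user would then request the top identifiers from the server.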
BASIC SECURE SEARCH SCHEME
1 Key generation
2 Index construction
3 Data encryption
4 Trapdoor construction
5 Search
6 Data decryption
The user decrypts the received items and obtains the plaintext data
Problem!
Leakage of the association between identifiers and trapdoors
MULTI-SERVER SCHEME
Use two servers instead of one
One server for the index
Another server for the data items
Honest-but-curious servers
Do not collaborate with each other
TWO ROUND SEARCH SCHEME
The data owner...
Sends the index to server Bob
Sends encrypted items to server Charlie
A user...
Sends trapdoors to server Bob
Receives encrypted bit vectors
Decrypts and ranks vectors
Sends desired identifiers to server Charlie
Receives encrypted items from server Charlie
PAILLIER INDEX BASED SEARCH
One round search
Minimize client computation
Transfer the burden to the servers
Servers communicate with each other
Paillier index
Use of the Paillier cryptosystem
– Probabilistic
– Homomorphic additive property
Dec(Enc(m1) ∗ Enc(m2)) = m1 + m2
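The additive homomorphic property can be checked with a toy implementation of the Paillier cryptosystem. This sketch uses the common choice g = n + 1 and two very small primes picked only for the example, so it is insecure and meant purely to show that multiplying ciphertexts adds plaintexts. The function names are illustrative, and the modular inverse via pow requires Python 3.8 or later.

```python
import math
import secrets


def keygen(p=1789, q=1861):
    """Toy key generation from two small primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Enc(m) = (n + 1)^m * r^n mod n^2, with a fresh random r each time."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2


def decrypt(n, private_key, c):
    """Dec(c) = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    lam, mu = private_key
    return (pow(c, lam, n * n) - 1) // n * mu % n


if __name__ == "__main__":
    n, private_key = keygen()
    c1, c2 = encrypt(n, 3), encrypt(n, 4)
    print(decrypt(n, private_key, c1 * c2 % (n * n)))  # 7
```

Because a fresh random r is drawn for every encryption, two encryptions of the same message almost always differ, which is the probabilistic property listed above.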
PAILLIER INDEX
Keep the encrypted form of each bit
Instead of a single encrypted bit vector
Traditional index: (πs, σVs) ∈ I
πs: Encrypted bucket ID
σVs: Encrypted bucket vector
Paillier index: (πs, [es1, es2, . . .]) ∈ I, with one ciphertext esj per data item Dj
esj = Enc_Kpub(1) if Vs[id(Dj)] = 1
esj = Enc_Kpub(0) if Vs[id(Dj)] = 0
ONE ROUND SEARCH SCHEME (1/2)
After the Paillier index is constructed...
Send index to Bob with Kpub
Send encrypted items to Charlie with Kpriv
Trapdoor construction
Multi-component trapdoor Tfi = {π1, . . . , πλ}
A user sends Tfi to Bob along with t
– Retrieval of top t items
ONE ROUND SEARCH SCHEME (2/2)
Index search (Bob)
Scores computed using homomorphic addition
– Common buckets between Tfi and a data item
Sends (i, score(i)) to Charlie along with t
Identifier resolution (Charlie)
Decrypts the received scores
Ranks items according to scores
Sends the encrypted results to the user
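The one round search can be sketched end to end with the same toy Paillier construction (small primes, g = n + 1, insecure). Everything here is an assumption made for illustration: both roles live in one module, the bucket identifiers are plain strings standing in for the encrypted identifiers πs, and Charlie returns identifiers instead of the encrypted items. The function names are hypothetical.

```python
import math
import secrets

P, Q = 1789, 1861                                # toy primes, illustration only
N = P * Q                                        # public key (known to Bob)
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)                             # private key (LAM, MU), known to Charlie


def enc(m):
    """Paillier encryption under the public key N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return pow(N + 1, m, N2) * pow(r, N, N2) % N2


def dec(c):
    """Paillier decryption with the private key."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def build_paillier_index(plain_index):
    """Replace every bit of every bucket vector by its own ciphertext."""
    return {pi: [enc(bit) for bit in bits] for pi, bits in plain_index.items()}


def bob_scores(index, trapdoor, num_items):
    """Bob: add up the encrypted bits of the matched buckets for each item.

    Multiplying ciphertexts adds the hidden bits, so Bob obtains the
    encrypted number of common buckets without seeing any bit.
    """
    scores = [enc(0) for _ in range(num_items)]
    for pi in trapdoor:
        for j, e in enumerate(index.get(pi, [])):
            scores[j] = scores[j] * e % N2
    return list(enumerate(scores))


def charlie_top(encrypted_scores, t):
    """Charlie: decrypt the scores, rank the items, keep the top t matches."""
    scored = [(j, dec(c)) for j, c in encrypted_scores]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [j for j, score in scored[:t] if score > 0]


if __name__ == "__main__":
    plain = {"b1": [1, 0, 1], "b2": [1, 0, 0], "b3": [0, 1, 0]}
    index = build_paillier_index(plain)
    print(charlie_top(bob_scores(index, ["b1", "b2"], 3), t=2))  # [0, 2]
```

The user sends a single message and receives the results in a single reply, while the scoring and ranking work is split between the two servers.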
SUMMARY
Similarity searchable encryption scheme
LSH secure index
Basic scheme
Not secure: leaks the association between identifiers and trapdoors
Two round scheme
Good for a large number of data items
One round scheme (Paillier index)
Good for a large number of features