Efficient Similarity Search over Encrypted Data
Mehmet Kuzu, Mohammad Saiful Islam, and Murat Kantarcioglu
IEEE 28th International Conference on Data Engineering
ICDE 2012
Washington, DC - USA - 2012
KDE Seminar
April 25, 2016
Mateus Cruz
Introduction Preliminaries Proposal Experiments Conclusion
OUTLINE
1 Introduction
2 Preliminaries
3 Proposal
4 Experiments
5 Conclusion
SCENARIO
Outsourcing storage and processing
Privacy concerns
Encrypt data before uploading
Loss of utility
Searchable encryption schemes
Only exact query matching
PROPOSAL
Similarity search over encrypted data
Using locality sensitive hashing (LSH)
Contributions
Secure LSH index
Fault tolerant keyword search
Separation of leaked information
Schemes
Basic single server
Two round multi-server
Paillier index based one round multi-server
DEFINITIONS
$D = \{D_1, D_2, \ldots, D_n\}$
Collection of sensitive data items
$F_i = \{f_{i1}, \ldots, f_{iz}\}$
Feature set of item $D_i$
$C_i$ is the encrypted form of $D_i$
I is a secure index
Used to find items having a specific feature
SIMILARITY SEARCHABLE ENC. (1/2)
Keygen(ψ): Outputs a key $K \in \{0, 1\}^{\psi}$
Enc(K, Di): Encrypts Di into Ci
– Probabilistic encryption
Dec(K, Ci): Decrypts Ci into Di
SIMILARITY SEARCHABLE ENC. (2/2)
BuildIndex(K, D)
Extract feature set from D
Outputs the index I
Trapdoor(K, f)
Generates a trapdoor T for a feature f ∈ F
– Trapdoor: value that allows search over probabilistic encryption
Search(I, T)
Search on I for the trapdoor T
Outputs an encrypted result collection C
FUZZY SEARCH
dist : F × F → R
Gives the distance between two features
α and β (α < β)
Thresholds for the similarity metric
FuzzySearch(I, T)
Search on I for the trapdoor T
Outputs an encrypted result collection C
With high probability
– $C_j \in C$ if $\exists f_i \in F_j$ such that $\mathrm{dist}(f_i, f) \le \alpha$
– $C_j \notin C$ if $\forall f_i \in F_j$, $\mathrm{dist}(f_i, f) \ge \beta$
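The two thresholds can be illustrated with a small plaintext sketch. It assumes features are words compared by Jaccard distance over their 2-gram sets; the distance choice, the helper names (`bigrams`, `jaccard_distance`, `classify`), and the values of α and β are illustrative assumptions, not taken from the slides. Items in the third group may or may not be returned, since the definition leaves the gap between α and β open.

```python
def bigrams(word):
    """Set of 2-grams of a word (illustrative feature encoding)."""
    return {word[i:i + 2] for i in range(len(word) - 1)}

def jaccard_distance(a, b):
    """1 - |a ∩ b| / |a ∪ b| for two sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def classify(items, query, alpha, beta):
    """Split item ids into must-return, must-exclude, and unconstrained."""
    q = bigrams(query)
    must, must_not, either = [], [], []
    for item_id, features in items.items():
        dists = [jaccard_distance(bigrams(f), q) for f in features]
        if any(d <= alpha for d in dists):
            must.append(item_id)        # some feature is within alpha
        elif all(d >= beta for d in dists):
            must_not.append(item_id)    # every feature is at least beta away
        else:
            either.append(item_id)      # no guarantee in either direction
    return must, must_not, either

items = {
    "D1": ["encryption", "search"],
    "D2": ["encription"],   # typo of the query word
    "D3": ["banana"],
}
must, must_not, either = classify(items, "encryption", alpha=0.4, beta=0.8)
print(must, must_not, either)
```

A real FuzzySearch only meets these conditions with high probability, because it works through LSH buckets rather than exact distances.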
LOCALITY SENSITIVE HASHING
Approximation algorithm
Near neighbor search
High dimensional spaces
Maps objects into several buckets
Similar objects share a bucket
(r1, r2, p1, p2)-sensitive family
If dist(x, y) ≤ r1, Pr[h(x) = h(y)] ≥ p1
If dist(x, y) ≥ r2, Pr[h(x) = h(y)] ≤ p2
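One way to see the sensitivity property is to measure collision rates empirically. The sketch below assumes MinHash over sets with Jaccard distance as the concrete family (for which the collision probability of one hash function equals the Jaccard similarity); the slide does not name the family, and the function names and the seeded SHA-256 construction are illustrative.

```python
import hashlib

def minhash(items, seed):
    """One hash function h from the family: the minimum seeded digest over a set."""
    return min(hashlib.sha256(f"{seed}:{x}".encode()).hexdigest() for x in items)

def collision_rate(a, b, num_hashes=200):
    """Fraction of the hash functions on which sets a and b collide."""
    hits = sum(minhash(a, s) == minhash(b, s) for s in range(num_hashes))
    return hits / num_hashes

def jaccard_distance(a, b):
    return 1 - len(a & b) / len(a | b)

near_a = set("abcdefghij")
near_b = set("abcdefghik")   # differs in one element, distance 2/11
far = set("qrstuvwxyz")      # disjoint from near_a, distance 1

print(jaccard_distance(near_a, near_b), collision_rate(near_a, near_b))
print(jaccard_distance(near_a, far), collision_rate(near_a, far))
```

Close sets collide on most hash functions and distant sets on few, which is what lets bucket sharing stand in for a distance comparison.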
SECURE INDEX (1/2)
1 Feature extraction
Maps $D_i$ to $F_i = \{f_{i1}, \ldots, f_{iz}\}$
2 Metric space translation
Translate features to vectors
SECURE INDEX (2/2)
3 Bucket index construction
Map vectors to buckets by applying LSH
A bucket is a bit vector of size $\ell$
– $\ell$: total number of data items
Each feature maps to λ buckets
– λ: number of used hash functions
4 Bucket index encryption
Encrypt bucket identifiers and contents
– $[\mathrm{Enc}_{K_{id}}(B_j), \mathrm{Enc}_{K_{payload}}(V_{B_j})] \in I$
Add some fake records into the index
– Hide the number of features in the dataset
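Steps 3 and 4 can be sketched as follows, using only the Python standard library. Several pieces are assumptions made for illustration: features are words of at least two characters, the LSH functions are MinHash over 2-grams, bucket identifiers are protected with HMAC-SHA256 used as a deterministic PRF, and `toy_encrypt` is a SHAKE-256 XOR keystream standing in for a real cipher such as AES-CTR. None of the names below come from the slides.

```python
import hashlib
import hmac
import os

def bigrams(word):
    return {word[i:i + 2] for i in range(len(word) - 1)}

def lsh_keys(feature, num_tables, k=2):
    """Bucket labels g_1(f)..g_lambda(f); each g_j concatenates k MinHash values."""
    grams = bigrams(feature)  # assumes len(feature) >= 2
    keys = []
    for j in range(num_tables):
        parts = [
            min(hashlib.sha256(f"{j}:{i}:{g}".encode()).hexdigest()[:8] for g in grams)
            for i in range(k)
        ]
        keys.append(f"{j}|" + "|".join(parts))
    return keys

def prf(key, label):
    """Deterministic protection of a bucket identifier (HMAC-SHA256 as a PRF)."""
    return hmac.new(key, label.encode(), hashlib.sha256).digest()

def toy_encrypt(key, plaintext):
    """Stand-in for AES-CTR (NOT secure): XOR with a keystream under a fresh nonce."""
    nonce = os.urandom(16)
    stream = hashlib.shake_256(key + nonce).digest(len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))

def toy_decrypt(key, ciphertext):
    nonce, body = ciphertext[:16], ciphertext[16:]
    stream = hashlib.shake_256(key + nonce).digest(len(body))
    return bytes(c ^ s for c, s in zip(body, stream))

def build_index(items, k_id, k_payload, num_tables=4, num_fake=2):
    """items: list of feature lists; item j owns bit j of every bucket vector."""
    n_bytes = (len(items) + 7) // 8
    buckets = {}
    for j, features in enumerate(items):
        for f in features:
            for label in lsh_keys(f, num_tables):
                vec = buckets.setdefault(label, bytearray(n_bytes))
                vec[j // 8] |= 1 << (j % 8)
    index = {prf(k_id, label): toy_encrypt(k_payload, bytes(vec))
             for label, vec in buckets.items()}
    for _ in range(num_fake):  # fake records hide the true number of buckets
        index[os.urandom(32)] = os.urandom(16 + n_bytes)
    return index

k_id, k_payload = os.urandom(32), os.urandom(32)
items = [["encryption"], ["search"], ["encryption", "index"]]
index = build_index(items, k_id, k_payload)
```

Under the same assumptions, a query for a feature `f` would look up `prf(k_id, label)` for each label in `lsh_keys(f, num_tables)` and decrypt the returned vectors with `k_payload`.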
BASIC SECURE SEARCH SCHEME
1 Key generation
Creates private keys
2 Index construction
Creates index for collection D
3 Data encryption
Encrypts and uploads data items, and shares keys and hash functions with users
4 Trapdoor construction
5 Search
6 Data decryption
TRAPDOOR CONSTRUCTION
A user wants items that contain fi
Apply LSH to construct the plain query
$g_1(f_i), \ldots, g_\lambda(f_i)$, $g_i \in g$
Encrypts each component of the query
Using the key received earlier
$T_{f_i} = (\mathrm{Enc}(g_1(f_i)), \ldots, \mathrm{Enc}(g_\lambda(f_i)))$
Sends $T_{f_i}$ to the server
SEARCH (1/2)
The server searches the index using $T_{f_i}$
Looks up each component of $T_{f_i}$
Returns the corresponding bit vectors
The user...
Decrypts the bit vectors
Ranks the data identifiers
SEARCH (2/2)
Rank using score(id(Dj))
Number of common buckets between Dj and fi
More buckets in common means higher rank
Users send desired identifiers to the server
The server returns the encrypted items
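The ranking step on the user side can be sketched as below. It assumes the bit vectors have already been decrypted and are given as lists of 0/1 values, one position per data identifier; `rank` and the tie-breaking by identifier are illustrative choices, not part of the slides.

```python
from collections import Counter

def rank(bit_vectors, top_t):
    """score(j) = number of returned bucket vectors whose bit j is set."""
    scores = Counter()
    for vec in bit_vectors:
        for j, bit in enumerate(vec):
            if bit:
                scores[j] += 1
    # Highest score first; ties broken by the smaller identifier.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:top_t]

# One decrypted vector per trapdoor component that matched a bucket.
vectors = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 1, 1],
]
ranked = rank(vectors, top_t=2)
print(ranked)  # [(0, 3), (2, 2)]
```

The identifiers in `ranked` are the ones the user would then request from the server.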
BASIC SECURE SEARCH SCHEME
1 Key generation
2 Index construction
3 Data encryption
4 Trapdoor construction
5 Search
6 Data decryption
The user decrypts the received items and obtains the plaintext data
Problem!
Leakage of the association between identifiers and trapdoors
MULTI-SERVER SCHEME
Use two servers instead of one
One server for the index
Another server for the data items
Honest-but-curious servers
Do not collaborate with each other
TWO ROUND SEARCH SCHEME
The data owner...
Sends the index to server Bob
Sends encrypted items to server Charlie
A user...
Sends trapdoors to server Bob
Receives encrypted bit vectors
Decrypts and ranks vectors
Sends desired identifiers to server Charlie
Receives encrypted items from server Charlie
TWO ROUND SCHEME ARCHITECTURE
[Figure: architecture of the two round scheme]
Problem!
Users still do a lot of work
PAILLIER INDEX BASED SEARCH
One round search
Minimize client computation
Transfer the burden to the servers
Servers communicate with each other
Paillier index
Use of the Paillier cryptosystem
– Probabilistic
– Homomorphic additive property
Dec(Enc(m1) ∗ Enc(m2)) = m1 + m2
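The additive property follows directly from the form of a Paillier ciphertext. The derivation below uses the standard definition of the cryptosystem, with public modulus $n$, generator $g$, and random values $r_1, r_2$; these symbols are not on the slide and are introduced here only for the derivation.

```latex
\begin{align*}
\mathrm{Enc}(m; r) &= g^{m}\, r^{n} \bmod n^{2}, \qquad r \in \mathbb{Z}_n^{*} \\
\mathrm{Enc}(m_1; r_1) \cdot \mathrm{Enc}(m_2; r_2)
  &= g^{m_1 + m_2}\, (r_1 r_2)^{n} \bmod n^{2} \\
  &= \mathrm{Enc}(m_1 + m_2 \bmod n;\; r_1 r_2)
\end{align*}
```

Multiplying ciphertexts therefore adds the plaintexts modulo $n$, and the product is itself a valid ciphertext that only the holder of the private key can decrypt.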
PAILLIER INDEX
Keep the encrypted form of each bit
Instead of a single encrypted bit vector
Traditional index: $(\pi_s, \sigma_{V_s}) \in I$
$\pi_s$: Encrypted bucket ID
$\sigma_{V_s}$: Encrypted bucket vector
Paillier index: $(\pi_s, [e_{s1}, \ldots, e_{s\ell}]) \in I$
$e_{sj} = \mathrm{Enc}_{K_{pub}}(1)$ if $V_s[id(D_j)] = 1$
$e_{sj} = \mathrm{Enc}_{K_{pub}}(0)$ if $V_s[id(D_j)] = 0$
ONE ROUND SEARCH SCHEME (1/2)
After the Paillier index is constructed...
Send index to Bob with Kpub
Send encrypted items to Charlie with Kpriv
Trapdoor construction
Multi-component trapdoor $T_{f_i} = \{\pi_1, \ldots, \pi_\lambda\}$
A user sends $T_{f_i}$ to Bob along with t
– Retrieval of top t items
ONE ROUND SEARCH SCHEME (2/2)
Index search (Bob)
Scores computed using homomorphic addition
– Common buckets between $T_{f_i}$ and a data item
Sends (i, score(i)) to Charlie along with t
Identifier resolution (Charlie)
Decrypts the received scores
Ranks items according to scores
Sends the encrypted results to the user
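The division of work between Bob and Charlie can be sketched with a toy Paillier implementation. Everything here is illustrative: the primes are tiny and fixed, `random` is used instead of a cryptographic source, the generator is taken as g = n + 1, and the function names (`bob_scores`, `charlie_top_t`) and string bucket identifiers are hypothetical. It needs Python 3.8 or later for `pow(x, -1, n)`.

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy Paillier keys from small fixed primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid for generator g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) * pow(r, n, n2) % n2  # (n+1)^m = 1 + m*n mod n^2

def decrypt(n, priv, c):
    lam, mu = priv
    x = pow(c, lam, n * n)
    return (x - 1) // n * mu % n

def bob_scores(n, paillier_index, trapdoor):
    """Bob: multiply ciphertexts per item over matched buckets (adds the bits)."""
    n2 = n * n
    num_items = len(next(iter(paillier_index.values())))
    scores = [encrypt(n, 0) for _ in range(num_items)]
    for pi in trapdoor:
        row = paillier_index.get(pi)
        if row is None:
            continue  # trapdoor component with no matching bucket
        scores = [s * e % n2 for s, e in zip(scores, row)]
    return scores

def charlie_top_t(n, priv, enc_scores, t):
    """Charlie: decrypt the scores, rank, and keep the top t identifiers."""
    plain = [(decrypt(n, priv, c), j) for j, c in enumerate(enc_scores)]
    plain.sort(key=lambda sj: (-sj[0], sj[1]))
    return [j for score, j in plain[:t] if score > 0]

n, priv = keygen()
bits = {"pi1": [1, 0, 1], "pi2": [1, 0, 0], "pi3": [0, 1, 1]}
index = {pi: [encrypt(n, b) for b in row] for pi, row in bits.items()}
enc = bob_scores(n, index, ["pi1", "pi2", "missing"])
top = charlie_top_t(n, priv, enc, t=2)
print(top)  # [0, 2]
```

Bob never sees a plaintext bit or score, and Charlie sees scores but not the trapdoor, which is the separation of leaked information the scheme aims for.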
ONE ROUND SCHEME ARCHITECTURE
[Figure: architecture of the one round scheme]
SETUP
Scenario
Error aware keyword search
Datasets
5000 random Enron e-mails
Index construction
2-grams embedded into 500-bit Bloom filter
15 hash functions
AES in CTR mode
128-bit key
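The keyword embedding can be sketched as follows. It reads the "15 hash functions" line as the Bloom filter's hash count, which is an assumption, and it uses seeded SHA-256 as the hash functions; the slides do not say which hashes were used. Words with a typo share most 2-grams, so their filters overlap heavily and their Jaccard distance stays small.

```python
import hashlib

def bigrams(word):
    return {word[i:i + 2] for i in range(len(word) - 1)}

def bloom_encode(word, num_bits=500, num_hashes=15):
    """Positions of the bits set when the word's 2-grams are inserted."""
    on = set()
    for gram in bigrams(word):
        for i in range(num_hashes):
            digest = hashlib.sha256(f"{i}:{gram}".encode()).digest()
            on.add(int.from_bytes(digest[:8], "big") % num_bits)
    return on

def jaccard_distance(a, b):
    return 1 - len(a & b) / len(a | b)

query = bloom_encode("encryption")
d_typo = jaccard_distance(query, bloom_encode("encription"))
d_other = jaccard_distance(query, bloom_encode("banana"))
print(d_typo, d_other)
```

With this encoding the misspelled word lands much closer to the query than an unrelated word, which is what the LSH step relies on for error-aware search.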
RETRIEVAL EVALUATION
t most similar items retrieved
PERFORMANCE EVALUATION (1/2)
Basic search scheme
PERFORMANCE EVALUATION (2/2)
One round search scheme
SUMMARY
Similarity searchable encryption scheme
LSH secure index
Basic scheme
Not secure
Two round scheme
Good for a large number of data items
One round scheme (Paillier index)
Good for a large number of features
Detailed Algorithms
EXTRA SLIDES
BUILD INDEX
SEARCH ROUND 1
SEARCH ROUND 2
ONE ROUND SEARCH
