Security of Social Information from Query Analysis in DaaS
Junpei Kawamoto and Masatoshi Yoshikawa, Kyoto University, Japan
Database as a Service
DaaS is one of the components of cloud computing. Data are stored and managed by service providers. DaaS introduces a risk of compromise.
[Figure: users Alice, Bob, and Carol, located in Tokyo, Paris, and London, connect to a DaaS server.]
Database as a Service
There are studies that aim to guarantee safety: securing the data stored on the servers, preventing data from being guessed through query analysis, and protecting personal information (name, age, …) from query analysis. Is that enough to guard against compromise?
Overview of this presentation
1. We introduce a new problem, social information, that is, relational information between users (e.g. friend, co-worker).
2. We discuss an attack model that extracts social information from the query log. For example, users who both ask "What's the schedule at 3:00pm, March 6th in “room A”?" seem to have a relation.
3. We propose a method that protects social information from query analysis: a conversion server rewrites the query into a form such as match(binary(hash(Where)), “01*”).
What is Social Information?
Social information is information about users' relations (e.g. friend, co-worker, executive). It is NOT personal information, so it is not protected by any laws in Japan.
Risks: the structure of the users' organization can be extracted, and the strength of relations may indicate the interests of the organization.
Next, I will introduce the attack model for this social information.
An assumption for our attack model
Users who send the same characteristic queries have a relation, e.g. users who request an event at a particular date and time.
[Figure: Alice, Bob, and Carol each ask the DaaS server "What's the schedule at 3:00pm, March 6th" for “room A” or “room C”.]
We presuppose that they share an interest and therefore have a relation.
Attack model
Attackers can obtain the query log on the servers. The log can be described as the table below.
Alice: What's the schedule at 3:00pm, March 6th in “room A”? -> Date = 0306, Time = 1500, Where = Room A
Bob: What's the schedule at 3:00pm, March 6th in “room C”? -> Date = 0306, Time = 1500, Where = Room C
Carol: What's the schedule at 3:00pm, March 6th in “room A”? -> Date = 0306, Time = 1500, Where = Room A
To compute the similarity between users, the attacker calculates query feature vectors in this model.
Query feature vector
Calculate the frequency of each literal for each user. Normalize: each value is divided by the number of requests of that user.
[Figure: per-user table of literal counts (columns such as Room A, Room B, Room C, 1500, 1600, 1700, 0306, …) and the resulting query feature vectors.]
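The counting and normalization step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the log entries, the function name `query_feature_vectors`, and the choice to key each vector by (attribute, literal) pairs are assumptions made for the example.

```python
from collections import Counter

# Hypothetical query log modeled on the slide's example: (user, query) pairs.
QUERY_LOG = [
    ("Alice", {"Date": "0306", "Time": "1500", "Where": "Room A"}),
    ("Bob",   {"Date": "0306", "Time": "1500", "Where": "Room C"}),
    ("Carol", {"Date": "0306", "Time": "1500", "Where": "Room A"}),
    ("Carol", {"Date": "0306", "Time": "1600", "Where": "Room B"}),
]

def query_feature_vectors(log):
    """Count literal frequencies per user, then divide by that user's request count."""
    counts = {}
    totals = Counter()
    for user, query in log:
        totals[user] += 1
        user_counts = counts.setdefault(user, Counter())
        for attribute, literal in query.items():
            user_counts[(attribute, literal)] += 1
    return {
        user: {key: n / totals[user] for key, n in user_counts.items()}
        for user, user_counts in counts.items()
    }

vectors = query_feature_vectors(QUERY_LOG)
```

In this toy log Carol sent two requests, so a literal she used once gets the value 0.5 and a literal she used in both requests gets 1.0.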
Compute Similarities
We define the cosine value as the similarity, where QV_u is the query vector of user u:
$$\mathrm{sim}(u, v) = \frac{QV_u \cdot QV_v}{\lVert QV_u \rVert \, \lVert QV_v \rVert}$$
If sim(u, v) is greater than a threshold θ, users u and v are judged to have a relation.
[Figure: query feature vectors of Alice, Bob, and Carol, with Sim(Alice, Bob) computed from them.]
Next, I will explain the basic scenario of our approach to prevent this attack.
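A minimal sketch of the similarity test, assuming query vectors are stored as sparse dictionaries. The example vectors and the threshold value 0.8 are illustrative assumptions, not values from the slides.

```python
import math

def cosine_similarity(qv_u, qv_v):
    """Cosine of the angle between two sparse query feature vectors."""
    dot = sum(value * qv_v.get(key, 0.0) for key, value in qv_u.items())
    norm_u = math.sqrt(sum(value * value for value in qv_u.values()))
    norm_v = math.sqrt(sum(value * value for value in qv_v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a user with no queries is similar to no one
    return dot / (norm_u * norm_v)

def have_relation(qv_u, qv_v, theta):
    """Judge two users as related when their similarity exceeds theta."""
    return cosine_similarity(qv_u, qv_v) > theta

# Hypothetical vectors: Alice and Carol ask about Room A, Bob about Room C.
alice = {"0306": 1.0, "1500": 1.0, "Room A": 1.0}
bob = {"0306": 1.0, "1500": 1.0, "Room C": 1.0}
carol = {"0306": 1.0, "1500": 1.0, "Room A": 1.0}

THETA = 0.8  # assumed threshold, for illustration only
```

With these vectors Alice and Bob share two of three literals, so their similarity is 2/3 and falls below the assumed threshold, while Alice and Carol have identical vectors and are judged related.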
Basic Scenario of Query Conversion
To remove the features from queries received by the DaaS server, a conversion server works between the users and the DaaS server.
[Figure: Alice, Bob, and Carol (Tokyo, Paris, London) reach the DaaS server through conversion servers. Legend: the marked links denote a trusted network, such as a local network at a business site.]
Next, I will introduce how one conversion server works. How all servers collaborate with each other is future work.
Query Conversion Tree
We introduce a conversion tree to convert queries. It is based on extendible hashing†. It is a binary tree whose leaf nodes hold strings, and each edge has a label (0 or 1).
[Figure: example conversion tree with inner nodes (root, node 2) and leaf nodes A, B, and C; leaf A holds 0010101, 000101, …]
†R. Fagin, J. Nievergelt, N. Pippenger, and H. R. Strong. Extendible hashing - a fast access method for dynamic files. ACM Transactions on Database Systems, 4(3):315-344, 1979.
Conversion Process
A user asks for schedules and sends the query. Alice's question "What's the schedule at 3:00pm, March 6th in “room A”?" reaches the conversion server as Date = 0306, Time = 1500, Where = “room A”. Let me show how to convert Where = “room A”.
1: Hash the literal of the query: hash(“room A”) = 6
2: Convert the hash value into a binary string: binary(hash(“room A”)) = “0110”
3: Convert the binary string with the conversion tree.
Convert the binary string with the tree
1. The conversion starts from the root node.
2. Compare the 1st character of the binary string with the edge labels and follow the matching edge.
3. Compare the next character with the labels of the edges leaving the node reached in step 2.
4. Repeat step 3 until a leaf node is reached.
Example with the binary string 0110: connecting the labels from the root to the mapped leaf node gives 01. Appending the wild-card character * gives the converted query: 01*.
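The tree walk above can be sketched as follows. The nested-dictionary representation and the exact tree shape (leaf A under 00, leaf B under 01, leaf C under 1) are assumptions chosen to be consistent with the slide's example; the original figure is not preserved in the transcript.

```python
# Inner nodes are dicts keyed by edge label ("0" or "1"); leaves are plain strings.
# The shape below is an assumed reading of the example figure.
CONVERSION_TREE = {
    "0": {"0": "A", "1": "B"},  # inner node 2
    "1": "C",
}

def convert(binary_string, tree):
    """Follow edge labels from the root until a leaf; return (pattern, leaf name)."""
    node = tree
    prefix = ""
    for bit in binary_string:
        if not isinstance(node, dict):
            break  # reached a leaf, remaining characters are ignored
        node = node[bit]
        prefix += bit
    if isinstance(node, dict):
        raise ValueError("binary string ended before reaching a leaf")
    return prefix + "*", node

pattern, leaf = convert("0110", CONVERSION_TREE)
```

Only the characters consumed on the way to the leaf survive in the pattern, which is why every string under the same leaf yields the same converted query.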
Conversion Process
A user asks for schedules and sends the query. Let me show how to convert Where = “room A”.
1: Hash the literal of the query: hash(“room A”) = 6
2: Convert the hash value into a binary string: binary(hash(“room A”)) = “0110”
3: Convert the binary string with the conversion tree: 01*
4: Finally, create the new query: match(binary(hash(Where)), “01*”)
The conversion server receives Date = 0306, Time = 1500, Where = “room A” from Alice and sends the DaaS server the query with the Where condition replaced by match(binary(hash(Where)), “01*”).
Summary of the conversion
The original query is Where = “room A”. The final query is match(binary(hash(Where)), “01*”), where * is a wild-card character and match is a function that compares binary strings with the query pattern.
Result of the conversion: any query whose binary string starts with “01” is converted to “01*”, so no one can distinguish the original queries:
binary(hash(“room A”)) = “0110”
binary(hash(“room X”)) = “0100”
binary(hash(“abc cafe”)) = “0101”
All three become match(binary(hash(Where)), “01*”).
Next, I will explain the method for updating the conversion tree to reduce costs.
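A small sketch of how the converted query might be evaluated. The lookup table stands in for binary(hash(·)) and reuses the three binary strings from the slide; the “room Z” entry, the row contents, and the function names `server_select` and `client_filter` are hypothetical.

```python
# Stand-in for binary(hash(literal)); the first three values come from the slide,
# "room Z" is a made-up non-matching entry.
TOY_BINARY_HASH = {
    "room A": "0110",
    "room X": "0100",
    "abc cafe": "0101",
    "room Z": "1000",
}

def binary_hash(literal):
    return TOY_BINARY_HASH[literal]

def match(binary_string, pattern):
    """Prefix match when the pattern ends with the wild-card '*', else exact match."""
    if pattern.endswith("*"):
        return binary_string.startswith(pattern[:-1])
    return binary_string == pattern

# Hypothetical rows stored on the DaaS server.
ROWS = [
    {"Where": "room A", "Event": "project meeting"},
    {"Where": "room X", "Event": "seminar"},
    {"Where": "abc cafe", "Event": "lunch"},
    {"Where": "room Z", "Event": "interview"},
]

def server_select(rows, pattern):
    """What the DaaS server sees: only the pattern, never the original literal."""
    return [row for row in rows if match(binary_hash(row["Where"]), pattern)]

def client_filter(rows, wanted):
    """The trusted side drops the irrelevant rows returned by the server."""
    return [row for row in rows if row["Where"] == wanted]

candidates = server_select(ROWS, "01*")
answer = client_filter(candidates, "room A")
```

The server returns three rows for “01*” although only one was wanted; those extra rows are the price of hiding the original literal.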
Updating Conversion Tree
The conversion causes some irrelevant data to be returned. We define the cost as the number of data items that user u has to obtain when he or she sends a query mapped to leaf node n. To keep this cost under a given c_max (the maximum allowable cost), we update the conversion tree.
Updating Process (1 of 2)
A target node n is chosen in order of frequency. The set of literals Ls held by the node is divided into 2 sets, Ls0 and Ls1, according to whether the d-th character is 0 or not, where d is the depth of the target node (1-origin).
Example (for simplicity, consider only 4 bits): leaf node n holds Ls = 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111. It is divided into Ls0 = 1000, 1001, 1010, 1011 and Ls1 = 1100, 1101, 1110, 1111.
Updating Process (2 of 2)
Compute the cost for each of the 2 sets (cost0 for Ls0 and cost1 for Ls1) [formula not shown], where Count(u, l) is how many times user u queries with literal l and total_u is the number of queries of user u.
If cost0 or cost1 is greater than c_max (the maximum allowable cost), delete the node Ls, then add a new inner node and 2 new leaves (Ls0 under edge 0 and Ls1 under edge 1).
Next, I will talk about the evaluation.
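The split itself can be sketched as below. The cost test is deliberately left out because its formula is not preserved in this transcript; the nested-dictionary tree, the empty placeholder leaves, and the reading that d = 2 reproduces the slide's example are assumptions.

```python
def split_leaf(literals, d):
    """Divide a leaf's binary strings by their d-th character (1-origin)."""
    ls0 = [s for s in literals if s[d - 1] == "0"]
    ls1 = [s for s in literals if s[d - 1] != "0"]
    return ls0, ls1

def replace_leaf(tree, path, ls0, ls1):
    """Replace the leaf reached by `path` with an inner node holding two new leaves."""
    node = tree
    for bit in path[:-1]:
        node = node[bit]
    node[path[-1]] = {"0": ls0, "1": ls1}
    return tree

# Leaf n from the slide's 4-bit example, reached from the root by edge "1".
LS = ["1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"]
tree = {"0": {"0": [], "1": []}, "1": LS}  # other leaves left empty for brevity

ls0, ls1 = split_leaf(LS, d=2)
replace_leaf(tree, "1", ls0, ls1)
```

After the update, strings starting with 10 and 11 map to different leaves, so a converted query such as “10*” returns fewer irrelevant rows than “1*” did.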
Evaluation Experiment
We selected a dataset used by Löser et al.† This dataset is constructed from the Open Directory. It contains users' groups and queries.
†Alexander Löser, Steffen Staab, and Christoph Tempich: “Semantic Methods for P2P Query Routing”, Multiagent System Technologies (MATES 2005)
Result
The number of users is 133,602 and the number of groups is 6,280.
[Figure: precision and recall of the attack.]
The introduced attack model can extract users' relations with high precision. The higher the precision, the higher the risk.
Result
How much does the query conversion reduce the precision? 0.8 -> 0.55
Result
How much does the query conversion reduce the recall?
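For readers unfamiliar with the two measures, here is a minimal sketch of how the precision and recall of the attack could be computed. It assumes the ground truth is a set of user pairs that really are related (for instance, members of the same group); the slides do not spell out this definition, and the example pairs are made up.

```python
def precision_recall(predicted_pairs, true_pairs):
    """Precision and recall of predicted user relations against known relations."""
    # Relations are unordered, so (u, v) and (v, u) count as the same pair.
    predicted = {frozenset(pair) for pair in predicted_pairs}
    actual = {frozenset(pair) for pair in true_pairs}
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical example: the attacker guesses two relations, one of which is real.
predicted = [("Alice", "Carol"), ("Alice", "Bob")]
truth = [("Carol", "Alice"), ("Bob", "Carol")]
precision, recall = precision_recall(predicted, truth)
```

A lower precision after query conversion means more of the attacker's guessed relations are wrong, which is the effect the results slides report.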

Security of Social Information from Query Analysis in DaaS

  • 1. Security of Social Information from Query Analysis in DaaS. Junpei Kawamoto and Masatoshi Yoshikawa, Kyoto University, Japan
  • 2. Database as a Service. One of the components of cloud computing. Data are stored and managed by service providers. The DaaS brings down a risk of compromise. [Figure: users Alice, Bob, and Carol, with sites Paris, London, and Tokyo, connected to a DaaS server]
  • 3. Database as a Service. There are studies that aim to guarantee safety: security of data stored in the servers; preventing data from being guessed through query analyses; protecting personal information from query analyses. [Figure: personal data (Name, Age, …) held by the DaaS server] Is that enough to guard against compromise?
  • 4. Overview of this presentation. 1. We introduce a new problem, social information, that is, relational information (for example friend or co-worker). 2. We discuss an attack model that extracts the social information from the query log. 3. We propose a method for protecting social information from query analysis. [Figure: Alice's query “What's the schedule at 3:00pm, March 6th in “room A”?” passes through a conversion server and reaches the DaaS server as match(binary(hash(Where)), “01*”); the inference drawn from the query log is labeled “They seem to have a relation”]
  • 5. What is Social Information? Social information is information about users' relations. It is NOT personal information, so it is not protected by any laws in Japan. Risks: the structure of the users' organization can be extracted; the strength of relations may indicate the interests of the organization. [Figure: Alice, Bob, and Carol, with sites Paris, Tokyo, and London, linked by relations labeled friend, co-worker, and executive] Next, I will introduce the attack model for this social information.
  • 6. An assumption for our attack model. Users who send the same characteristic queries have a relation, e.g. users who request the event at a particular date and time. [Figure: Alice, Bob, and Carol send queries to the DaaS server; two of the queries are “What's the schedule at 3:00pm, March 6th in “room A”?” and one is “What's the schedule at 3:00pm, March 6th in “room C”?”] We presuppose that they have the same interest and therefore have a relation.
  • 7. Attack model. Attackers can obtain the query log in the servers, which is described as the table below. [Figure: the queries from Alice, Bob, and Carol are logged at the DaaS server as (Date = 0306, Time = 1500, Where = Room A), (Date = 0306, Time = 1500, Where = Room C), and (Date = 0306, Time = 1500, Where = Room A)] To compute the similarity between users, the attacker calculates query feature vectors in this model.
  • 8. Query feature vector. Calculate literal frequencies. Normalize: each value is divided by the number of requests of the user. [Table: per-user counts and normalized frequencies of literals such as 0306, 1500, 1600, 1700, Room A, Room B, and Room C; cell values not recoverable from the extraction]
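
The normalization on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `query_feature_vector`, the dictionary representation of a query, and the sample log are all hypothetical and do not come from the slides.

```python
from collections import Counter


def query_feature_vector(queries):
    """Map each literal to (times it appears) / (number of requests).

    `queries` is one user's log; each query is a dict such as
    {"Date": "0306", "Time": "1500", "Where": "Room A"} (assumed layout).
    """
    counts = Counter()
    for query in queries:
        # Count every literal (attribute value) used in the query.
        counts.update(query.values())
    total = len(queries)
    if total == 0:
        return {}
    return {literal: n / total for literal, n in counts.items()}


# Hypothetical log for one user: two requests on the same date.
alice_log = [
    {"Date": "0306", "Time": "1500", "Where": "Room A"},
    {"Date": "0306", "Time": "1600", "Where": "Room B"},
]
alice_vector = query_feature_vector(alice_log)
```

A literal that appears in every request gets the value 1.0, and one that appears in half of them gets 0.5, so heavy and light users end up on the same scale.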
  • 9. Compute Similarities. We define the cosine value as the similarity: sim(u, v) = (QVu · QVv) / (|QVu| |QVv|), where QVu is the query vector of user u. If sim(u, v) is greater than a threshold θ, users u and v are judged to have a relation. [Figure: query vectors of Alice, Bob, and Carol and the computation of Sim(Alice, Bob); values not recoverable from the extraction] Next, I will explain the basic scenario of our approach to prevent this attack.
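
The similarity test can be sketched as below. The sparse dictionary representation, the function names, the example vectors, and the threshold values are assumptions made for illustration; the slides do not give a concrete θ.

```python
import math


def cosine_similarity(qv_u, qv_v):
    """Cosine of the angle between two sparse query vectors (dicts)."""
    dot = sum(value * qv_v.get(literal, 0.0) for literal, value in qv_u.items())
    norm_u = math.sqrt(sum(value * value for value in qv_u.values()))
    norm_v = math.sqrt(sum(value * value for value in qv_v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def have_relation(qv_u, qv_v, theta):
    """Judge that two users are related when sim(u, v) exceeds theta."""
    return cosine_similarity(qv_u, qv_v) > theta


# Hypothetical vectors built from one query per user.
alice = {"0306": 1.0, "1500": 1.0, "Room A": 1.0}
bob = {"0306": 1.0, "1500": 1.0, "Room C": 1.0}
carol = {"0306": 1.0, "1500": 1.0, "Room A": 1.0}

sim_alice_bob = cosine_similarity(alice, bob)      # shares 2 of 3 literals
sim_alice_carol = cosine_similarity(alice, carol)  # identical vectors
```

With these made-up vectors, Alice and Bob share the date and time but not the room, so their similarity is 2/3, while Alice and Carol issue the same query and score approximately 1.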
  • 10.
  • 11. The server works between the users and the DaaS server. [Figure: Carol; legend symbol lost in extraction] means a trusted network, such as a local network in a business place.
  • 12. Query Conversion Tree. We introduce a conversion tree to convert queries. It is based on extendible hashing†. It is a binary tree whose leaf nodes hold strings. Each edge has a label (0 or 1). [Figure: a tree with a root, an inner node 2, and leaf nodes A, B, and C, with edges labeled 0 or 1; node A has 0010101, 000101, …] †R. Fagin, J. Nievergelt, N. Pippenger, and H. R. Strong. Extendible hashing - a fast access method for dynamic files. ACM Transactions on Database Systems, 4(3):315-344, 1979.
  • 13. Conversion Process. A user asks for schedules and sends the query. Let me show how to convert “Where = room A”. 1: Hash the literal of the query: hash(“room A”) = 3. 2: Convert the hash value into a binary string: binary(hash(“room A”)) = “0110”. 3: Convert the binary string with the conversion tree. [Figure: Alice's query “What's the schedule at 3:00pm, March 6th in “room A”?” (Date = 0306, Time = 1500, Where = “room A”) goes to the conversion server, which sits in front of the DaaS server]
  • 14. Convert the binary string with the tree. 1: The conversion starts from the root node. 2: Compare the 1st character of the binary string with the edge labels. 3: Compare the next character with the labels of the edges leaving node #2. 4: Repeat step 3 until reaching a leaf node. [Figure: binary string 0110 traced through a tree with a root, an inner node 2, and leaf nodes A, B, and C] Connect the labels from the root to the mapped leaf node: 01. Append a wild-card character *: 01*. This is the converted query.
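
The walk down the tree can be sketched as follows. The nested-dictionary encoding and the exact tree shape (root, then an inner node under edge 0 with leaves A and B, and leaf C under edge 1) are a reading of the flattened figure, so treat them as assumptions rather than the authors' data structure.

```python
def convert(tree, binary_string):
    """Follow edge labels from the root to a leaf and return the path plus "*".

    Inner nodes are dicts keyed by "0" and "1"; leaves are plain strings
    (an assumed encoding).
    """
    node = tree
    prefix = ""
    for bit in binary_string:
        if not isinstance(node, dict):
            break  # reached a leaf; the remaining bits are hidden
        node = node[bit]
        prefix += bit
    if isinstance(node, dict):
        raise ValueError("binary string ended before reaching a leaf")
    return prefix + "*"


# Assumed tree: root -0-> inner node -0-> A, -1-> B; root -1-> C.
tree = {"0": {"0": "A", "1": "B"}, "1": "C"}

converted = convert(tree, "0110")
```

For the slide's binary string 0110 the walk consumes "0" and "1", stops at the leaf, and yields "01*", which matches the converted query shown on the slide.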
  • 15. Conversion Process. A user asks for schedules and sends the query. Let me show how to convert “Where = room A”. 1: Hash the literal of the query: hash(“room A”) = 3. 2: Convert the hash value into a binary string: binary(hash(“room A”)) = “0110”. 3: Convert the binary string with the conversion tree: 01*. 4: Finally, create the new query: match(binary(hash(Where)), “01*”). [Figure: Alice's query “What's the schedule at 3:00pm, March 6th in “room A”?” (Date = 0306, Time = 1500, Where = “room A”) goes to the conversion server, which sends match(binary(hash(Where)), “01*”) to the DaaS server]
  • 16. Summary of the conversion. The original query is “Where = room A”. The final query is match(binary(hash(Where)), “01*”), where * is a wild-card character and match is a function that compares binary strings with queries. Result of the conversion: any query whose binary string starts with “01” is converted to “01*”, so no one can distinguish the original queries. For example, binary(hash(“room A”)) = “0110”, binary(hash(“room X”)) = “0100”, and binary(hash(“abc cafe”)) = “0101” all lead to match(binary(hash(Where)), “01*”). Next, I will explain the method for updating the conversion tree to reduce costs.
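
A minimal sketch of how such a `match` predicate could behave is given below. The slides only name the function, so its signature and the prefix semantics here are assumptions; the lookup table stands in for binary(hash(·)) using the three values from the slide plus one made-up entry ("room Z").

```python
def match(binary_string, pattern):
    """Return True if binary_string fits the pattern.

    A trailing "*" is treated as a wild card for any remaining characters
    (assumed semantics).
    """
    if pattern.endswith("*"):
        return binary_string.startswith(pattern[:-1])
    return binary_string == pattern


# Stand-in for binary(hash(literal)); "room Z" is a hypothetical extra row.
binary_hash = {
    "room A": "0110",
    "room X": "0100",
    "abc cafe": "0101",
    "room Z": "1001",
}

# Literals the DaaS server would return for the converted query "01*".
returned = sorted(name for name, bits in binary_hash.items() if match(bits, "01*"))
```

The server cannot tell which of the three matching literals was meant, and the rows for the other two are the irrelevant data that the user has to discard afterwards.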
  • 17. Updating Conversion Tree. Some irrelevant data are obtained because of the conversion. We define the cost as the number of data items that user u has to obtain when s/he requests a query mapped to the leaf node n. To keep this cost under the given cmax (the max allowable cost), we update the conversion tree.
  • 18. Updating Process (1 of 2). The target node n is chosen in order of frequency. The literals included in the node are divided into 2 sets: the set Ls is divided according to whether the d-th character is 0 or not, where d is the depth of the target node (1-origin). Example (for simplicity, consider only 4 bits): leaf node n has Ls = 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111, which is divided into Ls0 = 1000, 1001, 1010, 1011 and Ls1 = 1100, 1101, 1110, 1111. [Figure: the tree with leaf n:Ls under the root]
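
The split of a leaf's literal set can be sketched as below. The function name is hypothetical, and taking d = 2 for the example assumes that the root has depth 1 and that leaf n hangs directly under it, which is how the slide's Ls0 and Ls1 come out.

```python
def split_literals(literals, d):
    """Divide binary strings by their d-th character (1-origin)."""
    ls0 = [lit for lit in literals if lit[d - 1] == "0"]
    ls1 = [lit for lit in literals if lit[d - 1] != "0"]
    return ls0, ls1


# Literal set of leaf node n from the slide (4-bit strings).
ls = ["1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"]

# Assumed depth of the target node: 2 (root counted as depth 1).
ls0, ls1 = split_literals(ls, 2)
```

Every string in the example already starts with 1, so the first character carries no information and the second character is the one that separates the two halves.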
  • 19. Updating Process (2 of 2). Compute the following for the 2 sets (Ls0 and Ls1) [formula not recoverable from the extraction], where Count(u, l) is how many times user u inquires by literal l and totalu is the number of inquiries of user u. If cost0 or cost1 is greater than cmax (the max allowable cost), delete the node Ls, then add a new node and 2 new leaves. [Figure: leaf n:Ls (1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111) replaced by a new inner node 3 with leaves Ls0 (1000, 1001, 1010, 1011) and Ls1 (1100, 1101, 1110, 1111)] Next, I will talk about the evaluation.
  • 20. Evaluation Experiment. We have selected a dataset used by Löser et al.† This dataset is constructed from the Open Directory. It contains users' groups and queries. †Alexander Löser, Steffen Staab, and Christoph Tempich: “Semantic Methods for P2P Query Routing”, Multiagent System Technologies (MATES 2005)
  • 21. Result. The number of users is 133,602 and the number of groups is 6,280. The precision and recall of the attack are [figure not recoverable from the extraction]. The introduced attack model can extract users' relations with high precision. The higher the precision, the higher the risk.
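
For readers unfamiliar with the two measures, the sketch below computes precision and recall over sets of user pairs. It is only an illustration: the function name, the pair representation, and the toy data are hypothetical, and how the authors derive ground-truth relations from the dataset's groups is not stated in the slides.

```python
def precision_recall(predicted_pairs, true_pairs):
    """Precision and recall of predicted relations against known relations.

    Pairs are unordered, so ("Alice", "Carol") equals ("Carol", "Alice").
    """
    predicted = {frozenset(pair) for pair in predicted_pairs}
    actual = {frozenset(pair) for pair in true_pairs}
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Toy data: the attack guesses two relations, one of which is real.
predicted = [("Alice", "Carol"), ("Alice", "Bob")]
truth = [("Carol", "Alice"), ("Bob", "Carol")]

precision, recall = precision_recall(predicted, truth)
```

Precision is the share of guessed relations that are real, and recall is the share of real relations that the attack found.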
  • 22. Result. How much does the query conversion reduce the precision? 0.8 -> 0.55
  • 23. Result. How much does the query conversion reduce the recall?
  • 24. Summary. 1. We introduce a new problem on DaaS, social information (for example friend or co-worker relations). 2. We introduce an attack model that extracts the social information from the query log. 3. We propose a method for protecting social information from query analysis. [Figure: Alice's query “What's the schedule at 3:00pm, March 6th in “room A”?” passes through a conversion server and reaches the DaaS server as match(binary(hash(Where)), “01*”); the inference drawn from the query log is labeled “They seem to have high relation”]