Presentation cedem luca
 

    Presentation Transcript

    • Open Access and Database Anonymization: an Open Source Procedure Based on an Italian Case Study. Danube University Krems, 21-23 May 2014. L. Leschiutta, G. Futia
    • Introduction (1) (22nd May 2014, Giuseppe Futia – Politecnico di Torino)
      • The principal way to openly share a database is to remove all data that could lead to the identification of the subjects involved (i.e. database anonymization);
      • we describe a procedure for processing and anonymizing a collection of data that includes personal, sensitive and judicial data;
      • the procedure is general purpose and is implemented relying solely on common open-source software applications.
    • Introduction (2)
      • Our study is based on a real case in which a database of car accident data (TWIST), consisting of 352 data fields, needs to be made openly accessible;
      • this work was developed in the framework of the Open-DAI project. Open-DAI is “Opening Data Architectures and Infrastructures” for European Public Administrations. It is a project funded under the ICT Policy Support Programme as part of the Competitiveness and Innovation framework Programme (CIP) Call 2011.
    • Non Anonymous Data
      Columns: ID1 | NID1 | ID2 | ID3 | NID2 | ID4 | NID3 | NID4
      Rows: Item 1, Item 2, Item N
    • Ordered Non Anonymous Data
      Columns: ID1 | ID2 | ID3 | ID4 | NID1 | NID2 | NID3 | NID4
      Rows: Item 1, Item 2, Item N
    • Ordered Non Anonymous Data including Anonymous IDs
      Columns: ID1 | ID2 | ID3 | ID4 | AID | NID1 | NID2 | NID3 | NID4
      Item 1: AID 1053
      Item 2: AID 1001
      (unlabelled row): AID 1057
      Item N: AID 1133
    • Anonymous Data
      Columns: AID | NID1 | NID2 | NID3 | NID4
      AID values: 1053, 1001, 1057, 1133
    • Random AIDs generation
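The step from the ordered table to the anonymous table can be sketched in Python. This is only an illustration, not the authors' spreadsheet procedure: the function name `anonymize`, the dictionary-per-row layout, and the AID range starting at 1000 are all assumptions, as is the reading of ID columns as identifying fields and NID columns as non-identifying ones.

```python
import random


def anonymize(rows, id_columns):
    """Return new rows where the ID columns are replaced by a random AID.

    rows: list of dicts, one per item (hypothetical layout).
    id_columns: set of column names treated as identifying.
    """
    # SystemRandom draws from the operating system's entropy source,
    # so the AIDs cannot be reproduced from a known seed.
    rng = random.SystemRandom()
    # Sample without replacement so that every row gets a distinct AID.
    # The range (1000 upward, ten times the row count) is an arbitrary choice.
    aids = rng.sample(range(1000, 1000 + 10 * len(rows)), len(rows))
    anonymous = []
    for row, aid in zip(rows, aids):
        kept = {name: value for name, value in row.items()
                if name not in id_columns}
        anonymous.append({"AID": aid, **kept})
    return anonymous


if __name__ == "__main__":
    table = [
        {"ID1": "Rossi", "NID1": "wet road"},
        {"ID1": "Bianchi", "NID1": "dry road"},
    ]
    for row in anonymize(table, {"ID1"}):
        print(row)
```

Because the AIDs are drawn at random, they carry no information about the original identifiers or about the order of the rows.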
    • Advanced techniques: repeating IDs
      IF(ISNA(VLOOKUP(C4;C$1:C3;1; ));AID.A8;VLOOKUP(C4;C$1:F3;4; ))
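The spreadsheet formula on this slide appears to check whether the ID in the current row already occurred in an earlier row: if not, a fresh AID is taken, otherwise the AID assigned earlier is reused. A minimal Python sketch of that idea follows. The function name `assign_repeating_aids` and the default AID range are hypothetical, and the sketch is an interpretation of the formula rather than a transcription of it.

```python
import random


def assign_repeating_aids(ids, low=1000, high=10000):
    """Map each ID to a random AID, reusing the AID when an ID repeats."""
    rng = random.SystemRandom()
    aid_by_id = {}   # ID -> AID already assigned
    used = set()     # AIDs already handed out
    result = []
    for identifier in ids:
        if identifier not in aid_by_id:
            if len(used) >= high - low:
                raise ValueError("AID range exhausted")
            # Redraw until the candidate AID is not already in use.
            aid = rng.randrange(low, high)
            while aid in used:
                aid = rng.randrange(low, high)
            used.add(aid)
            aid_by_id[identifier] = aid
        result.append(aid_by_id[identifier])
    return result


if __name__ == "__main__":
    # "A" and "B" each appear twice and keep the same AID both times.
    print(assign_repeating_aids(["A", "B", "A", "C", "B"]))
```

Keeping the mapping in a dictionary avoids rescanning all earlier rows for every new row, which is what the lookup in the formula has to do.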
    • Non Unique IDs In Multiple Cells (1)
      Columns: ID1 | ID2 | ID3 | ID4 | NID1 | NID2 | NID3 | NID4
      Rows: Item 1, Item 2, Item N (sample cells filled with "Lorem ipsum")
    • Non Unique IDs In Multiple Cells (2)
      // Look for an earlier cell holding the same ID as cell [n][m].
      flag = false;
      for (i = 0; i < n; i++) {
        for (j = 0; j < m; j++) {
          if (ID_Matrix[i][j] == ID_Matrix[n][m]) {
            // Same ID seen before: reuse its anonymous ID.
            AID_Matrix[n][m] = AID_Matrix[i][j];
            flag = true;
            break;
          }
        }
      }
      // No match found: assign the next unused anonymous ID.
      if (flag == false) {
        AID_Matrix[n][m] = Next_Availabe_AID(k);
        k++;
      }
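The pseudocode on this slide can be approximated in Python with a dictionary in place of the nested scan. This is a sketch under assumptions: `assign_matrix_aids` is a hypothetical name, empty cells are assumed to be represented as `None`, and AIDs are handed out sequentially to mirror the `Next_Availabe_AID(k); k++` step.

```python
def assign_matrix_aids(id_matrix, first_aid=1):
    """Build an AID matrix in which equal IDs share one AID across all cells."""
    aid_by_id = {}        # ID -> AID already assigned
    next_aid = first_aid  # plays the role of the counter k in the pseudocode
    aid_matrix = []
    for row in id_matrix:
        aid_row = []
        for identifier in row:
            if identifier is None:
                # Empty cell: nothing to anonymize.
                aid_row.append(None)
                continue
            if identifier not in aid_by_id:
                aid_by_id[identifier] = next_aid
                next_aid += 1
            aid_row.append(aid_by_id[identifier])
        aid_matrix.append(aid_row)
    return aid_matrix


if __name__ == "__main__":
    ids = [["A", "B"], ["B", None], ["C", "A"]]
    print(assign_matrix_aids(ids))  # [[1, 2], [2, None], [3, 1]]
```

Sequential AIDs reveal the order in which IDs were first encountered, so in practice they would probably be combined with randomly generated AIDs.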
    • Data Wiping
      • To perform this operation on Windows, you can use the open source program Eraser (http://eraser.heidi.ie);
      • on Linux, you can use the following commands:
        > shred NonAnonymousData.csv
        > rm NonAnonymousData.csv
    • Encrypt the file
      • On Windows this can be achieved with the open source 7zip program (http://www.7-zip.org/), which provides strong AES-256 encryption.
      • On Linux you can use the following command:
        > gpg -c NonAnonymousData.csv
      The encrypted file must then be backed up to a safe location, e.g. a non-rewritable DVD or a WORM (Write Once Read Many) tape.
    • Data Degradation (location)
    • Data Degradation (location)
    • Data Degradation (time)
      • 10 November 2011 at 10:25
      • 10 November 2011 between 10 and 11
      • Winter 2011
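Degrading a timestamp means replacing it with a coarser description of the same moment. The Python sketch below reproduces the first degradation step shown on this slide (exact time to a one-hour window). The function names are hypothetical, and the month-level function is an assumed intermediate level that the slide does not show. The slide's coarsest level (a season) would also need a season mapping chosen by the publisher, which is not attempted here.

```python
from datetime import datetime

# Explicit month names keep the output independent of the system locale.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def degrade_to_hour(ts):
    """Replace the exact time with the one-hour window that contains it."""
    return (f"{ts.day} {MONTHS[ts.month - 1]} {ts.year} "
            f"between {ts.hour} and {ts.hour + 1}")


def degrade_to_month(ts):
    """Keep only month and year (an assumed level, not shown on the slide)."""
    return f"{MONTHS[ts.month - 1]} {ts.year}"


if __name__ == "__main__":
    accident_time = datetime(2011, 11, 10, 10, 25)
    print(degrade_to_hour(accident_time))   # 10 November 2011 between 10 and 11
    print(degrade_to_month(accident_time))  # November 2011
```

The coarser the level, the more records share the same value, which makes it harder to single out one accident from its time alone.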
    • Conclusions: de-anonymization test
      • How to test if a database is anonymous enough?
      • Reasonable efforts: “the means possibly required to effect identification are to be considered disproportionate compared with the (risk of) damage resulting”
      • de-anonymization test
    • Thank you
      Luca Leschiutta (luca.leschiutta@polito.it)
      Giuseppe Futia (giuseppe.futia@polito.it)
      Nexa Center for Internet & Society (http://nexa.polito.it)
      Dept. of Computer and Control Engineering (DAUIN), Politecnico di Torino, Italy