Computation systems for protecting delimited data


G Prachi


Computational Systems for
Protecting Delimited Data
Unit 5

Table of contents
• The Goal
• What is delimited data?
• Various computational systems for protecting delimited data
  • MinGen
  • Datafly System
  • μ-Argus System
  • k-Similar Algorithm
  • Scrub System
• References


The Goal
Explore computational techniques to release useful information in such a way that the identity of any individual or entity contained in the data cannot be recognized, while the data remain practically useful.

What is delimited data?
• Data in which fields are separated by a delimiter such as a comma (,) or a tab.
• Commonly used for hospital records, office records, etc.
• e.g.
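As an illustration, the sketch below parses a small comma-delimited sample with Python's standard csv module. The field names and values are invented for this example and do not come from the slides; passing delimiter="\t" handles tab-delimited data the same way.

```python
import csv
import io

# Hypothetical comma-delimited records (invented for illustration only).
SAMPLE = """zip,birth_year,sex,diagnosis
02138,1965,F,asthma
02139,1971,M,diabetes
"""

def parse_delimited(text, delimiter=","):
    """Return each data row as a dict keyed by the header fields."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)

rows = parse_delimited(SAMPLE)
print(rows[0]["zip"], rows[0]["diagnosis"])
```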

Datafly System
• Maintains anonymity in released data by automatically substituting, generalizing and suppressing information as appropriate.
• Decisions are made at the attribute and tuple level at the time of database access.
• Role-based approach.
• The end result is a subset of the original database that provides minimal linking and matching of data, because each tuple matches as many people as the data holder specifies.

Datafly System
• The user sets the anonymity level.
• The Datafly System iteratively computes increasingly less specific versions of the values for an attribute until the desired anonymity level is attained.
• The iterative process ends when each combination of values across the chosen group of attributes is shared by at least k tuples.
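The iterative generalization described above can be sketched roughly as follows. This is a simplified illustration, not the Datafly System's actual implementation: the function names, the choice to generalize the attribute with the most distinct values, the rule of stopping once at most k tuples remain in undersized groups, and the character-masking generalization are all assumptions made for the sketch.

```python
from collections import Counter

def mask_last_char(value):
    """Hypothetical generalization step: 02138 -> 0213* -> 021** ..."""
    kept = value.rstrip("*")
    if not kept:
        return value  # already fully generalized
    return kept[:-1] + "*" * (len(value) - len(kept) + 1)

def datafly_sketch(rows, quasi_identifier, k):
    """Generalize until few tuples are in groups smaller than k, then drop them."""
    rows = [dict(r) for r in rows]  # work on copies, leave the input untouched

    def key(row):
        return tuple(row[a] for a in quasi_identifier)

    while True:
        counts = Counter(key(r) for r in rows)
        outliers = sum(n for n in counts.values() if n < k)
        if outliers <= k:  # assumed stopping rule for this sketch
            break
        # Assumed heuristic: generalize the attribute with the most distinct values.
        attr = max(quasi_identifier, key=lambda a: len({r[a] for r in rows}))
        changed = False
        for r in rows:
            new_value = mask_last_char(r[attr])
            changed = changed or new_value != r[attr]
            r[attr] = new_value
        if not changed:  # nothing left to generalize
            break
    # Suppress (remove) the tuples that still fall in groups smaller than k.
    return [r for r in rows if counts[key(r)] >= k]

records = [
    {"zip": "02138", "year": "1965"},
    {"zip": "02139", "year": "1965"},
    {"zip": "02141", "year": "1971"},
    {"zip": "02142", "year": "1971"},
]
released = datafly_sketch(records, ["zip", "year"], k=2)
print([r["zip"] for r in released])
```

With k = 2, one round of generalizing the ZIP code is enough here: every remaining combination of ZIP and year is then shared by two tuples.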

Datafly System
• Output table: attributes and tuples correspond to the anonymity level specified by the data holder.
• Anonymity level = 0.7.

μ-Argus System
• Provides protection by enforcing a k requirement on the values found in a quasi-identifier.
• The data holder:
  • provides a value of k
  • specifies which attributes are sensitive by assigning each attribute a value between 0 and 3, denoting "not identifying," "most identifying," "more identifying," and "identifying," respectively.
• The program identifies rare, and therefore unsafe, combinations by testing some 2- or 3-combinations of attributes declared to be sensitive.
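A minimal sketch of this combination test is shown below. It is an illustration under assumptions, not μ-Argus itself: the function name and sample data are hypothetical, and for simplicity it tests every 2- and 3-combination of the flagged attributes, whereas the slides say only some combinations are tested.

```python
from collections import Counter
from itertools import combinations

def unsafe_combinations(rows, identifying_attrs, k, sizes=(2, 3)):
    """Return (attributes, values, count) for value combinations seen fewer than k times."""
    unsafe = []
    for size in sizes:
        for attrs in combinations(identifying_attrs, size):
            counts = Counter(tuple(r[a] for a in attrs) for r in rows)
            for values, n in counts.items():
                if n < k:
                    unsafe.append((attrs, values, n))
    return unsafe

people = [
    {"zip": "02138", "sex": "F", "year": "1965"},
    {"zip": "02138", "sex": "F", "year": "1965"},
    {"zip": "02138", "sex": "M", "year": "1965"},
]
for attrs, values, n in unsafe_combinations(people, ["zip", "sex", "year"], k=2):
    print(attrs, values, n)
```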

μ-Argus System
• Unsafe combinations are eliminated by generalizing attributes within the combination and by local cell suppression.
• Rather than removing entire tuples when one or more attributes contain outlier information, as is done in the Datafly System, the μ-Argus System simply suppresses or blanks out the outlier values at the cell level.
• The resulting data typically contain all the tuples and attributes of the original data, though values may be missing in some cell locations.
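Cell-level suppression, as opposed to dropping whole tuples, might look like the sketch below. The function name, the blank marker, and the rule for choosing which cell to blank (a single target attribute named by the caller) are assumptions for illustration; the slides do not specify how μ-Argus picks the cell, and a single pass like this does not by itself guarantee the k requirement afterwards.

```python
from collections import Counter

def suppress_cells(rows, attrs, k, target, blank=""):
    """Blank the target cell in rows whose value combination over attrs is rare."""
    counts = Counter(tuple(r[a] for a in attrs) for r in rows)
    released = []
    for r in rows:
        r = dict(r)  # copy so the original row is not modified
        if counts[tuple(r[a] for a in attrs)] < k:
            r[target] = blank  # suppress one cell, keep the rest of the tuple
        released.append(r)
    return released

patients = [
    {"zip": "02138", "sex": "F"},
    {"zip": "02138", "sex": "F"},
    {"zip": "02138", "sex": "M"},
]
out = suppress_cells(patients, ["zip", "sex"], k=2, target="sex")
print(out)
```

All three tuples are still present in the output; only the rare value in the third row is blanked.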

μ-Argus System
Combinations of More, Most, Identifying tested by μ-Argus

k-Similar Algorithm
• There do not exist fewer than k tuples in the released data having the same values across the quasi-identifier.
• Based on the correctness of the k-similar clustering algorithm, k-map protection is avoided.
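The k requirement in the first bullet can be checked directly. The sketch below is not the k-Similar algorithm; it only verifies whether a release already satisfies the requirement, and the function name and sample rows are hypothetical.

```python
from collections import Counter

def satisfies_k(rows, quasi_identifier, k):
    """True if every quasi-identifier value combination is shared by at least k tuples."""
    counts = Counter(tuple(r[a] for a in quasi_identifier) for r in rows)
    return all(n >= k for n in counts.values())

release = [
    {"zip": "0213*", "year": "1965"},
    {"zip": "0213*", "year": "1965"},
    {"zip": "0214*", "year": "1971"},
    {"zip": "0214*", "year": "1971"},
]
print(satisfies_k(release, ["zip", "year"], k=2))
print(satisfies_k(release, ["zip", "year"], k=3))
```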

Scrub System
• Provides a methodology for removing personally identifying information in medical writings so that:
  • the integrity of the information remains intact
  • the identity of the person remains confidential
• This process is called scrubbing.
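A toy version of scrubbing can be sketched with regular expressions. This is a stand-in technique chosen for brevity, not the Scrub System's own detection method; the patterns, placeholder labels, and the list of known names are all assumptions made for the example.

```python
import re

# Hypothetical patterns; real clinical text needs far broader coverage.
PATTERNS = [
    (re.compile(r"\b\d{3}-\d{3}-\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"), "[DATE]"),
]

def scrub(text, known_names=()):
    """Replace phone numbers, dates, and listed names with placeholders."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    for name in known_names:
        text = re.sub(r"\b" + re.escape(name) + r"\b", "[NAME]", text)
    return text

note = "Jane Doe was seen on 3/14/1999; call 617-555-0100."
print(scrub(note, known_names=["Jane Doe"]))
```

The clinical content of the sentence is left in place while the identifying details are replaced, which mirrors the two goals listed above.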

References
• Sweeney, Latanya. "Foundations of privacy protection from a computer
science perspective." In Proceedings, Joint Statistical Meeting, AAAS,
Indianapolis, IN. 2000.


Computational systems for maintaining privacy when disclosing person-specific information
• MinGen: uses generalization and suppression as disclosure limitation techniques
• Datafly System: generalizes values based on a profile of the data recipient at the time of disclosure
• μ-Argus System: a somewhat similar system that is becoming a European standard for disclosing public use data
• k-Similar algorithm: finds optimal results such that the data are minimally distorted yet adequately protected
• Scrub System: locates and suppresses or replaces personally identifying information in letters, notes and other textual documents


Editor's Notes

Generalizes values within attributes as needed, and removes extreme outlier information from the released data.
