Data Leakage Detection
Presented By: Miss. Ashwini A. Nerkar
Guided By: Prof. S. R. Sontakke
Computer Science & Engineering Department,
P. R. Patil College of Engg. & Technology, Amravati.
Contents:
 What is Data Leakage?
 How does Data Leakage take place?
 Current System & Its Limitations
 Problem Setup & Notation
 Addition of Fake Objects
 Data Allocation Strategy
 Data Distribution Strategy
 Optimization Problem
 Calculation of Probability
 Conclusion
 References
What is Data Leakage?
 Data leakage is the unauthorized transmission of sensitive data or
information from within an organization to an external destination or
recipient.
 Sensitive data of companies and organizations includes
 intellectual property,
 financial information,
 patient information,
 personal credit card data,
 and other information, depending on the business and the
industry.
How does data leakage take place?
 In the course of doing business, data must sometimes be handed over to
trusted third parties for enhancement or other operations.
 Sometimes these trusted third parties may act as points of data leakage.
 Examples:
 A hospital may give patient records to researchers who will devise
new treatments.
 A company may have partnerships with other companies that require
sharing of customer data.
 An enterprise may outsource its data processing, so data must be
given to various other companies.
 The owner of the data is called the distributor, and the third parties
are called the agents.
 In case of data leakage, the distributor must assess
the likelihood that the leaked data came from one or more
agents, as opposed to having been independently gathered by
other means.
Current System & Its Limitations:
 The technique currently used for data leakage detection is
‘watermarking’.
 A unique code is embedded in each distributed copy. If that copy is
later discovered in the hands of an unauthorized party, the leaker
can be identified.
 Limitations:
 Embedding the code requires modifying the original data, i.e.
altering some attributes of the distributed objects.
 Watermarks can sometimes be destroyed if the recipient is
malicious.
Thus we need a data leakage detection
technique that fulfils the following objective
and abides by the given constraint.
CONSTRAINT:
Satisfy agent requests by providing each agent with the number of objects
it requests, or with all available objects that satisfy its conditions.
Avoid perturbing the original data before handing it to agents.
OBJECTIVE:
To be able to detect an agent who leaks any portion of his data.
Problem Setup And Notation
Entities and Agents:
 A distributor owns a set T = {t1, ..., tm} of valuable data objects. The
distributor wants to share some of the objects with a set of agents U1,
U2, ..., Un, but does not want the objects to be leaked to other third parties.
 The objects in T could be of any type and size, e.g., they could be
tuples in a relation, or relations in a database. An agent Ui receives a
subset Ri of the objects, determined either by
* a sample request (any mi objects from T), or
* an explicit request (all objects in T that satisfy a stated condition).
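The two request types can be sketched in a few lines of Python. This is only an illustration of the notation, not the authors' implementation: the function names `sample_request` and `explicit_request` and the example records are assumptions made for this sketch.

```python
import random

def sample_request(T, m, rng=None):
    """Sample request: return any m objects from T (the agent does not care which)."""
    rng = rng or random.Random()
    # sorted() turns the set into a sequence, which random.sample requires.
    return set(rng.sample(sorted(T), m))

def explicit_request(T, condition):
    """Explicit request: return every object in T that satisfies the condition."""
    return {t for t in T if condition(t)}

# Hypothetical data objects: (customer_id, state) tuples.
T = {(1, "CA"), (2, "NY"), (3, "CA"), (4, "TX")}

R1 = sample_request(T, 2, random.Random(0))       # any two objects
R2 = explicit_request(T, lambda t: t[1] == "CA")  # all customers in CA
```

With a sample request the distributor chooses which objects to hand over; with an explicit request the set is fixed by the agent's condition.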
Fig: Representation of Problem Definition
Fig: The Architectural View of the Model
Addition of Fake Objects:
 The distributor is able to add fake objects in order to improve
the effectiveness in detecting the guilty agent.
 Fake objects are objects generated by the distributor that are
not in the original set.
 Fake objects are designed to appear realistic, and are
distributed among the agents along with the original objects.
 Different fake objects may be added to the data sets of
different agents in order to increase the chances of detecting
agents that leak data.
Data Allocation Strategy:
 The distributor intelligently gives data to agents in order to improve the
chances of detecting a guilty agent.
 There are four instances of this problem, depending on the type of data
requests made by agents and whether “fake objects” are allowed.
Data Distribution Strategy:
 Sample data requests:
 The distributor has the freedom to select which data items to provide to
the agents.
 General idea:
-- Provide agents with sets of data that are as disjoint as possible.
 Explicit data requests:
 The distributor must provide agents with the data they request.
 General idea:
-- Add fake objects to the distributed sets to minimize the overlap between them.
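Both ideas can be sketched as follows. This is a simplified greedy illustration, not the allocation algorithm from the referenced papers. The function names `allocate_samples` and `add_fake_objects` and the `fake-i-k` labels are hypothetical.

```python
def allocate_samples(T, sizes):
    """Sample requests: give each agent the objects held by the fewest agents so far."""
    holders = {t: 0 for t in T}
    allocation = []
    for m in sizes:
        # Least-shared objects first; ties are broken by object order for determinism.
        chosen = sorted(T, key=lambda t: (holders[t], t))[:m]
        allocation.append(set(chosen))
        for t in chosen:
            holders[t] += 1
    return allocation

def add_fake_objects(R_sets, fakes_per_agent):
    """Explicit requests: add distinct (hypothetical) fake objects to each agent's set."""
    result = []
    for i, R in enumerate(R_sets):
        fakes = {f"fake-{i}-{k}" for k in range(fakes_per_agent)}
        result.append(set(R) | fakes)
    return result

T = ["t1", "t2", "t3", "t4"]
samples = allocate_samples(T, [2, 2])  # two agents, two objects each
explicit = add_fake_objects([{"t1", "t2"}, {"t1", "t2"}], 1)
```

In this example the two sample requests receive disjoint sets. The two identical explicit requests become distinguishable because each carries its own fake object.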
Optimization Problem:
 The distributor’s data allocation to agents has one
constraint and one objective.
 The distributor’s constraint is to satisfy agents’
requests.
 His objective is to be able to detect an agent who leaks
any portion of his data by maximizing the guilt
probability difference.
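The slide does not state how the objective is computed. One tractable stand-in, which the Papadimitriou and Garcia-Molina paper listed in the References uses as an approximation, is to minimize the relative overlap between the agents' sets. The sketch below evaluates that quantity for a candidate allocation; the function name `sum_overlap` is an assumption made for illustration.

```python
def sum_overlap(R_sets):
    """Sum over agents of (objects shared with other agents) / (size of own set).

    Lower values mean the allocated sets are more distinct, which should make
    a guilty agent easier to tell apart from the others.
    """
    total = 0.0
    for i, Ri in enumerate(R_sets):
        if not Ri:
            continue  # an empty set contributes no overlap
        shared = sum(len(Ri & Rj) for j, Rj in enumerate(R_sets) if j != i)
        total += shared / len(Ri)
    return total

disjoint = sum_overlap([{1, 2}, {3, 4}])   # no shared objects
partial = sum_overlap([{1, 2}, {2, 3}])    # one shared object
identical = sum_overlap([{1, 2}, {1, 2}])  # everything shared
```

A distributor could compare candidate allocations that all satisfy the agents' requests and prefer the one with the smallest value.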
Calculation Of Probability:
 Every agent's request is evaluated, and the probability that each agent
is guilty is calculated.
 Pr{Gi | S = Ri} is the probability that agent Ui is guilty (event Gi) if
the distributor discovers a leaked table S that contains all the objects
in Ri, the set given to Ui.
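The slide gives no formula for this probability. A minimal sketch, assuming the estimate from the Papadimitriou and Garcia-Molina paper listed in the References, is Pr{Gi | S} = 1 - product over t in S ∩ Ri of (1 - (1 - p) / |Vt|). Here p is the probability that the target obtained an object independently, and Vt is the set of agents that hold t. The function name `guilt_probability` and the example values are assumptions.

```python
def guilt_probability(i, R_sets, S, p):
    """Estimate Pr{G_i | S} for agent i, given the leaked set S.

    Assumes each leaked object was either guessed by the target (probability p)
    or leaked by one of the agents holding it, each equally likely.
    """
    prob_not_leaked_by_i = 1.0
    for t in S & R_sets[i]:
        holders = sum(1 for R in R_sets if t in R)  # |V_t|
        prob_not_leaked_by_i *= 1.0 - (1.0 - p) / holders
    return 1.0 - prob_not_leaked_by_i

# Hypothetical allocation: agent 0 holds t1 and t2, agent 1 holds only t1.
R_sets = [{"t1", "t2"}, {"t1"}]
S = {"t1", "t2"}  # leaked set found by the distributor

g0 = guilt_probability(0, R_sets, S, p=0.5)
g1 = guilt_probability(1, R_sets, S, p=0.5)
```

Agent 0 holds both leaked objects, including one that no other agent has, so its estimated guilt is higher than that of agent 1.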
Conclusion:
 In an ideal scenario there would be no need to hand over sensitive data to
agents who may unknowingly or maliciously leak it.
 However, in many cases, we must indeed work with agents that may
not be 100 percent trusted, and we may not be certain whether a leaked
object came from an agent or from some other source.
 In spite of these difficulties, it is possible to assess the likelihood that
an agent is responsible for a leak, based on the overlap of his data
with the leaked data.
 A variety of data distribution strategies can improve the
distributor’s chances of identifying a leaker.
References:
 P. Papadimitriou and H. Garcia-Molina, “Data Leakage Detection,” IEEE
Transactions on Knowledge and Data Engineering, vol. 23, pp. 51-63, 2011.
 Anusha Koneru, G. Siva Nageswara Rao, and J. Venkata Rao, “Data Leakage
Detection Using Encrypted Fake Objects,” International Journal of P2P Network
Trends and Technology, vol. 3, issue 2, p. 104, 2013, ISSN 2249-2615.
http://www.internationaljournalssrg.org
 Sandip A. Kale and S. V. Kulkarni, “Data Leakage Detection,” International
Journal of Advanced Research in Computer and Communication Engineering,
vol. 1, issue 9, November 2012.
 Srikanth Yadav, Y. Eswara Rao, V. Shanmukha Rao, and R. Vasantha, “Data
Allocation Strategies for Detecting Data Leakage,” International Journal of
Computer Trends and Technology, vol. 3, issue 1, 2012, ISSN 2231-2803.
http://www.internationaljournalssrg.org