A SEMINAR ON
DATA LEAKAGE DETECTION
PRESENTED BY:
Sankhadip Kundu (14501212023)
Saurabh Hazra (14501212020)
Shubham Seal (14501212009)
INTRODUCTION
 Data leakage is the accidental or unintentional distribution of private
or sensitive data to an unauthorized entity.
 Data leakage poses a serious issue for companies as the number of
incidents and the cost to those experiencing them continue to increase.
 Data leakage is made easier by the fact that transmitted data, including
emails, instant messages, website forms, and file transfers, is largely
unregulated and unmonitored on its way to its destination.
OBJECTIVE
 A data distributor has given sensitive data to a set of
supposedly trusted agents (third parties).
 Some of the data is leaked and found in an unauthorized place
(e.g., on the web or somebody’s laptop).
 The distributor must assess the likelihood that the leaked data
came from one or more agents, as opposed to having been
independently gathered by other means.
 We propose data allocation strategies (across the agents) that
improve the probability of identifying leakages.
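The likelihood assessment described above can be illustrated with a minimal sketch. It assumes a simple model that these slides do not spell out: each leaked object was either gathered independently with probability p, or leaked by one of the agents holding it, each holder being equally likely. The function name, the parameter p, and the sample data are hypothetical.

```python
from math import prod


def guilt_probabilities(leaked, allocations, p=0.2):
    """Estimate, per agent, the probability that the agent contributed to a leak.

    leaked      -- set of objects found in the unauthorized place
    allocations -- dict mapping agent name -> set of objects given to that agent
    p           -- ASSUMED probability that a leaked object was gathered
                   independently rather than leaked by an agent
    """
    result = {}
    for agent, given in allocations.items():
        factors = []
        for obj in leaked & given:
            # Number of agents that received this object.
            holders = sum(1 for objs in allocations.values() if obj in objs)
            # Probability that this agent did NOT leak this particular object.
            factors.append(1 - (1 - p) / holders)
        # The agent is innocent only if it leaked none of the overlapping objects.
        result[agent] = 1 - prod(factors)
    return result


if __name__ == "__main__":
    allocations = {"A": {1, 2, 3}, "B": {3, 4}, "C": {5}}
    for agent, guilt in guilt_probabilities({1, 2, 3}, allocations).items():
        print(agent, round(guilt, 3))
```

Under this assumed model, an agent whose data overlaps heavily with the leaked set, and who was the only holder of some leaked objects, receives a higher estimate than an agent who shares a single object with others.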
EXISTING SYSTEM
 Traditionally, leakage detection is handled by watermarking,
e.g., a unique code is embedded in each distributed copy.
 If that copy is later discovered in the hands of an unauthorized
party, the leaker can be identified.
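As a rough illustration of the watermarking idea, the sketch below embeds a unique per-agent code in every record of a distributed copy and later looks that code up in a discovered copy. The key, the field name `wm`, and the function names are hypothetical; real watermarking schemes hide the code far less visibly.

```python
import hashlib
import hmac

SECRET_KEY = b"distributor-secret"  # hypothetical key known only to the distributor


def agent_code(agent_id):
    """Derive a short code that is unique to an agent and hard to forge."""
    digest = hmac.new(SECRET_KEY, agent_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:16]


def watermark_copy(records, agent_id):
    """Return a copy of the records with the agent's code embedded in each one."""
    code = agent_code(agent_id)
    return [dict(record, wm=code) for record in records]


def identify_leaker(found_records, agent_ids):
    """Return the agent whose code appears in the discovered copy, or None."""
    codes = {agent_code(agent): agent for agent in agent_ids}
    for record in found_records:
        agent = codes.get(record.get("wm"))
        if agent is not None:
            return agent
    return None


if __name__ == "__main__":
    data = [{"id": 1, "name": "x"}, {"id": 2, "name": "y"}]
    copy_for_agent = watermark_copy(data, "agent-1")
    print(identify_leaker(copy_for_agent, ["agent-1", "agent-2"]))
```

Note that the copy handed to the agent differs from the original data, and that a recipient who strips the extra field leaves nothing to trace.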
Disadvantages of Existing Systems
 Watermarks can be very useful in some cases, but they require some
modification of the original data.
 Furthermore, watermarks can sometimes be destroyed if the data
recipient is malicious.
 In the course of doing business, sensitive data must sometimes be
handed over to third parties. For example, a hospital may give patient
records to researchers who will devise new treatments.
 Similarly, a company may have partnerships with other
companies that require sharing customer data.
 Another enterprise may outsource its data processing, so data
must be given to various other companies.
 We call the owner of the data the distributor and the supposedly
trusted third parties the agents.
PROPOSED SYSTEM
 Our goal is to detect when the distributor's sensitive data has
been leaked by agents, and if possible to identify the agent that
leaked the data.
 Perturbation is a very useful technique in which the data is
modified and made "less sensitive" before being handed to agents,
but in some cases the original data must not be altered.
 We therefore develop unobtrusive techniques for detecting leakage
of a set of objects or records.
 We develop a model for assessing the "guilt" of agents.
 We also present algorithms for distributing objects to agents, in
a way that improves our chances of identifying a leaker.
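To give a feel for how a distribution strategy can help, here is a minimal sketch of one possible heuristic, not necessarily the algorithm meant in these slides: when an agent asks for a number of objects, hand out the objects currently held by the fewest agents, so that the agents' sets overlap as little as possible. The function name and the request format are hypothetical.

```python
def allocate_min_overlap(objects, requests):
    """Greedily allocate objects so that agents' sets overlap as little as possible.

    objects  -- list of distributable objects
    requests -- dict mapping agent name -> number of objects requested
    """
    holders = {obj: 0 for obj in objects}  # how many agents hold each object
    allocation = {}
    for agent, count in requests.items():
        if count > len(objects):
            raise ValueError(f"{agent} requested more objects than exist")
        # Prefer the objects given out least often so far (stable sort keeps
        # the original order among ties).
        chosen = sorted(objects, key=lambda obj: holders[obj])[:count]
        for obj in chosen:
            holders[obj] += 1
        allocation[agent] = set(chosen)
    return allocation


if __name__ == "__main__":
    print(allocate_min_overlap([1, 2, 3, 4, 5, 6], {"A": 4, "B": 4}))
```

The less two agents' sets overlap, the more clearly a leaked set points to one of them rather than the other.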
Types of employees that put our company at risk
 The security illiterate
 The unlawful residents
 The malicious/disgruntled employees
IMPACT ON ORGANIZATIONS
 Financial & reputational loss
 Small leaks accumulate into large losses
 Loss of customer & employee private information
 Loss of competitive position
 Lawsuits or regulatory consequences
MODULES
Admin Module
 The administrator has to log on to the system.
 Admin can add/view/delete/edit the user details.
User Module
 A user must log in to use the services.
 A user can accept/reject data sharing requests from other users.
DATA LOSS PREVENTION
 To protect against confidential data theft and loss, a multi-layered security
foundation is needed:
 Control/limit access to the data: firewalls, remote access controls, network
access controls, physical security controls
 Secure information from threats: protect perimeter and endpoints from
malware, botnets, viruses, DoS, etc. with security technology
 Control use of sensitive data once access is granted: policy-based content
inspection, acceptable use, encryption
 Cisco’s solution for Data Loss Prevention:
 Build a secure foundation with a Self-Defending Network
 Integrate DLP controls into security devices to protect data and increase
visibility
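Policy-based content inspection, mentioned above, can be pictured with a very small sketch: outbound text is checked against a set of named patterns, and anything that matches is blocked. The policy names and patterns here are hypothetical examples and say nothing about how any particular DLP product works.

```python
import re

# Hypothetical policy: each rule name maps to a pattern for sensitive content.
POLICY = {
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "confidential_marker": re.compile(r"\bconfidential\b", re.IGNORECASE),
}


def inspect(message):
    """Return the sorted names of all policy rules that the message violates."""
    return sorted(name for name, pattern in POLICY.items() if pattern.search(message))


def allow_outbound(message):
    """Allow the message to leave only if no policy rule matches."""
    return not inspect(message)


if __name__ == "__main__":
    for text in ["Lunch at noon?", "Employee ID 123-45-6789, CONFIDENTIAL"]:
        print(repr(text), "allowed" if allow_outbound(text) else inspect(text))
```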
CONCLUSION
 In a perfect world there would be no need to hand over sensitive data to
agents who may unknowingly or maliciously leak it.
 Although leakers can be identified using the traditional technique of
watermarking, certain data cannot admit watermarks.
 In spite of these difficulties, it is possible to assess the likelihood that
an agent is responsible for a leak, based on the overlap of the agent's data
with the leaked data.