Data leakage detection

ABSTRACT
A data distributor has given sensitive data to a
set of supposedly trusted agents.
Sometimes data is leaked and found in an
unauthorized place, e.g., on the web or on
somebody's laptop.
Data leakage happens every day when
confidential business information is leaked out.
Once leaked, the information is outside the
corporation's control and leaves the company
unprotected.

Motivation
In the past few years, there has been a
sharp increase in data leakage from many
organizations.
According to the 2006 FBI Computer
Crime and Security Survey, data leakage is
the greatest source of financial loss for
organizations.
These issues motivated me to choose this
project.

Objective
The objective of this project is to
improve the probability of identifying
leakages by using data allocation strategies
across the agents, and to identify the
guilty party who leaked the data by
injecting “realistic but fake” data records.

Problem Statement
In the course of doing business, sometimes
sensitive data must be given to trusted third
parties. Some of the data is leaked and found in
an unauthorized place.
The distributor cannot blame an agent without
evidence. This project aims to identify the agent
who leaked the data and to back that up with
enough evidence.

Limitations of current system
The current approach can detect the hackers,
but it yields little evidence. The organization
may therefore be unable to pursue legal
proceedings, and the chances that the hackers
escape are high.

Proposed system addresses the following issues
1. An algorithm distributes the objects
to agents in a way that improves the chances of
identifying a leaker.
2. Realistic but fake objects are injected into
the distributed set.
3. Leakers cannot argue that they did not
leak the confidential data, because this
system traces leakers with a good amount
of evidence.

Block diagram
[Block diagram. Components: Distributor, Agent, Database.
Labeled steps: login/registration; explicit data request;
request data; view data to transfer to the agents; add the
fake objects to the original data; transfer data to agents;
E-Random (algorithm); E-Optimal (algorithm); leaks the data;
find the guilty agents; probability distribution of data
leaked by guilty agents.]

Modules
1. Data allocation module
2. Fake object module
3. Optimization module
4. Data distributor module

Data Allocation Module:
The main focus of our project is the data
allocation problem: how can the
distributor “intelligently” give data to
agents in order to improve the chances of
detecting a guilty agent?
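
As an illustration only, the sketch below shows one simple way a
distributor could spread objects when agents ask for a number of
objects rather than for specific ones: each agent receives the objects
that have been handed out least often so far, so the agents' sets
overlap as little as possible. The function name
`allocate_least_shared` and the `sample_sizes` input are assumptions
made for this sketch; they are not taken from the project itself.

```python
def allocate_least_shared(objects, sample_sizes):
    """Give each agent the objects handed out least often so far.

    objects      -- list of object identifiers held by the distributor
    sample_sizes -- dict mapping agent name -> number of objects requested
    Hypothetical helper for illustration, not the project's algorithm.
    """
    times_given = {obj: 0 for obj in objects}
    allocation = {}
    for agent, size in sample_sizes.items():
        if size > len(objects):
            raise ValueError(f"{agent} requested more objects than exist")
        # sorted() is stable, so ties keep the original object order.
        chosen = sorted(objects, key=lambda obj: times_given[obj])[:size]
        allocation[agent] = set(chosen)
        for obj in chosen:
            times_given[obj] += 1
    return allocation


if __name__ == "__main__":
    result = allocate_least_shared(["t1", "t2", "t3", "t4"], {"A": 2, "B": 2})
    for agent, objs in result.items():
        print(agent, sorted(objs))
```

With four objects and two agents requesting two each, the two agents
receive disjoint sets, so any leaked object points to a single agent.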

Fake Object Module:
Fake objects are objects generated by the
distributor in order to increase the
chances of detecting agents that leak data.
The distributor may be able to add fake
objects to the distributed data in order to
improve his effectiveness in detecting
guilty agents. Our use of fake objects is
inspired by the use of “trace” records in
mailing lists.
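
The idea can be sketched as follows, in the spirit of trace records in
a mailing list. This is a minimal sketch under assumptions: records are
plain strings (for example e-mail addresses), the distributor keeps a
private registry of which agent received which fake record, and the
names `add_fake_objects` and `agents_implicated` are hypothetical.

```python
def add_fake_objects(real_records, fake_records, agent_id, registry):
    """Return the data set for one agent with fake records appended.

    registry is a dict kept privately by the distributor; it maps each
    fake record to the single agent that received it.
    """
    for fake in fake_records:
        if fake in registry:
            raise ValueError(f"fake record already issued: {fake}")
        registry[fake] = agent_id
    # A real system would also mix the fakes in among the real records
    # so that their position does not give them away.
    return list(real_records) + list(fake_records)


def agents_implicated(leaked_records, registry):
    """Return the agents whose fake records appear in the leaked data."""
    return {registry[rec] for rec in leaked_records if rec in registry}


if __name__ == "__main__":
    registry = {}
    real = ["alice@example.com", "bob@example.com"]
    set_a = add_fake_objects(real, ["trace.a@example.com"], "agent_a", registry)
    set_b = add_fake_objects(real, ["trace.b@example.com"], "agent_b", registry)
    print(agents_implicated(set_b, registry))
```

Because each fake record is given to only one agent, finding it in a
leaked set is evidence that is hard to explain by independent
gathering.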

Optimization Module:
In the Optimization Module, the
distributor’s data allocation to agents has
one constraint and one objective. The
distributor’s constraint is to satisfy
agents’ requests, by providing them with
the number of objects they request or
with all available objects that satisfy their
conditions. The objective is to be able to
detect an agent who leaks any portion of
the data.
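
One way to picture the trade-off: when agents make explicit requests,
the constraint fixes which real objects each agent gets, so the only
freedom left is who receives fake objects. The sketch below shows one
plausible greedy rule, assuming the objective is to reduce how much of
each agent's set is shared with other agents. The slides do not spell
out the E-Optimal algorithm, so this is an illustration and not the
project's implementation; `pick_agent_for_fake` and its parameters are
hypothetical names.

```python
def pick_agent_for_fake(requests, fakes_given):
    """Pick the agent for whom one more fake object helps the most.

    requests    -- dict: agent -> set of real objects the agent must get
    fakes_given -- dict: agent -> number of fake objects already added
    """
    def gain(agent):
        size = len(requests[agent]) + fakes_given.get(agent, 0)
        if size == 0:
            return 0.0  # an empty set shares nothing with anyone
        shared = sum(
            len(requests[agent] & other_set)
            for other, other_set in requests.items()
            if other != agent
        )
        # Drop in (shared objects / set size) if the set grows by one fake.
        return shared * (1.0 / size - 1.0 / (size + 1))

    # max() returns the first agent in dict order when gains are tied.
    return max(requests, key=gain)


if __name__ == "__main__":
    requests = {"A": {"t1", "t2"}, "B": {"t1"}}
    print(pick_agent_for_fake(requests, {}))
```

In the example, agent B holds only the shared object t1, so giving B a
fake object changes its set the most and B is chosen. A random choice
among agents would satisfy the same constraint without using this
information.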

Data Distributor:
A data distributor has given sensitive data
to a set of supposedly trusted agents
(third parties). Some of the data is leaked
and found in an unauthorized place (e.g.,
on the web or somebody’s laptop). The
distributor must assess the likelihood that
the leaked data came from one or more
agents, as opposed to having been
independently gathered by other means.
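
A minimal sketch of such an assessment is given below. It assumes a
simplified guilt model in the spirit of the Papadimitriou and
Garcia-Molina paper listed in the references: each leaked object was
either gathered independently with probability `p_guess`, or leaked by
exactly one of the agents holding it, each equally likely, with objects
treated independently. The function name and the value of `p_guess`
are assumptions of this sketch.

```python
def guilt_probability(agent, leaked, holdings, p_guess):
    """Estimate the probability that `agent` leaked at least one object.

    leaked   -- set of objects found in the unauthorized place
    holdings -- dict: agent -> set of objects that agent was given
    p_guess  -- assumed probability that an object was obtained
                independently rather than leaked by an agent
    """
    prob_innocent = 1.0
    for obj in leaked & holdings[agent]:
        holders = sum(1 for objs in holdings.values() if obj in objs)
        # Chance that this agent is NOT the source of this object.
        prob_innocent *= 1.0 - (1.0 - p_guess) / holders
    return 1.0 - prob_innocent


if __name__ == "__main__":
    holdings = {"A": {"t1", "t2"}, "B": {"t1"}}
    leaked = {"t1", "t2"}
    for name in holdings:
        print(name, guilt_probability(name, leaked, holdings, p_guess=0.5))
```

Here agent A scores higher than agent B because only A was given t2,
while t1 could have come from either agent.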

Software & Hardware Requirements
Hardware Required:
System : Pentium IV 2.4 GHz
Hard Disk : 40 GB
Floppy Drive : 1.44 MB
RAM : 256 MB
Software Required:
O/S : Windows XP
Language : J2EE
Database : MySQL Server

References
1. P. Papadimitriou and H. Garcia-Molina, “Data Leakage
Detection,” IEEE Transactions on Knowledge and Data
Engineering, vol. 23, pp. 51–63, 2011.
2. P.M. Pardalos and S.A. Vavasis, “Quadratic Programming
with One Negative Eigenvalue Is NP-Hard,” J. Global
Optimization, vol. 1, no. 1, pp.
3. R. Agrawal and J. Kiernan, “Watermarking Relational
Databases,” in VLDB ’02: Proceedings of the 28th
International Conference on Very Large Data Bases,
pp. 155–166, VLDB Endowment, 2002.
4. Y. Cui and J. Widom, “Lineage Tracing for General Data
Warehouse Transformations,” in The VLDB Journal,
pp. 471–480, 2001.
