UNIT-IV
DATA PRIVACY
Fundamental Concepts, Definitions, Data Privacy Attacks,
Data linking and profiling, access control models, role-based
access control, privacy in different domains:
medical, financial, etc.
Text Book Page No. 242
Data Privacy Basic Definitions
Data privacy is an area of data management that involves the
proper handling of sensitive data to ensure confidentiality
and accuracy. “Sensitive data” includes personal data and
other confidential data, such as certain financial data and
intellectual property data.
Need for Data Privacy
Data privacy is essential to two main business imperatives:
1. Asset Management: Data is one of the most important
assets for any organization, regardless of industry or size.
Companies find enormous value in collecting, sharing, and
using data from a variety of sources.
2. Regulatory Compliance: Managing data to ensure
regulatory compliance can be even more important than
meeting the expectations of staff, customers, and business
partners. Most organizations must meet legal responsibilities
about how they collect, store, and process personal data.
Data Security and Data Privacy
Data privacy refers to the protection of sensitive information,
allowing individuals to have control over how their data is
collected, stored, and shared. With the increasing amount
of personal data being captured and stored by various
organizations and entities, the need for data privacy has
become more important than ever.
Importance of Data Privacy
Data is the foundation of countless interactions and decisions
that we make in our professional and personal lives.
Professionals who work with data every day must understand
the true value of data privacy. Data privacy is important for
the following four reasons:
• Building a good reputation and maintaining trust
• Avoiding serious legal and financial problems
• Giving individuals control of their information
• Keeping digital assets safe from attackers
Data Privacy Attacks
Data privacy attacks can be broadly classified into several
types, each with its own methods and motivations.
Understanding these types of attacks is crucial for
maintaining the security of sensitive data and protecting
individuals’ privacy.
1. Phishing
2. Malware
3. Man-in-the-Middle (MitM) Attacks
4. Denial of Service (DoS) Attacks
5. SQL Injection
6. Insider Threats
7. Social Engineering
8. Cross-Site Scripting (XSS)
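As one illustration of item 5, the sketch below shows how a query built by string concatenation can leak every row, and how a parameterized query avoids that. It uses Python's built-in sqlite3 module; the users table, its columns, and the sample values are hypothetical and chosen only for this example.

```python
import sqlite3

# In-memory database with a hypothetical "users" table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("asha", "asha@example.com"), ("ravi", "ravi@example.com")],
)

# Attacker-supplied text that closes the quote and adds an always-true condition.
malicious_input = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text, so it changes the query logic.
unsafe_query = "SELECT email FROM users WHERE name = '" + malicious_input + "'"
leaked_rows = conn.execute(unsafe_query).fetchall()

# Safer: the input is passed as a parameter and treated purely as a value.
safe_rows = conn.execute(
    "SELECT email FROM users WHERE name = ?", (malicious_input,)
).fetchall()

print("unsafe query returned", len(leaked_rows), "rows")
print("parameterized query returned", len(safe_rows), "rows")
```

The unsafe query returns every email address in the table, while the parameterized query returns nothing because no user has that literal name.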
Data Linking and Profiling
Data linking refers to the process of connecting related data from different sources to create a
comprehensive dataset. This is often done to gain a more holistic view of a particular subject or to
perform more in-depth analysis. Examples of data linking include merging customer information from
sales records with demographic data.
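A minimal sketch of the sales and demographic example, assuming both sources share a customer_id field. The records, field names, and the link_records helper are hypothetical and exist only for illustration.

```python
# Hypothetical records from two separate sources (illustration only).
sales_records = [
    {"customer_id": 101, "name": "Asha", "total_spent": 2500},
    {"customer_id": 102, "name": "Ravi", "total_spent": 900},
    {"customer_id": 103, "name": "Meena", "total_spent": 4100},
]
demographics = [
    {"customer_id": 101, "age_group": "25-34", "city": "Pune"},
    {"customer_id": 103, "age_group": "35-44", "city": "Chennai"},
]


def link_records(left, right, key):
    """Link two lists of dicts on a shared key, keeping only matched rows."""
    right_by_key = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Merge the fields of both sources into one combined record.
            linked.append({**row, **match})
    return linked


linked = link_records(sales_records, demographics, "customer_id")
for row in linked:
    print(row)
```

Each linked record now combines purchase history with age group and city, which is why linking matters for privacy: the combined dataset can reveal more about a person than either source does alone.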
Data Profiling
Data profiling is the process of examining, analyzing, and summarizing the structure and
content of a data set. It allows data analysts and scientists to gain a better understanding
of the data.
Types of Data Profiling:
1. Structural Profiling: This type of data profiling focuses on understanding the structure of
the data, including data types, length, and format. It helps in identifying inconsistencies and
anomalies within the structure of the data, such as missing values or data that does not
conform to the expected format.
2. Content Profiling: Content profiling looks deeper into the actual content of the data,
examining the values and patterns within the data set. It involves analyzing the distribution of
values, identifying outliers, and detecting patterns or relationships between different data
elements.
3. Data Quality Profiling: Data quality profiling assesses the overall quality of the data, including
its accuracy, completeness, and consistency. It helps in identifying data quality issues such as
duplicates, invalid values, and data discrepancies between different data sources.
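The three types above can be sketched on two small columns. The sample values, the 10-digit phone format, and the 0 to 120 age range are assumptions made for this illustration, not rules taken from the text.

```python
import re
from collections import Counter

# Hypothetical column values; None marks a missing entry.
phone_numbers = ["9876543210", "98765-43210", None, "9876543210", "12345"]
ages = [34, 29, 41, 38, 420]

# Structural profiling: missing values and values that break the expected format.
expected_format = re.compile(r"\d{10}")  # assumed format: exactly 10 digits
missing = sum(1 for value in phone_numbers if value is None)
bad_format = [
    value for value in phone_numbers
    if value is not None and not expected_format.fullmatch(value)
]

# Content profiling: inspect the values themselves and flag implausible ones.
outliers = [age for age in ages if not 0 <= age <= 120]  # assumed valid range

# Data quality profiling: completeness and duplicates.
completeness = (len(phone_numbers) - missing) / len(phone_numbers)
counts = Counter(value for value in phone_numbers if value is not None)
duplicates = [value for value, count in counts.items() if count > 1]

print("missing:", missing)
print("bad format:", bad_format)
print("outliers:", outliers)
print("completeness:", completeness)
print("duplicates:", duplicates)
```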
Key Benefits of Data Profiling
1. Improved Data Quality: By identifying and addressing
data quality issues early on, data profiling helps in improving
the overall quality and reliability of the data, which is
essential for making accurate and informed decisions.
2. Enhanced Data Understanding: Data profiling provides
deeper insights into the structure and content of the data,
allowing analysts to better understand the characteristics and
patterns within the data set. This enhanced understanding
can lead to more effective data analysis and interpretation.
3. Data Integration and Standardization: Profiling the
data helps in identifying inconsistencies and discrepancies
between different data sources, which is crucial for
data integration and standardization efforts. It enables
organizations to create a unified view of their data, leading
to greater consistency and efficiency in data management.
4. Compliance and Governance: Data profiling plays a
critical role in ensuring compliance with regulations and
data governance standards. By identifying and resolving
data quality issues, organizations can maintain data
integrity and meet regulatory requirements.
5. Cost and Time Savings: Early detection and resolution
of data quality issues through data profiling can lead to
significant cost and time savings. It minimizes the need for
manual data cleansing and reconciliation efforts,
which can be resource-intensive and time-consuming.
6. Predictive Decision-making: By using profiled
information, small mistakes can be prevented from
becoming major ones. Additionally, it can assist businesses
in understanding possible outcomes.
7. Organized Sorting: Most databases interact with data
coming from a variety of sources, such as social media,
surveys, and other big data markets. With data profiling, it
is possible to track down the source of data and
ensure the security of data by encrypting it properly.
Data Profiling Methods
Data profiling consists of three basic methods:
1. Column Profiling: This method counts the number of times each value
appears within each column of a table. Data patterns can be discovered using
this method.
2. Cross-column Profiling: In this approach, users examine columns to conduct
Key and Dependency Analysis. Key Analysis is used to scan the values in a
table and find a possible Primary Key. Dependency Analysis identifies
the relationships between data sets. By combining these
analyses, users can determine the connections and
dependencies between tables.
3. Cross-table Profiling: In this method, users examine
tables to find all possible Foreign Keys. It also aims to
find similarities and differences in data types and syntax
between tables.
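The methods above can be sketched on two small tables held as lists of dictionaries. The tables and the helper names (column_profile, candidate_keys, candidate_foreign_keys) are hypothetical, and Dependency Analysis is left out to keep the sketch short.

```python
from collections import Counter

# Hypothetical tables (illustration only).
customers = [
    {"customer_id": 1, "city": "Pune"},
    {"customer_id": 2, "city": "Delhi"},
    {"customer_id": 3, "city": "Pune"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},
    {"order_id": 12, "customer_id": 1},
]


def column_profile(table, column):
    """Column profiling: count how often each value appears in a column."""
    return Counter(row[column] for row in table)


def candidate_keys(table):
    """Key analysis: columns whose values are all present and all unique."""
    keys = []
    for column in table[0]:
        values = [row[column] for row in table]
        if None not in values and len(set(values)) == len(values):
            keys.append(column)
    return keys


def candidate_foreign_keys(child, parent, parent_key):
    """Cross-table profiling: child columns whose values all occur in parent_key."""
    parent_values = {row[parent_key] for row in parent}
    return [
        column for column in child[0]
        if {row[column] for row in child} <= parent_values
    ]


print(column_profile(customers, "city"))
print(candidate_keys(customers))
print(candidate_foreign_keys(orders, customers, "customer_id"))
```

A column that passes these checks is only a candidate: on a small sample, values can be unique or contained in another table by coincidence, so the result still needs review.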
The four main access control models are:
1. Discretionary access control (DAC)
2. Mandatory access control (MAC)
3. Role-based access control (RBAC)
4. Rule-based access control (RuBAC)
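Of these, role-based access control can be sketched in a few lines: permissions are attached to roles, and users gain permissions only through the roles assigned to them. The role names, permission names, and users below are hypothetical.

```python
# Permissions granted to each role (hypothetical names, illustration only).
ROLE_PERMISSIONS = {
    "doctor": {"read_record", "write_record"},
    "nurse": {"read_record"},
    "billing_clerk": {"read_invoice"},
}

# Roles assigned to each user; a user may hold more than one role.
USER_ROLES = {
    "alice": {"doctor"},
    "bob": {"nurse", "billing_clerk"},
}


def is_allowed(user, permission):
    """Return True if any role held by the user grants the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(is_allowed("alice", "write_record"))  # doctor role grants this
print(is_allowed("bob", "write_record"))    # neither of bob's roles grants this
print(is_allowed("carol", "read_record"))   # unknown user has no roles
```

Changing what a job function may do means editing one role entry rather than every user, which is the main practical appeal of this model.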
DATA PRIVACY IN DIFFERENT DOMAINS
Data privacy is critically important across various domains,
including business, medical, and financial sectors, due to the
sensitivity and confidentiality of the information involved.
Here’s a breakdown of the need for data privacy in each
sector:
1. Business Sector:
• Protecting Customer Trust:
• Compliance with Regulations:
• Intellectual Property Protection:
• Employee Privacy:
2. Medical Sector
In the medical sector, patient confidentiality is a cornerstone
of the doctor-patient relationship. Medical records contain
highly personal information such as medical history, test
results, and diagnoses.
• Patient Confidentiality:
• Preventing Identity Theft and Fraud:
• Research and Development:
3. Financial Sector
In the financial sector, data privacy is essential to protect
individuals’ financial information, such as bank account
details, investment portfolios, and credit scores. Not only
could unauthorized access to this information result in
financial loss for the individual, but it could also lead to
identity theft and fraud.
• Preventing Financial Fraud:
• Regulatory Compliance:
• Trust and Reputation: