UNIT-IV
DATA PRIVACY
Fundamental Concepts, Definitions, Data Privacy Attacks, Data Linking and Profiling,
Access Control Models, Role-Based Access Control, Privacy in Different Domains:
Medical, Financial, etc.
Text Book Page No. 242
Data Privacy Basic Definitions
Data privacy is an area of data management that involves the proper handling of sensitive
data to ensure confidentiality and accuracy. “Sensitive data” includes personal data and
other confidential data, such as certain financial data and intellectual property data.
Need for Data Privacy
Data privacy is an essential part of ensuring two main business imperatives:
1. Asset Management: Data is one of the most important assets for any organization,
regardless of industry or size. Companies find enormous value in collecting, sharing,
and using data from a variety of sources.
2. Regulatory Compliance: Managing data to ensure regulatory compliance is arguably even
more important than meeting the expectations of staff, customers, and business partners.
Most organizations must meet legal responsibilities about how they collect, store, and
process personal data.
Data Security and Data Privacy
Data privacy refers to the protection of sensitive information in a way that lets
individuals control how their data is collected, stored, and shared. As organizations
and other entities capture and store more and more personal data, the need for data
privacy has become more important than ever.
Importance of Data Privacy
Data is the foundation of countless interactions and decisions that we make in our
professional and personal lives. Experts who navigate this vast and intricate data
landscape every day must understand the true value of data privacy. Data privacy is
important for the following four reasons:
• Building a good reputation and maintaining trust
• Avoiding serious legal and financial problems
• Taking control of your information
• Keeping your digital assets safe from attackers
Data Privacy Attack
Data privacy attacks can be broadly classified into several types, each with its own
methods and motivations. Understanding these types of attacks is crucial for maintaining
the security of sensitive data and protecting individuals’ privacy.
1. Phishing
2. Malware
3. Man-in-the-Middle (MitM) Attacks
4. Denial of Service (DoS) Attacks
5. SQL Injection
6. Insider Threats
7. Social Engineering
8. Cross-Site Scripting (XSS)
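To make one of the attacks listed above concrete, here is a minimal sketch of SQL injection using Python's built-in sqlite3 module. The table, the sample rows, and the function names (build_demo_db, lookup_unsafe, lookup_safe) are hypothetical and exist only for illustration. The sketch contrasts a query built by string concatenation with a parameterized query.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, made-up patients table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patients (name TEXT, diagnosis TEXT)")
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?)",
        [("alice", "asthma"), ("bob", "diabetes")],
    )
    return conn


def lookup_unsafe(conn, name):
    # Vulnerable: user input is pasted directly into the SQL text,
    # so quotes in the input can change the meaning of the query.
    query = "SELECT name, diagnosis FROM patients WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(conn, name):
    # Safer: the driver passes the input as a bound value, never as SQL text.
    query = "SELECT name, diagnosis FROM patients WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


conn = build_demo_db()
payload = "x' OR '1'='1"               # classic injection string
leaked = lookup_unsafe(conn, payload)  # condition becomes always true
blocked = lookup_safe(conn, payload)   # payload is treated as a literal name

print(len(leaked), "rows leaked by the unsafe query")
print(len(blocked), "rows returned by the parameterized query")
```

With the concatenated query, the payload turns the WHERE clause into a condition that is always true, so every patient row is returned. The parameterized version treats the same payload as an ordinary name and matches nothing.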
Data Linking and Profiling
Data linking refers to the process of connecting related data from different sources to create a
comprehensive dataset. This is often done to gain a more holistic view of a particular subject or to
perform more in-depth analysis. Examples of data linking include merging customer information from
sales records with demographic data.
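The sales-and-demographics example can be sketched in a few lines of Python. The field names (customer_id, total_spent, age_band, postcode), the sample values, and the helper link_records are assumptions made for illustration. The sketch joins two small record sets on a shared key and keeps only customers that appear in both.

```python
# Hypothetical sales records, keyed by customer_id.
sales = [
    {"customer_id": 1, "total_spent": 250.0},
    {"customer_id": 2, "total_spent": 90.5},
    {"customer_id": 3, "total_spent": 40.0},
]

# Hypothetical demographic records from a second source.
demographics = [
    {"customer_id": 1, "age_band": "30-39", "postcode": "560001"},
    {"customer_id": 2, "age_band": "20-29", "postcode": "500081"},
]


def link_records(left, right, key):
    """Link two record lists on a shared key (an inner join)."""
    index = {row[key]: row for row in right}   # look up right-hand rows by key
    linked = []
    for row in left:
        match = index.get(row[key])
        if match is not None:                  # keep only rows found in both sources
            linked.append({**row, **match})    # merge the two records into one
    return linked


profiles = link_records(sales, demographics, "customer_id")
for profile in profiles:
    print(profile)
```

Each linked record now carries spending, age band, and postcode together. The combined record says more about a person than either source does alone, which is why data linking matters for privacy.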
Data Profiling
Data profiling is the process of examining, analyzing, and summarizing the structure and content of
a data set in order to create valuable summaries of it. It allows data analysts and scientists to gain
a better understanding of the data.
Types of Data Profiling:
1. Structural Profiling: This type of data profiling focuses on understanding the structure of
the data, including data types, length, and format. It helps in identifying inconsistencies and
anomalies within the structure of the data, such as missing values or data that does not
conform to the expected format.
2. Content Profiling: Content profiling looks deeper into the actual content of the data,
examining the values and patterns within the data set. It involves analyzing the distribution of
values, identifying outliers, and detecting patterns or relationships between different data
elements.
3. Data Quality Profiling: Data quality profiling assesses the overall quality of the data, including
its accuracy, completeness, and consistency. It helps in identifying data quality issues such as
duplicates, invalid values, and data discrepancies between different data sources.
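The three types above can be illustrated with a small sketch using only the Python standard library. The sample rows, the simplified e-mail pattern, and the function names (structural_profile, content_profile, quality_profile) are assumptions for illustration, not a standard profiling API.

```python
import re
from collections import Counter
from statistics import mean

# Hypothetical rows as they might arrive from a CSV file (all values are strings).
rows = [
    {"id": "1", "email": "a@example.com", "age": "34"},
    {"id": "2", "email": "not-an-email", "age": "29"},
    {"id": "3", "email": "", "age": "31"},
    {"id": "3", "email": "c@example.com", "age": "240"},
]

# Deliberately simple e-mail pattern, good enough for a format check in a sketch.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def structural_profile(rows, column, pattern):
    """Structure: missing values, format violations, and maximum length."""
    values = [r[column] for r in rows]
    return {
        "missing": sum(1 for v in values if v == ""),
        "bad_format": sum(1 for v in values if v and not pattern.match(v)),
        "max_length": max(len(v) for v in values),
    }


def content_profile(rows, column):
    """Content: value range and average of a numeric column."""
    numbers = [int(r[column]) for r in rows]
    return {"min": min(numbers), "max": max(numbers), "mean": mean(numbers)}


def quality_profile(rows, key):
    """Quality: key values that occur more than once (duplicates)."""
    counts = Counter(r[key] for r in rows)
    return {"duplicate_keys": sorted(k for k, n in counts.items() if n > 1)}


print(structural_profile(rows, "email", EMAIL))
print(content_profile(rows, "age"))
print(quality_profile(rows, "id"))
```

On this sample, the structural check finds one missing and one malformed e-mail address, the content check shows an implausible maximum age of 240, and the quality check reports the duplicated id "3".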
Key Benefits of Data Profiling
1. Improved Data Quality: By identifying and addressing
data quality issues early on, data profiling helps in improving
the overall quality and reliability of the data, which is
essential for making accurate and informed decisions.
2. Enhanced Data Understanding: Data profiling provides
deeper insights into the structure and content of the data,
allowing analysts to better understand the characteristics and
patterns within the data set. This enhanced understanding
can lead to more effective data analysis and interpretation.
3. Data Integration and Standardization: Profiling the
data helps in identifying inconsistencies and discrepancies
between different data sources, which is crucial for
data integration and standardization efforts. It enables
organizations to create a unified view of their data, leading
to greater consistency and efficiency in data management.
4. Compliance and Governance: Data profiling plays a
critical role in ensuring compliance with regulations and
data governance standards. By identifying and resolving
data quality issues, organizations can maintain data
integrity and meet regulatory requirements.
5. Cost and Time Savings: Early detection and resolution
of data quality issues through data profiling can lead to
significant cost and time savings. It reduces the need for
manual data cleansing and reconciliation efforts,
which can be resource-intensive and time-consuming.
6. Predictive Decision-making: Profiled information helps
keep small mistakes from becoming major ones. It can also
assist businesses in understanding possible outcomes.
7. Organized Sorting: Most databases interact with data
coming from a variety of sources, such as social media,
surveys, and other big data markets. Data profiling makes
it possible to trace data back to its source and to identify
data that needs protection, for example through encryption.
Data Profiling Methods
Data Profiling consists of three basic methods:
1. Column Profiling: This method counts the number of times each value
appears within each column of a table. Data patterns can be discovered using
this method.
2. Cross-column Profiling: In this approach, users examine columns to conduct
Key and Dependency Analysis. Key Analysis scans the values in a table to find a
possible Primary Key. Dependency Analysis identifies the relationships between
columns. By combining these analyses, users can determine the connections and
dependencies within a table.
3. Cross-table Profiling: In this method, users examine
tables to find all possible Foreign Keys. It also aims to
find similarities and differences in data types and syntax
between tables.
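The three methods can be sketched on two tiny in-memory tables. The tables, column names, and helper functions below are hypothetical, and the checks are deliberately simple: a candidate key is any column whose values are all unique, and a candidate foreign key is any column whose values all appear in the parent table's key.

```python
from collections import Counter

# Two hypothetical tables represented as lists of rows.
customers = [
    {"customer_id": 1, "city": "Pune"},
    {"customer_id": 2, "city": "Delhi"},
    {"customer_id": 3, "city": "Pune"},
]
orders = [
    {"order_id": 10, "cust": 1},
    {"order_id": 11, "cust": 3},
    {"order_id": 12, "cust": 3},
]


def column_profile(table, column):
    """Column profiling: count how often each value appears in one column."""
    return Counter(row[column] for row in table)


def candidate_keys(table):
    """Key analysis: columns whose values are unique across all rows."""
    return [c for c in table[0] if len({row[c] for row in table}) == len(table)]


def determines(table, a, b):
    """Dependency analysis: True if each value of column a maps to one value of b."""
    seen = {}
    for row in table:
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True


def candidate_foreign_keys(child, parent, parent_key):
    """Cross-table profiling: child columns whose values all occur in the parent key."""
    parent_values = {row[parent_key] for row in parent}
    return [c for c in child[0] if {row[c] for row in child} <= parent_values]


print(column_profile(customers, "city"))
print(candidate_keys(customers))
print(determines(customers, "customer_id", "city"))
print(candidate_foreign_keys(orders, customers, "customer_id"))
```

These checks only propose candidates. A column can look unique or look like a foreign key by coincidence in a small sample, so the results still need confirmation against the intended schema.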
The 4 main access control models are:
1. Discretionary access control (DAC)
2. Mandatory access control (MAC)
3. Role-based access control (RBAC)
4. Rule-based access control (RuBAC)
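As an illustration of the third model, the sketch below shows the core idea of role-based access control: permissions are attached to roles, and users receive permissions only through the roles assigned to them. The role names, user names, permission strings, and the function is_allowed are all hypothetical. A real system would add sessions, role hierarchies, and auditing.

```python
# Hypothetical role-to-permission mapping.
ROLE_PERMISSIONS = {
    "doctor": {"read_record", "write_record"},
    "nurse": {"read_record"},
    "billing_clerk": {"read_invoice"},
}

# Hypothetical user-to-role assignments; a user may hold several roles.
USER_ROLES = {
    "asha": {"doctor"},
    "ravi": {"nurse", "billing_clerk"},
}


def is_allowed(user, permission):
    """Grant access only if one of the user's roles carries the permission."""
    roles = USER_ROLES.get(user, set())    # unknown users have no roles
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in roles
    )


print(is_allowed("asha", "write_record"))   # doctor role includes write_record
print(is_allowed("ravi", "write_record"))   # neither of ravi's roles includes it
print(is_allowed("ravi", "read_invoice"))   # granted through billing_clerk
```

Because access is decided by role rather than by individual user, changing what a job function may do means editing one entry in the role mapping instead of updating every user.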
DATA PRIVACY IN DIFFERENT DOMAINS
Data privacy is critically important across various domains, including business, medical,
and financial sectors, due to the sensitivity and confidentiality of the information
involved. Here’s a breakdown of the need for data privacy in each sector:
1. Business Sector
• Protecting Customer Trust
• Compliance with Regulations
• Intellectual Property Protection
• Employee Privacy
2. Medical Sector
In the medical sector, patient confidentiality is a cornerstone of the doctor-patient
relationship. Medical records contain highly personal information such as medical history,
test results, and diagnosis. If
• Patient Confidentiality
• Preventing Identity Theft and Fraud
• Research and Development
3. Financial Sector
In the financial sector, data privacy is essential to protect individuals’ financial
information, such as bank account details, investment portfolios, and credit scores. Not
only could unauthorized access to this information result in financial loss for the
individual, but it could also lead to identity theft and fraud.
• Preventing Financial Fraud
• Regulatory Compliance
• Trust and Reputation
