This document provides an overview of data loss prevention (DLP) technology. It discusses what DLP is, different DLP models for data in use, in motion, and at rest. It also covers typical DLP system architecture, approaches for data classification and identification, and some technical challenges. The document references DLP product websites and summarizes two research papers on using machine learning for automatic text classification to identify sensitive data for DLP systems.
DATA LOSS PREVENTION ENSURES CRITICAL INFORMATION IS KEPT SAFE WITHIN THE CORPORATE NETWORK AND HELPS ADMINISTRATORS CONTROL THE DATA THAT END-USERS WISH TO TRANSFER.
Data loss is considered by security experts to be one of the most serious threats that businesses currently face.
Maintaining the confidentiality of personal information and data is an essential factor in operating a successful business. People must be able to trust that their service provider takes the appropriate measures to implement security controls that will ultimately protect their privacy.
However, some of the largest and most reputable organizations have fallen victim to data loss security breaches resulting in significant legal, financial, and reputational losses, including [1]:
The Bank of America: Losing the personal employee information of over one million employees
The United States Government: Losing data related to the military
Heartland Payment Systems: Exposing credit card information and other personal records of over 130 million customers
In 2013, it was estimated that data breaches had resulted in the exploitation of over 800 million personal records [2]. This number is also expected to rise over the next several years given the advanced tools that cybercriminals use to steal information and data.
Interestingly, it is not just cybercriminals who represent a threat:
- 64% of data loss is caused by well-meaning insiders.
- 50% of employees leave with company data.
- The average cost of a security breach is $3.5 million.
Considering these extensive data breaches, it is prudent for organizations to understand where their critical data is located and which security controls can stop data loss.
Data Loss Prevention (DLP) solutions locate critical and personal data for organizations and help prevent data loss. By having a deeper understanding of efficient DLP security controls, you will help protect the reputation of your organization.
For more information contact: rkopaee@riskview.ca
https://www.threatview.ca
http://www.riskview.ca
Technology Overview - Symantec Data Loss Prevention (DLP) - Iftikhar Ali Iqbal
The presentation provides the following:
- Symantec Corporate Overview
- Solution Portfolio of Symantec
- Symantec Data Loss Prevention - Introduction
- Symantec Data Loss Prevention - Components
- Symantec Data Loss Prevention - Features & Use Cases
- Symantec Data Loss Prevention - System Requirements
- Symantec Data Loss Prevention - Appendix (extra information)
This provides a brief overview of Symantec Data Loss Prevention (DLP). Please note that all information predates May 2016 and Symantec's full integration of Blue Coat Systems' solutions.
Overview of Data Loss Prevention (DLP) Technology - Liwei Ren (任力偉)
DLP is a technology that detects potential data breach incidents in a timely manner and prevents them by monitoring data in use (endpoints), in motion (network traffic), and at rest (data storage). It has been driven by regulatory compliance and intellectual property protection. This talk introduces DLP models that describe the capabilities and scope a DLP system should cover. A few system categories are discussed accordingly, along with high-level system architecture. DLP is an interesting technology in that it provides advanced content inspection techniques; a few such techniques are proposed and investigated in rigorous terms.
Data leakage is a serious concern for business organizations in today's increasingly networked world. Unauthorized disclosure may have serious short- and long-term consequences for an organization. Risks include losing client and stakeholder confidence, a tarnished brand image, unwanted lawsuits, and an overall loss of goodwill and market share in the industry.
Data Loss Prevention (DLP) - Fundamental Concept - Eryk Budi Pratama
Presented at APTIKNAS (Indonesia ICT Business Association) DKI Jakarta regular webinar.
Title: Data Loss Prevention: Fundamental Concept in Enabling DLP System
2 July 2020
Symantec Data Loss Prevention 11 simplifies the detection and protection of intellectual property. Symantec’s market-leading data security suite features Vector Machine Learning, which makes it easier to detect hard-to-find intellectual property, and enhancements to Data Insight that streamline remediation, increasing the effectiveness of an organization’s data protection initiatives.
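The blurb above mentions Vector Machine Learning without detail, and Symantec's actual algorithm is proprietary. As a generic illustration of how machine learning can classify text as sensitive or public for DLP purposes, here is a minimal multinomial Naive Bayes sketch in pure Python (the training examples and labels are invented for the demo):

```python
import math
from collections import Counter

def train(docs):
    """Train a multinomial Naive Bayes model.
    docs: list of (text, label) pairs, label in {"sensitive", "public"}."""
    counts = {"sensitive": Counter(), "public": Counter()}
    totals = Counter()
    for text, label in docs:
        counts[label].update(text.lower().split())
        totals[label] += 1
    vocab = set(counts["sensitive"]) | set(counts["public"])
    return counts, totals, vocab

def classify(text, model):
    """Return the label with the highest posterior log-probability."""
    counts, totals, vocab = model
    n = sum(totals.values())
    best, best_score = None, float("-inf")
    for label in counts:
        # log prior + log likelihood with add-one (Laplace) smoothing
        score = math.log(totals[label] / n)
        denom = sum(counts[label].values()) + len(vocab)
        for w in text.lower().split():
            score += math.log((counts[label][w] + 1) / denom)
        if score > best_score:
            best, best_score = label, score
    return best

docs = [
    ("salary bank account number", "sensitive"),
    ("patient medical record diagnosis", "sensitive"),
    ("company picnic schedule", "public"),
    ("marketing newsletter announcement", "public"),
]
model = train(docs)
print(classify("bank account record", model))  # → sensitive
```

Real DLP classifiers train on thousands of labeled documents and use richer features, but the principle — score a document against per-class word statistics — is the same.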
DLP (Data Loss Prevention) is NOT dead, but it needs to be revisited in the context of new methodologies and threats. Here are some practical steps to improve your cybersecurity awareness and response to data loss.
At the highest level, our mission continues to be about keeping our customers (companies and governments) safe from ever-evolving digital threats, so they are confident to move business forward. Our strategy to accomplish this mission centers around four key pillars: Advanced Threat Protection, Information Protection for On Premise and Cloud, Security as a Service -- all anchored by a Unified Security Analytics Platform. Symantec Data Loss Prevention is a foundational product in the Information Protection for On Premise and Cloud pillar.
Everyone knows that storing and accessing data and applications in the cloud and on mobile devices makes work much easier and more productive by allowing employees to work wherever they need to.
It allows for great business agility – applications are always up to date, new functionality and processes can be deployed and activated quickly and organizations can adjust things on the fly if they need to.
It also brings convenience: all employees can work in the way they need to, and collaboration and sharing are made vastly easier with cloud applications and storage.
But it also brings all the challenges of securing devices and applications that you don't own, and while saying no might be the right thing for security, end users will find a way around it. Right now, close to 30% of employees use their personal devices for work, and that number is on the rise, potentially turning BYOD into Bring Your Own Disaster.
The presentation explains data security as an industry concept. It addresses Data Loss Prevention in detail: what it is, the approaches, best practices, and common mistakes people make. The presentation concludes by highlighting Happiest Minds' expertise in the domain.
Learn more about Happiest Minds Data Security Service Offerings
http://www.happiestminds.com/IT-security-services/data-security-services/
DLP Systems: Models, Architecture and Algorithms - Liwei Ren (任力偉)
DLP is a data security technology that detects and prevents data breach incidents by monitoring data in use, in motion, and at rest. It has been widely applied for regulatory compliance, data privacy, and intellectual property protection. This talk introduces basic concepts and security models that describe DLP systems, with high-level architecture. DLP is an interesting discipline, with content inspection techniques supported by sophisticated algorithms. Special attention is given to a few of these: document fingerprinting, data record fingerprinting, scalable multi-pattern string matching, and others.
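The abstract names document fingerprinting without spelling out an algorithm. A common published approach (not necessarily the one the talk uses) is to hash k-character shingles and keep only selected hashes, as in winnowing. A minimal sketch, with invented example strings:

```python
import hashlib

def kgram_hashes(text, k=8):
    """Hash every k-character shingle of the normalized text."""
    text = "".join(text.lower().split())  # normalize: lowercase, drop whitespace
    return [int(hashlib.md5(text[i:i + k].encode()).hexdigest(), 16)
            for i in range(len(text) - k + 1)]

def winnow(hashes, w=4):
    """Keep the minimum hash in each sliding window of w hashes (winnowing),
    which shrinks the fingerprint while guaranteeing shared runs are caught."""
    fingerprint = set()
    for i in range(len(hashes) - w + 1):
        fingerprint.add(min(hashes[i:i + w]))
    return fingerprint

def similarity(a, b, k=8, w=4):
    """Jaccard overlap of the two documents' fingerprints."""
    fa = winnow(kgram_hashes(a, k), w)
    fb = winnow(kgram_hashes(b, k), w)
    return len(fa & fb) / max(1, len(fa | fb))

secret = "Project Phoenix design spec: the rotor uses a 7-phase inverter."
leaked = "FYI - Project Phoenix design spec: the rotor uses a 7-phase inverter!"
unrelated = "Lunch menu for Friday: soup, salad, and sandwiches."
print(similarity(secret, leaked) > similarity(secret, unrelated))  # → True
```

A DLP engine would fingerprint the protected documents once, then compare outbound content against the stored fingerprints; a high overlap score triggers a policy action.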
Distributed Immutable Ephemeral - New Paradigms for the Next Era of Security - Sounil Yu
We are rapidly approaching the next era of security where we need to be focused on the ability to recover from irrecoverable attacks. This can also be defined as resiliency. The traditional view of resiliency attempts to quickly restore assets that support services that we care about. This new approach/paradigm looks at resilience in ways that promote design patterns (distributed, immutable, ephemeral) where we do not care about a given asset at all while still keeping the overall service functioning. This new approach allows us to avoid having to deal with security at all.
Symantec Data Loss Prevention. Global trends show that the largest share of data loss and theft is due to a lack of visibility and errors in data handling. Learn how to prevent it.
FUZZY FINGERPRINT METHOD FOR DETECTION OF SENSITIVE DATA EXPOSURE - IJCI JOURNAL
Protecting confidential information is a major concern for organizations and individuals alike, who stand to suffer huge losses if private data falls into the wrong hands. Network-based information leaks pose a serious threat to confidentiality. This paper describes a network-based data-leak detection (DLD) technique whose main feature is that detection does not require the data owner to reveal the content of the sensitive data; only a small number of specialized digests are needed. The technique, referred to as the fuzzy fingerprint, can be used to detect accidental data leaks due to human error or application flaws. The privacy-preserving feature of the algorithms minimizes the exposure of sensitive data and enables the data owner to safely delegate detection to others.
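A rough sketch of the idea (an illustrative reconstruction, not the paper's exact algorithm): the owner releases only "fuzzified" shingle digests with some low-order bits cleared, so each released value covers many possible digests and hides the exact data; the provider flags traffic whose digests fall in the released set. The sample strings and bit widths below are invented for the demo:

```python
import hashlib

def shingle_digests(data, k=8):
    """32-bit digests of every k-character shingle of the data."""
    return {int(hashlib.sha256(data[i:i + k].encode()).hexdigest(), 16) % (1 << 32)
            for i in range(len(data) - k + 1)}

FUZZ_BITS = 6  # low-order bits masked before release

def fuzzify(digests, fuzz_bits=FUZZ_BITS):
    """Owner side: clear the low bits so each released value matches
    2**fuzz_bits possible digests, hiding the exact sensitive content."""
    mask = ~((1 << fuzz_bits) - 1) & 0xFFFFFFFF
    return {d & mask for d in digests}

def leak_score(traffic, released, k=8, fuzz_bits=FUZZ_BITS):
    """Provider side: fraction of traffic shingles whose fuzzified digest
    is in the released set; a high score flags a suspected leak."""
    mask = ~((1 << fuzz_bits) - 1) & 0xFFFFFFFF
    seen = shingle_digests(traffic, k)
    hits = sum(1 for d in seen if (d & mask) in released)
    return hits / max(1, len(seen))

secret = "SSN 123-45-6789 belongs to J. Doe"
released = fuzzify(shingle_digests(secret))
print(leak_score("email body: SSN 123-45-6789 belongs to J. Doe", released) > 0.5)  # → True
```

The provider never sees the sensitive data itself, only masked digests; the owner later verifies flagged traffic against the true digests to weed out the false positives that the masking deliberately introduces.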
Privacy-Preserving Updates to Anonymous and Confidential Databases - ijdmtaiir
The current trend in the application space towards systems of loosely coupled and dynamically bound components that enable just-in-time integration jeopardizes the security of information shared between the broker, the requester, and the provider at runtime. In particular, new advances in data mining and knowledge discovery that allow the extraction of hidden knowledge from enormous amounts of data impose new threats on the seamless integration of information. We consider the problem of building privacy-preserving algorithms for one category of data mining techniques, association rule mining. Suppose Alice owns a k-anonymous database and needs to determine whether her database, after insertion of a tuple owned by Bob, is still k-anonymous. Also suppose that access to the database is strictly controlled because, for example, the data are used for experiments that must remain confidential. Clearly, allowing Alice to directly read the contents of the tuple breaks Bob's privacy (e.g., a patient's medical record); on the other hand, the confidentiality of the database managed by Alice is violated once Bob has access to the contents of the database. Thus, the problem is to check whether the database with the tuple inserted is still k-anonymous, without letting Alice and Bob learn the contents of the tuple and the database, respectively. In this paper, we propose two protocols solving this problem for suppression-based and generalization-based k-anonymous and confidential databases. The protocols rely on well-known cryptographic assumptions, and we provide theoretical analyses to prove their soundness and experimental results to illustrate their efficiency. We have presented two secure protocols for privately checking whether a k-anonymous database retains its anonymity once a new tuple is inserted into it. Since the proposed protocols ensure the updated database remains k-anonymous, the results returned from a user's (or a medical researcher's) query are also k-anonymous. Thus, neither the patient's nor the data provider's privacy can be violated by any query. As long as the database is updated properly using the proposed protocols, user queries under our application domain are always privacy-preserving.
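The paper's protocols perform this check cryptographically, without either party revealing its data. The underlying k-anonymity condition they verify can be sketched non-privately as follows (table contents and quasi-identifiers are invented for illustration):

```python
from collections import Counter

def is_k_anonymous(table, quasi_ids, k):
    """A table is k-anonymous if every combination of quasi-identifier
    values appears in at least k rows."""
    groups = Counter(tuple(row[q] for q in quasi_ids) for row in table)
    return all(count >= k for count in groups.values())

def insert_keeps_k_anonymity(table, quasi_ids, k, new_row):
    # Trusted-party version of the check; the paper achieves the same
    # result WITHOUT revealing the table or the tuple to either party.
    return is_k_anonymous(table + [new_row], quasi_ids, k)

table = [
    {"zip": "481**", "age": "20-30", "disease": "flu"},
    {"zip": "481**", "age": "20-30", "disease": "cold"},
    {"zip": "902**", "age": "40-50", "disease": "asthma"},
    {"zip": "902**", "age": "40-50", "disease": "flu"},
]
qi = ["zip", "age"]
print(is_k_anonymous(table, qi, 2))  # → True
# A tuple that opens a brand-new quasi-identifier group of size 1 breaks 2-anonymity:
print(insert_keeps_k_anonymity(table, qi, 2, {"zip": "600**", "age": "30-40", "disease": "flu"}))  # → False
```

The hard part the paper solves is running exactly this membership-and-count test while the tuple stays secret from Alice and the table stays secret from Bob.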
All the essential information you need about DLP in one eBook.
As security professionals struggle with how to keep up with threats, DLP - a technology designed to ensure sensitive data isn't stolen or lost - is hot again. This comprehensive guide provides what you need to understand, evaluate, and succeed with today's DLP. It includes insights from DLP Experts, Forrester Research, Gartner, and Digital Guardian's security analysts.
What's Inside:
- The seven trends that have made DLP hot again
- How to determine the right approach for your organization
- Making the business case to executives
- How to build an RFP and evaluate vendors
- How to start with a clearly defined quick win
- Straightforward frameworks for success
Data Loss Prevention (DLP)
1. Submitted by: Hussein M. Al-Sanabani
Supervisor: Yrd. Doç. Dr. Murat İskefiyeli
Overview of Data Loss Prevention (DLP) Technology
DLP 11/23/2014
1
2. Outline
What is Data Loss Prevention?
DLP Models
DLP Systems and Architecture
Data Classification and Identification
Technical Challenges
References
Research
3. What Is Data Loss Prevention?
Data loss prevention (DLP) is a data security technology that detects potential data breach incidents in a timely manner and prevents them by monitoring data in use (endpoints), in motion (network traffic), and at rest (data storage) across an organization's network.
4. What Is Data Loss Prevention?
What drives DLP development?
Regulatory compliance requirements such as PCI DSS, SOX, HIPAA, GLBA, SB 1386, etc.
Confidential information protection
Intellectual property protection
What data loss incidents does a DLP system handle?
Accidental data leaks by well-meaning insiders
Intentional data theft by non-technical insiders
Determined data theft by highly technical insiders
Determined data theft by external hackers, advanced malware, or APTs
5. What Is Data Loss Prevention?
The evolution of naming
Information Leak Prevention (ILP)
Information Leak Detection and Prevention (ILDP)
DLP
Data Leak Prevention
Data Loss Prevention
6. DLP Models
A model describes a technology in rigorous terms.
We need models to define and scope what a DLP system should do.
Three States of Data
Data in Use (endpoints)
Data in Motion (network)
Data at Rest (storage)
7. DLP Models
The data in use at endpoints can be leaked via
USB
Emails
Web mails
HTTP/HTTPS
FTP
…
The data in motion can be leaked via
SMTP
FTP
HTTP/HTTPS
…
8. DLP Models
The data at rest could
reside in the wrong place
be accessed by the wrong person
be owned by the wrong person
11. DLP Models
DLP model for data-in-use and data-in-motion:
DATA flows from SOURCE to DESTINATION via CHANNEL → do ACTIONs
DATA specifies what the confidential data is
SOURCE can be a user, an endpoint, an email address, or a group of them
DESTINATION can be an endpoint, an email address, a group of them, or simply the external world
CHANNEL indicates the data leak channel, such as USB, email, network protocols, etc.
ACTION is the action the DLP system takes when an incident occurs
13. DLP Models
DLP model for data-at-rest:
DATA resides at SOURCE → do ACTIONs
DATA specifies what the sensitive data (with potential for leakage) is
SOURCE can be an endpoint, a storage server, or a group of them
ACTION is the action the DLP system takes when confidential data is identified at rest
14. DLP Models
These two DLP models are fundamental:
they essentially define the formats of DLP security rules (or DLP security policies).
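The two rule formats can be made concrete as simple data structures. A minimal Python sketch (all field names and example values here are hypothetical illustrations, not taken from any particular DLP product):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class InMotionRule:
    """DATA flows from SOURCE to DESTINATION via CHANNEL -> do ACTIONs."""
    data: str               # name of a data definition, e.g. "customer-ssn"
    source: List[str]       # users, endpoints, email addresses, or groups
    destination: List[str]  # endpoints, addresses, groups, or "external"
    channel: str            # "USB", "SMTP", "HTTP", "FTP", ...
    actions: List[str]      # "block", "log", "notify", ...

@dataclass
class AtRestRule:
    """DATA resides at SOURCE -> do ACTIONs."""
    data: str
    source: List[str]       # endpoints or storage servers
    actions: List[str]      # "quarantine", "encrypt", "report", ...

rule = InMotionRule(
    data="customer-ssn",
    source=["finance-group"],
    destination=["external"],
    channel="SMTP",
    actions=["block", "notify"],
)
print(rule.channel)  # SMTP
```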
15. DLP Systems and Architecture
Typical DLP systems
DLP Management Console
DLP Endpoint Agent
DLP Network Gateway
Data Discovery Agent (or Appliance)
16. DLP Systems and Architecture
Typical DLP system architecture
17. Data Classification and Identification
One expects a DLP system to answer the following questions:
What is sensitive information?
How is sensitive information defined?
How is sensitive information categorized?
How can we check whether a given document contains sensitive information?
How is data sensitivity measured?
Data inspection is an important capability of a content-aware DLP solution. It consists of two parts:
defining sensitive data, i.e., data classification
identifying sensitive data in real time
18. Data Classification and Identification
Sensitive data is contained in textual documents.
What does a document mean to you?
We need text models to describe a text.
19. Data Classification and Identification
The UTF-8 text model is preferred:
it handles all languages, especially the CJK group.
A textual document is normalized into a sequence of UTF-8 characters.
Four fundamental approaches to sensitive data definition and identification:
Document fingerprinting
Database record fingerprinting
Multiple keyword matching
Regular expression matching
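The normalization step can be sketched with Python's standard library; NFC is one reasonable choice of canonical form (the function name below is just for illustration):

```python
import unicodedata

def normalize_document(text: str) -> bytes:
    """Normalize a document into a canonical sequence of UTF-8 bytes.

    Unicode normalization (NFC here) maps visually identical strings
    to one canonical form, so the same content yields the same byte
    sequence regardless of how it was originally composed.
    """
    return unicodedata.normalize("NFC", text).encode("utf-8")

# "é" as one code point vs. "e" + combining accent normalize to the
# same UTF-8 bytes:
assert normalize_document("caf\u00e9") == normalize_document("cafe\u0301")
# CJK text passes through unchanged as multi-byte UTF-8:
print(normalize_document("機密").decode("utf-8"))  # 機密
```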
20. Data Classification and Identification
What is document fingerprinting about?
It is a solution to an information retrieval problem:
identifying modified versions of known documents
near-duplicate document detection (NDDD)
a variant-detection technique for documents
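One common family of fingerprinting techniques hashes overlapping word shingles; production systems use more refined schemes (e.g. selecting hashes by winnowing), so the sketch below only illustrates the idea with made-up example texts:

```python
import hashlib

def fingerprints(text: str, k: int = 3) -> set:
    """Hash every run of k consecutive words of a lowercased text.

    A lightly modified copy of a document still shares most of its
    word shingles with the original, so a large fingerprint overlap
    flags a near duplicate.
    """
    words = text.lower().split()
    return {
        hashlib.sha1(" ".join(words[i:i + k]).encode()).hexdigest()
        for i in range(max(len(words) - k + 1, 1))
    }

def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two fingerprint sets."""
    fa, fb = fingerprints(a), fingerprints(b)
    return len(fa & fb) / max(len(fa | fb), 1)

original = "The quarterly revenue forecast remains strictly confidential."
edited = "NOTE: The quarterly revenue forecast remains strictly confidential."
unrelated = "Lunch menu for the cafeteria next week."
assert similarity(original, edited) > similarity(original, unrelated)
```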
21. Data Classification and Identification
What is database record fingerprinting about?
Also known as Exact Match in the DLP field.
It is a technique to detect whether sensitive data records appear within a text.
Use case:
several personal data records of <SSN, Phone#, address> are included in a text; we want to extract all records from the file to determine its sensitivity.
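A hypothetical Exact Match sketch: fingerprint each cell of the sensitive records, then count how many cells of any one record co-occur in a document. Real DLP products index whole databases far more efficiently, and multi-word cells (like addresses) would need phrase matching; all records and values below are invented:

```python
import hashlib

records = [
    ("078-05-1120", "555-0102", "12 Oak St"),   # <SSN, Phone#, address>
    ("219-09-9999", "555-0177", "9 Elm Ave"),
]

def cell_hash(value: str) -> str:
    """Canonicalize a cell (trim punctuation, lowercase) and hash it."""
    return hashlib.sha256(value.strip(" .,;").lower().encode()).hexdigest()

# Cell fingerprint -> index of the record it belongs to.
index = {cell_hash(cell): i for i, rec in enumerate(records) for cell in rec}

def matched_records(text: str) -> dict:
    """Map record number -> how many of its cells appear in the text."""
    hits = {}
    for token in text.replace(",", " ").split():
        rec = index.get(cell_hash(token))
        if rec is not None:
            hits[rec] = hits.get(rec, 0) + 1
    return hits

doc = "Please wire funds. SSN 078-05-1120, phone 555-0102."
print(matched_records(doc))  # {0: 2} -> two cells of record 0 found
```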
22. Data Classification and Identification
Multiple keyword match and RegEx match
These are well-known, well-defined problems,
and very useful in DLP data inspection.
Problem definition for keyword match:
let S = {K1, K2, …, Kn} be a dictionary of keywords.
Given any text T, identify all keyword occurrences in T.
Problem definition for RegEx match:
let S = {P1, P2, …, Pm} be a set of RegEx patterns.
Given any text T, identify all pattern instances in T.
Easy problems?
Not at all: for large n and m, performance becomes an issue.
That is the problem of scalability; scalable algorithms must be provided.
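A minimal sketch of both inspection primitives. Compiling the whole dictionary into one alternation lets a single pass over T report every keyword; at large n and m, production engines instead use scalable algorithms such as Aho-Corasick for keywords and DFA-based multi-pattern matching for RegExes. The keywords and patterns below are illustrative only:

```python
import re

keywords = ["confidential", "internal only", "trade secret"]
# One case-insensitive alternation over the whole dictionary.
keyword_re = re.compile("|".join(re.escape(k) for k in keywords), re.I)

patterns = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}

def inspect(text: str) -> dict:
    """Report all keyword occurrences and RegEx pattern instances in text."""
    hits = {"keywords": keyword_re.findall(text), "patterns": {}}
    for name, pat in patterns.items():
        found = pat.findall(text)
        if found:
            hits["patterns"][name] = found
    return hits

text = "This memo is CONFIDENTIAL. Applicant SSN: 078-05-1120."
print(inspect(text))
# {'keywords': ['CONFIDENTIAL'], 'patterns': {'ssn': ['078-05-1120']}}
```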
23. Data Classification and Identification
How do we evaluate a classification algorithm?
Accuracy, in terms of false positives and false negatives
Performance
Language independence
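The accuracy criterion can be made concrete with the usual confusion-count arithmetic (the counts below are illustrative, not from any real evaluation):

```python
# Toy confusion counts from classifying 200 documents:
# tp = sensitive docs flagged, fp = non-sensitive docs flagged,
# tn = non-sensitive docs passed, fn = sensitive docs missed.
tp, fp, tn, fn = 97, 1, 99, 3

false_positive_rate = fp / (fp + tn)  # fraction of clean docs flagged
false_negative_rate = fn / (fn + tp)  # fraction of leaks missed

print(f"FPR={false_positive_rate:.2%}  FNR={false_negative_rate:.2%}")
# FPR=1.00%  FNR=3.00%
```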
25. Data Classification and Identification
The DLP rule engine works on top of both DLP models and the data template framework.
26. Technical Challenges
Some areas with challenges
Concept Match
Data Discovery
Document Classification Automation
Determined Data Theft Detection
28. Research (1)
Title:
Text Classification for Data Loss Prevention
Authors:
Michael Hart, Pratyusa Manadhata, and Rob Johnson
Institutes:
Computer Science Department, Stony Brook University, and HP Labs
Published:
2011, Hewlett-Packard Development Company, L.P.
29. Research 1 (cont.)
This paper presents automatic text classification algorithms for classifying enterprise documents as either sensitive or non-sensitive.
It also introduces a novel training strategy, supplement and adjust, to create a classifier with a low false discovery (positive) rate, even when presented with documents unrelated to the enterprise.
The classifier was evaluated on several corpora assembled from confidential documents published on WikiLeaks and other archives. It had a false negative rate of less than 3.0% and a false discovery (positive) rate of less than 1.0% on all tests (i.e., in a real deployment, the classifier can identify more than 97% of information leaks while raising at most one false alarm per 100 alerts).
30. Research 1 (cont.)
Target:
create automatic document classification techniques that identify confidential data in a scalable and accurate manner, and make the finer distinction between enterprise public and private documents.
How:
they performed a brute-force search, evaluating multiple machine learning algorithms for text classification performance, including SVMs, Naive Bayes classifiers, and Rocchio classifiers from the WEKA toolkit, to determine the best classifier across all the datasets. They found that a support vector machine with a linear kernel performed best on the test corpora.
They built on a well-studied machine learning technique, Support Vector Machines (SVMs), which scales well to large data sets.
31. Supplement and Adjust
An SVM trained on enterprise documents achieves reasonable performance on enterprise documents, but has an unacceptably high false positive rate on non-enterprise (NE) documents. The poor performance can be explained by weaknesses in the training approach.
To solve this problem, they supplement the classifier with training data from non-enterprise collections such as Wikipedia and Reuters. The supplementary data does not train the classifier to recognize NE documents, but it prevents the classifier from overfitting the enterprise data.
32. Research 1 (cont.)
Adding supplemental training data is likely to introduce a new problem: class imbalance. Supplemental instances bias the classifier towards public documents, because the size of that class overwhelms the size of the secret class. This results in a high false negative rate on secret documents. They therefore adjust the decision boundary towards public instances, which reduces the false negative rate while increasing the false positive rate.
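The supplement-and-adjust recipe can be illustrated end to end. The paper uses a linear SVM; the stdlib sketch below substitutes a tiny Naive Bayes classifier purely to show the training recipe, and every document, class name, and threshold value is invented: supplement the public class with NE text, then shift the decision boundary back toward the public side.

```python
import math
from collections import Counter

def train(docs_by_class):
    """Multinomial Naive Bayes: per-class log-prior and Laplace-smoothed
    per-word log-likelihoods over the shared vocabulary."""
    total = sum(len(d) for d in docs_by_class.values())
    vocab = {w for docs in docs_by_class.values() for d in docs for w in d.split()}
    model = {}
    for cls, docs in docs_by_class.items():
        counts = Counter(w for d in docs for w in d.split())
        denom = sum(counts.values()) + len(vocab)
        model[cls] = (
            math.log(len(docs) / total),
            {w: math.log((counts[w] + 1) / denom) for w in vocab},
        )
    return model

def score(model, cls, doc):
    prior, likes = model[cls]
    # Words outside the vocabulary are simply ignored.
    return prior + sum(likes.get(w, 0.0) for w in doc.split())

secret = ["merger target valuation draft", "merger negotiation draft terms"]
public = ["press release product launch", "product launch event schedule"]
# Supplement: non-enterprise (NE) text joins the public class, so the
# classifier stops overfitting enterprise vocabulary.
supplement = ["wikipedia article history of rome", "reuters market report wheat"]

model = train({"secret": secret, "public": public + supplement})

def classify(doc, adjust=0.0):
    """`adjust` > 0 shifts the boundary toward the public class, trading
    fewer false negatives on secret docs for more false positives."""
    margin = score(model, "secret", doc) - score(model, "public", doc)
    return "secret" if margin + adjust > 0 else "public"

print(classify("draft merger valuation", adjust=1.0))  # secret
```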
34. Research 2 (cont.)
In this project they tackle the problem of classifying a body of text in corporate messages as private or public.
In their comparison of text classifiers, they used Naive Bayes, Logistic Regression, and Support Vector Machine classifiers, and found that SVMs gave the best results.