SlideShare a Scribd company logo
1 of 9
T E S T D ATA
A N O N Y M I Z AT I O N
Prateek Gupta
T R A N S F O R M I N G
R E A L D A T A I N T O
R E A L I S T I C T E S T
D A T A
T E S T D ATA A N O N Y M I Z AT I O N
Test data anonymization is a critical practice in the realm of data privacy
and software testing. It involves the process of transforming sensitive
information in a dataset used for testing purposes to protect individuals'
privacy and adhere to data protection regulations. This ensures that
personal or sensitive data does not get exposed during testing while still
allowing organizations to effectively evaluate the functionality, performance,
and security of their software or systems.
2
T H E N E E D F O R T E S T D ATA
A N O N Y M I Z AT I O N
Today's digital age collects
vast amounts of data for
various purposes,
including software
development and testing
This data often
contains personally
identifiable information
(PII) and other sensitive
information.
Data protection regulations
like GDPR and HIPAA
require organizations to
protect this data.
Test data
anonymization is
necessary to fulfill
these obligations and
protect individuals'
privacy.
C O M M O N T E C H N I Q U E S F O R T E S T D ATA
A N O N Y M I Z AT I O N
Tokenization: Replace
sensitive data with
tokens that require
access to a secure
database.
Synthetic Data Generation:
Generate fictional data
mirroring the
characteristics of the
original data
Data Masking: Replace
sensitive data with fake or
pseudonymous data.
Data Encryption: Convert
sensitive data into a
scrambled format
Data Subset Selection: Use
a subset of non-sensitive
data for testing.
10/19/2023
B E N E F I T S O F T E S T D ATA
A N O N Y M I Z AT I O N
Risk Mitigation:
Minimizes the risk of
exposing sensitive data
during testing
Effective Testing:
Allows thorough
testing without
compromising data
privacy.
Privacy Protection:
Safeguards individuals'
sensitive information.
Privacy Protection:
Safeguards individuals'
sensitive information.
S O L U T I O N F O R D ATA M A S K I N G
A Python script was created to mask data in acceptable
form. The script takes an input CSV file with column
headers and prompts the user to choose the output file
type from CSV, XML, EXCEL, JSON, or SQL. Based on
the user's chosen data type for each column, the script
generates mock data and writes the output to the
selected file type in the output folder. The output file is
named as <”input_CSV__file_name" + "mock_data" +
{timestap}>.
U T I L I T Y A R C H I T E C T U R E
T E C H N O L O G Y U S E D A N D
A D VA N TA G E S
 Faker Library: A Python library for generating
fake data with various customizable data
types.
 YAML configuration file: A human-readable
data serialization format used to specify the
script's input and output file locations.
 pandas: A Python library used for data
manipulation and analysis.
 Element Tree: A Python library for working
with XML documents, which are a popular
format for storing structured data.
The solution can be used across various
environments such as for Load Testing , Performance
Testing , User-acceptance Testing, Pre-production and
Production.
And it offers the capability to generate output data in
multiple formats like CSV, XML, Excel, JSON, and SQL
T H A N K Y O U F O R D I V I N G I N T H E T E S T
D ATA A N O N Y M I Z AT I O N . .
P R A T E E K . G U P T A @ T H E P S I . C O
M
P R E S E N T E D B Y:

More Related Content

Similar to Automation for test data anonymization

Splunk for cyber_threat
Splunk for cyber_threatSplunk for cyber_threat
Splunk for cyber_threatGreg Hanchin
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationDenodo
 
STEALTHbits Sensitive Data Discovery Solutions
STEALTHbits Sensitive Data Discovery SolutionsSTEALTHbits Sensitive Data Discovery Solutions
STEALTHbits Sensitive Data Discovery SolutionsSTEALTHbits Technologies
 
Hortonworks Data Platform and IBM Systems - A Complete Solution for Cognitive...
Hortonworks Data Platform and IBM Systems - A Complete Solution for Cognitive...Hortonworks Data Platform and IBM Systems - A Complete Solution for Cognitive...
Hortonworks Data Platform and IBM Systems - A Complete Solution for Cognitive...DataWorks Summit/Hadoop Summit
 
Privacy-Preserving Data Analysis, Adria Gascon
Privacy-Preserving Data Analysis, Adria GasconPrivacy-Preserving Data Analysis, Adria Gascon
Privacy-Preserving Data Analysis, Adria GasconUlrik Lyngs
 
8 Guiding Principles to Kickstart Your Healthcare Big Data Project
8 Guiding Principles to Kickstart Your Healthcare Big Data Project8 Guiding Principles to Kickstart Your Healthcare Big Data Project
8 Guiding Principles to Kickstart Your Healthcare Big Data ProjectCitiusTech
 
Next generation data protection and security for oracle users - the block cha...
Next generation data protection and security for oracle users - the block cha...Next generation data protection and security for oracle users - the block cha...
Next generation data protection and security for oracle users - the block cha...Ulf Mattsson
 
INTRODUCTION TO DATA SCIENCE -CONCEPTS.pptx
INTRODUCTION TO DATA SCIENCE -CONCEPTS.pptxINTRODUCTION TO DATA SCIENCE -CONCEPTS.pptx
INTRODUCTION TO DATA SCIENCE -CONCEPTS.pptxMadhumitha N
 
An Overview of Python for Data Analytics
An Overview of Python for Data AnalyticsAn Overview of Python for Data Analytics
An Overview of Python for Data AnalyticsIRJET Journal
 
Achieving Privacy in Publishing Search logs
Achieving Privacy in Publishing Search logsAchieving Privacy in Publishing Search logs
Achieving Privacy in Publishing Search logsIOSR Journals
 
Computer Forensics
Computer ForensicsComputer Forensics
Computer ForensicsDaksh Verma
 
Secure Phrase Search for Intelligent Processing ofEncrypted Data in Cloud-Bas...
Secure Phrase Search for Intelligent Processing ofEncrypted Data in Cloud-Bas...Secure Phrase Search for Intelligent Processing ofEncrypted Data in Cloud-Bas...
Secure Phrase Search for Intelligent Processing ofEncrypted Data in Cloud-Bas...JAYAPRAKASH JPINFOTECH
 
Webinar: Enable Insight Driven Data Risk Assessments with AI
Webinar: Enable Insight Driven Data Risk Assessments with AIWebinar: Enable Insight Driven Data Risk Assessments with AI
Webinar: Enable Insight Driven Data Risk Assessments with AILucidworks
 
Data analytics using R programming
Data analytics using R programmingData analytics using R programming
Data analytics using R programmingUmang Singh
 
IRJET - Virtual Data Auditing at Overcast Environment
IRJET - Virtual Data Auditing at Overcast EnvironmentIRJET - Virtual Data Auditing at Overcast Environment
IRJET - Virtual Data Auditing at Overcast EnvironmentIRJET Journal
 
A Study on Big Data Privacy Protection Models using Data Masking Methods
A Study on Big Data Privacy Protection Models using Data Masking Methods A Study on Big Data Privacy Protection Models using Data Masking Methods
A Study on Big Data Privacy Protection Models using Data Masking Methods IJECEIAES
 
GDPR READY SOLUTION FOR UNSTRUCTURED DATA
GDPR READY SOLUTION FOR UNSTRUCTURED DATAGDPR READY SOLUTION FOR UNSTRUCTURED DATA
GDPR READY SOLUTION FOR UNSTRUCTURED DATAXeniT Solutions nv
 
PyData Sri Lanka 2023 Presentation - Nuzhi Meyen-V2.pptx
PyData Sri Lanka 2023 Presentation - Nuzhi Meyen-V2.pptxPyData Sri Lanka 2023 Presentation - Nuzhi Meyen-V2.pptx
PyData Sri Lanka 2023 Presentation - Nuzhi Meyen-V2.pptxnmeyen
 

Similar to Automation for test data anonymization (20)

Splunk for cyber_threat
Splunk for cyber_threatSplunk for cyber_threat
Splunk for cyber_threat
 
Pan Dhoni - Modernizing Data And Analytics using AI.pdf
Pan Dhoni - Modernizing Data And Analytics using AI.pdfPan Dhoni - Modernizing Data And Analytics using AI.pdf
Pan Dhoni - Modernizing Data And Analytics using AI.pdf
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data Virtualization
 
STEALTHbits Sensitive Data Discovery Solutions
STEALTHbits Sensitive Data Discovery SolutionsSTEALTHbits Sensitive Data Discovery Solutions
STEALTHbits Sensitive Data Discovery Solutions
 
Hortonworks Data Platform and IBM Systems - A Complete Solution for Cognitive...
Hortonworks Data Platform and IBM Systems - A Complete Solution for Cognitive...Hortonworks Data Platform and IBM Systems - A Complete Solution for Cognitive...
Hortonworks Data Platform and IBM Systems - A Complete Solution for Cognitive...
 
Privacy-Preserving Data Analysis, Adria Gascon
Privacy-Preserving Data Analysis, Adria GasconPrivacy-Preserving Data Analysis, Adria Gascon
Privacy-Preserving Data Analysis, Adria Gascon
 
8 Guiding Principles to Kickstart Your Healthcare Big Data Project
8 Guiding Principles to Kickstart Your Healthcare Big Data Project8 Guiding Principles to Kickstart Your Healthcare Big Data Project
8 Guiding Principles to Kickstart Your Healthcare Big Data Project
 
Securitarian
SecuritarianSecuritarian
Securitarian
 
Next generation data protection and security for oracle users - the block cha...
Next generation data protection and security for oracle users - the block cha...Next generation data protection and security for oracle users - the block cha...
Next generation data protection and security for oracle users - the block cha...
 
INTRODUCTION TO DATA SCIENCE -CONCEPTS.pptx
INTRODUCTION TO DATA SCIENCE -CONCEPTS.pptxINTRODUCTION TO DATA SCIENCE -CONCEPTS.pptx
INTRODUCTION TO DATA SCIENCE -CONCEPTS.pptx
 
An Overview of Python for Data Analytics
An Overview of Python for Data AnalyticsAn Overview of Python for Data Analytics
An Overview of Python for Data Analytics
 
Achieving Privacy in Publishing Search logs
Achieving Privacy in Publishing Search logsAchieving Privacy in Publishing Search logs
Achieving Privacy in Publishing Search logs
 
Computer Forensics
Computer ForensicsComputer Forensics
Computer Forensics
 
Secure Phrase Search for Intelligent Processing ofEncrypted Data in Cloud-Bas...
Secure Phrase Search for Intelligent Processing ofEncrypted Data in Cloud-Bas...Secure Phrase Search for Intelligent Processing ofEncrypted Data in Cloud-Bas...
Secure Phrase Search for Intelligent Processing ofEncrypted Data in Cloud-Bas...
 
Webinar: Enable Insight Driven Data Risk Assessments with AI
Webinar: Enable Insight Driven Data Risk Assessments with AIWebinar: Enable Insight Driven Data Risk Assessments with AI
Webinar: Enable Insight Driven Data Risk Assessments with AI
 
Data analytics using R programming
Data analytics using R programmingData analytics using R programming
Data analytics using R programming
 
IRJET - Virtual Data Auditing at Overcast Environment
IRJET - Virtual Data Auditing at Overcast EnvironmentIRJET - Virtual Data Auditing at Overcast Environment
IRJET - Virtual Data Auditing at Overcast Environment
 
A Study on Big Data Privacy Protection Models using Data Masking Methods
A Study on Big Data Privacy Protection Models using Data Masking Methods A Study on Big Data Privacy Protection Models using Data Masking Methods
A Study on Big Data Privacy Protection Models using Data Masking Methods
 
GDPR READY SOLUTION FOR UNSTRUCTURED DATA
GDPR READY SOLUTION FOR UNSTRUCTURED DATAGDPR READY SOLUTION FOR UNSTRUCTURED DATA
GDPR READY SOLUTION FOR UNSTRUCTURED DATA
 
PyData Sri Lanka 2023 Presentation - Nuzhi Meyen-V2.pptx
PyData Sri Lanka 2023 Presentation - Nuzhi Meyen-V2.pptxPyData Sri Lanka 2023 Presentation - Nuzhi Meyen-V2.pptx
PyData Sri Lanka 2023 Presentation - Nuzhi Meyen-V2.pptx
 

More from Agile Testing Alliance

#Interactive Session by Anindita Rath and Mahathee Dandibhotla, "From Good to...
#Interactive Session by Anindita Rath and Mahathee Dandibhotla, "From Good to...#Interactive Session by Anindita Rath and Mahathee Dandibhotla, "From Good to...
#Interactive Session by Anindita Rath and Mahathee Dandibhotla, "From Good to...Agile Testing Alliance
 
#Interactive Session by Ajay Balamurugadas, "Where Are The Real Testers In T...
#Interactive Session by  Ajay Balamurugadas, "Where Are The Real Testers In T...#Interactive Session by  Ajay Balamurugadas, "Where Are The Real Testers In T...
#Interactive Session by Ajay Balamurugadas, "Where Are The Real Testers In T...Agile Testing Alliance
 
#Interactive Session by Jishnu Nambiar and Mayur Ovhal, "Monitoring Web Per...
#Interactive Session by  Jishnu Nambiar and  Mayur Ovhal, "Monitoring Web Per...#Interactive Session by  Jishnu Nambiar and  Mayur Ovhal, "Monitoring Web Per...
#Interactive Session by Jishnu Nambiar and Mayur Ovhal, "Monitoring Web Per...Agile Testing Alliance
 
#Interactive Session by Pradipta Biswas and Sucheta Saurabh Chitale, "Navigat...
#Interactive Session by Pradipta Biswas and Sucheta Saurabh Chitale, "Navigat...#Interactive Session by Pradipta Biswas and Sucheta Saurabh Chitale, "Navigat...
#Interactive Session by Pradipta Biswas and Sucheta Saurabh Chitale, "Navigat...Agile Testing Alliance
 
#Interactive Session by Apoorva Ram, "The Art of Storytelling for Testers" at...
#Interactive Session by Apoorva Ram, "The Art of Storytelling for Testers" at...#Interactive Session by Apoorva Ram, "The Art of Storytelling for Testers" at...
#Interactive Session by Apoorva Ram, "The Art of Storytelling for Testers" at...Agile Testing Alliance
 
#Interactive Session by Nikhil Jain, "Catch All Mail With Graph" at #ATAGTR2023.
#Interactive Session by Nikhil Jain, "Catch All Mail With Graph" at #ATAGTR2023.#Interactive Session by Nikhil Jain, "Catch All Mail With Graph" at #ATAGTR2023.
#Interactive Session by Nikhil Jain, "Catch All Mail With Graph" at #ATAGTR2023.Agile Testing Alliance
 
#Interactive Session by Ashok Kumar S, "Test Data the key to robust test cove...
#Interactive Session by Ashok Kumar S, "Test Data the key to robust test cove...#Interactive Session by Ashok Kumar S, "Test Data the key to robust test cove...
#Interactive Session by Ashok Kumar S, "Test Data the key to robust test cove...Agile Testing Alliance
 
#Interactive Session by Seema Kohli, "Test Leadership in the Era of Artificia...
#Interactive Session by Seema Kohli, "Test Leadership in the Era of Artificia...#Interactive Session by Seema Kohli, "Test Leadership in the Era of Artificia...
#Interactive Session by Seema Kohli, "Test Leadership in the Era of Artificia...Agile Testing Alliance
 
#Interactive Session by Ashwini Lalit, RRR of Test Automation Maintenance" at...
#Interactive Session by Ashwini Lalit, RRR of Test Automation Maintenance" at...#Interactive Session by Ashwini Lalit, RRR of Test Automation Maintenance" at...
#Interactive Session by Ashwini Lalit, RRR of Test Automation Maintenance" at...Agile Testing Alliance
 
#Interactive Session by Srithanga Aishvarya T, "Machine Learning Model to aut...
#Interactive Session by Srithanga Aishvarya T, "Machine Learning Model to aut...#Interactive Session by Srithanga Aishvarya T, "Machine Learning Model to aut...
#Interactive Session by Srithanga Aishvarya T, "Machine Learning Model to aut...Agile Testing Alliance
 
#Interactive Session by Kirti Ranjan Satapathy and Nandini K, "Elements of Qu...
#Interactive Session by Kirti Ranjan Satapathy and Nandini K, "Elements of Qu...#Interactive Session by Kirti Ranjan Satapathy and Nandini K, "Elements of Qu...
#Interactive Session by Kirti Ranjan Satapathy and Nandini K, "Elements of Qu...Agile Testing Alliance
 
#Interactive Session by Sudhir Upadhyay and Ashish Kumar, "Strengthening Test...
#Interactive Session by Sudhir Upadhyay and Ashish Kumar, "Strengthening Test...#Interactive Session by Sudhir Upadhyay and Ashish Kumar, "Strengthening Test...
#Interactive Session by Sudhir Upadhyay and Ashish Kumar, "Strengthening Test...Agile Testing Alliance
 
#Interactive Session by Sayan Deb Kundu, "Testing Gen AI Applications" at #AT...
#Interactive Session by Sayan Deb Kundu, "Testing Gen AI Applications" at #AT...#Interactive Session by Sayan Deb Kundu, "Testing Gen AI Applications" at #AT...
#Interactive Session by Sayan Deb Kundu, "Testing Gen AI Applications" at #AT...Agile Testing Alliance
 
#Interactive Session by Dinesh Boravke, "Zero Defects – Myth or Reality" at #...
#Interactive Session by Dinesh Boravke, "Zero Defects – Myth or Reality" at #...#Interactive Session by Dinesh Boravke, "Zero Defects – Myth or Reality" at #...
#Interactive Session by Dinesh Boravke, "Zero Defects – Myth or Reality" at #...Agile Testing Alliance
 
#Interactive Session by Saby Saurabh Bhardwaj, "Redefine Quality Assurance –...
#Interactive Session by  Saby Saurabh Bhardwaj, "Redefine Quality Assurance –...#Interactive Session by  Saby Saurabh Bhardwaj, "Redefine Quality Assurance –...
#Interactive Session by Saby Saurabh Bhardwaj, "Redefine Quality Assurance –...Agile Testing Alliance
 
#Keynote Session by Sanjay Kumar, "Innovation Inspired Testing!!" at #ATAGTR2...
#Keynote Session by Sanjay Kumar, "Innovation Inspired Testing!!" at #ATAGTR2...#Keynote Session by Sanjay Kumar, "Innovation Inspired Testing!!" at #ATAGTR2...
#Keynote Session by Sanjay Kumar, "Innovation Inspired Testing!!" at #ATAGTR2...Agile Testing Alliance
 
#Keynote Session by Schalk Cronje, "Don’t Containerize me" at #ATAGTR2023.
#Keynote Session by Schalk Cronje, "Don’t Containerize me" at #ATAGTR2023.#Keynote Session by Schalk Cronje, "Don’t Containerize me" at #ATAGTR2023.
#Keynote Session by Schalk Cronje, "Don’t Containerize me" at #ATAGTR2023.Agile Testing Alliance
 
#Interactive Session by Chidambaram Vetrivel and Venkatesh Belde, "Revolution...
#Interactive Session by Chidambaram Vetrivel and Venkatesh Belde, "Revolution...#Interactive Session by Chidambaram Vetrivel and Venkatesh Belde, "Revolution...
#Interactive Session by Chidambaram Vetrivel and Venkatesh Belde, "Revolution...Agile Testing Alliance
 
#Interactive Session by Aniket Diwakar Kadukar and Padimiti Vaidik Eswar Dat...
#Interactive Session by Aniket Diwakar Kadukar and  Padimiti Vaidik Eswar Dat...#Interactive Session by Aniket Diwakar Kadukar and  Padimiti Vaidik Eswar Dat...
#Interactive Session by Aniket Diwakar Kadukar and Padimiti Vaidik Eswar Dat...Agile Testing Alliance
 
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...Agile Testing Alliance
 

More from Agile Testing Alliance (20)

#Interactive Session by Anindita Rath and Mahathee Dandibhotla, "From Good to...
#Interactive Session by Anindita Rath and Mahathee Dandibhotla, "From Good to...#Interactive Session by Anindita Rath and Mahathee Dandibhotla, "From Good to...
#Interactive Session by Anindita Rath and Mahathee Dandibhotla, "From Good to...
 
#Interactive Session by Ajay Balamurugadas, "Where Are The Real Testers In T...
#Interactive Session by  Ajay Balamurugadas, "Where Are The Real Testers In T...#Interactive Session by  Ajay Balamurugadas, "Where Are The Real Testers In T...
#Interactive Session by Ajay Balamurugadas, "Where Are The Real Testers In T...
 
#Interactive Session by Jishnu Nambiar and Mayur Ovhal, "Monitoring Web Per...
#Interactive Session by  Jishnu Nambiar and  Mayur Ovhal, "Monitoring Web Per...#Interactive Session by  Jishnu Nambiar and  Mayur Ovhal, "Monitoring Web Per...
#Interactive Session by Jishnu Nambiar and Mayur Ovhal, "Monitoring Web Per...
 
#Interactive Session by Pradipta Biswas and Sucheta Saurabh Chitale, "Navigat...
#Interactive Session by Pradipta Biswas and Sucheta Saurabh Chitale, "Navigat...#Interactive Session by Pradipta Biswas and Sucheta Saurabh Chitale, "Navigat...
#Interactive Session by Pradipta Biswas and Sucheta Saurabh Chitale, "Navigat...
 
#Interactive Session by Apoorva Ram, "The Art of Storytelling for Testers" at...
#Interactive Session by Apoorva Ram, "The Art of Storytelling for Testers" at...#Interactive Session by Apoorva Ram, "The Art of Storytelling for Testers" at...
#Interactive Session by Apoorva Ram, "The Art of Storytelling for Testers" at...
 
#Interactive Session by Nikhil Jain, "Catch All Mail With Graph" at #ATAGTR2023.
#Interactive Session by Nikhil Jain, "Catch All Mail With Graph" at #ATAGTR2023.#Interactive Session by Nikhil Jain, "Catch All Mail With Graph" at #ATAGTR2023.
#Interactive Session by Nikhil Jain, "Catch All Mail With Graph" at #ATAGTR2023.
 
#Interactive Session by Ashok Kumar S, "Test Data the key to robust test cove...
#Interactive Session by Ashok Kumar S, "Test Data the key to robust test cove...#Interactive Session by Ashok Kumar S, "Test Data the key to robust test cove...
#Interactive Session by Ashok Kumar S, "Test Data the key to robust test cove...
 
#Interactive Session by Seema Kohli, "Test Leadership in the Era of Artificia...
#Interactive Session by Seema Kohli, "Test Leadership in the Era of Artificia...#Interactive Session by Seema Kohli, "Test Leadership in the Era of Artificia...
#Interactive Session by Seema Kohli, "Test Leadership in the Era of Artificia...
 
#Interactive Session by Ashwini Lalit, RRR of Test Automation Maintenance" at...
#Interactive Session by Ashwini Lalit, RRR of Test Automation Maintenance" at...#Interactive Session by Ashwini Lalit, RRR of Test Automation Maintenance" at...
#Interactive Session by Ashwini Lalit, RRR of Test Automation Maintenance" at...
 
#Interactive Session by Srithanga Aishvarya T, "Machine Learning Model to aut...
#Interactive Session by Srithanga Aishvarya T, "Machine Learning Model to aut...#Interactive Session by Srithanga Aishvarya T, "Machine Learning Model to aut...
#Interactive Session by Srithanga Aishvarya T, "Machine Learning Model to aut...
 
#Interactive Session by Kirti Ranjan Satapathy and Nandini K, "Elements of Qu...
#Interactive Session by Kirti Ranjan Satapathy and Nandini K, "Elements of Qu...#Interactive Session by Kirti Ranjan Satapathy and Nandini K, "Elements of Qu...
#Interactive Session by Kirti Ranjan Satapathy and Nandini K, "Elements of Qu...
 
#Interactive Session by Sudhir Upadhyay and Ashish Kumar, "Strengthening Test...
#Interactive Session by Sudhir Upadhyay and Ashish Kumar, "Strengthening Test...#Interactive Session by Sudhir Upadhyay and Ashish Kumar, "Strengthening Test...
#Interactive Session by Sudhir Upadhyay and Ashish Kumar, "Strengthening Test...
 
#Interactive Session by Sayan Deb Kundu, "Testing Gen AI Applications" at #AT...
#Interactive Session by Sayan Deb Kundu, "Testing Gen AI Applications" at #AT...#Interactive Session by Sayan Deb Kundu, "Testing Gen AI Applications" at #AT...
#Interactive Session by Sayan Deb Kundu, "Testing Gen AI Applications" at #AT...
 
#Interactive Session by Dinesh Boravke, "Zero Defects – Myth or Reality" at #...
#Interactive Session by Dinesh Boravke, "Zero Defects – Myth or Reality" at #...#Interactive Session by Dinesh Boravke, "Zero Defects – Myth or Reality" at #...
#Interactive Session by Dinesh Boravke, "Zero Defects – Myth or Reality" at #...
 
#Interactive Session by Saby Saurabh Bhardwaj, "Redefine Quality Assurance –...
#Interactive Session by  Saby Saurabh Bhardwaj, "Redefine Quality Assurance –...#Interactive Session by  Saby Saurabh Bhardwaj, "Redefine Quality Assurance –...
#Interactive Session by Saby Saurabh Bhardwaj, "Redefine Quality Assurance –...
 
#Keynote Session by Sanjay Kumar, "Innovation Inspired Testing!!" at #ATAGTR2...
#Keynote Session by Sanjay Kumar, "Innovation Inspired Testing!!" at #ATAGTR2...#Keynote Session by Sanjay Kumar, "Innovation Inspired Testing!!" at #ATAGTR2...
#Keynote Session by Sanjay Kumar, "Innovation Inspired Testing!!" at #ATAGTR2...
 
#Keynote Session by Schalk Cronje, "Don’t Containerize me" at #ATAGTR2023.
#Keynote Session by Schalk Cronje, "Don’t Containerize me" at #ATAGTR2023.#Keynote Session by Schalk Cronje, "Don’t Containerize me" at #ATAGTR2023.
#Keynote Session by Schalk Cronje, "Don’t Containerize me" at #ATAGTR2023.
 
#Interactive Session by Chidambaram Vetrivel and Venkatesh Belde, "Revolution...
#Interactive Session by Chidambaram Vetrivel and Venkatesh Belde, "Revolution...#Interactive Session by Chidambaram Vetrivel and Venkatesh Belde, "Revolution...
#Interactive Session by Chidambaram Vetrivel and Venkatesh Belde, "Revolution...
 
#Interactive Session by Aniket Diwakar Kadukar and Padimiti Vaidik Eswar Dat...
#Interactive Session by Aniket Diwakar Kadukar and  Padimiti Vaidik Eswar Dat...#Interactive Session by Aniket Diwakar Kadukar and  Padimiti Vaidik Eswar Dat...
#Interactive Session by Aniket Diwakar Kadukar and Padimiti Vaidik Eswar Dat...
 
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...
 

Recently uploaded

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 

Recently uploaded (20)

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 

Automation for test data anonymization

  • 1. T E S T D ATA A N O N Y M I Z AT I O N Prateek Gupta T R A N S F O R M I N G R E A L D A T A I N T O R E A L I S T I C T E S T D A T A
  • 2. T E S T D ATA A N O N Y M I Z AT I O N Test data anonymization is a critical practice in the realm of data privacy and software testing. It involves the process of transforming sensitive information in a dataset used for testing purposes to protect individuals' privacy and adhere to data protection regulations. This ensures that personal or sensitive data does not get exposed during testing while still allowing organizations to effectively evaluate the functionality, performance, and security of their software or systems. 2
  • 3. T H E N E E D F O R T E S T D ATA A N O N Y M I Z AT I O N Today's digital age collects vast amounts of data for various purposes, including software development and testing This data often contains personally identifiable information (PII) and other sensitive information. Data protection regulations like GDPR and HIPAA require organizations to protect this data. Test data anonymization is necessary to fulfill these obligations and protect individuals' privacy.
  • 4. C O M M O N T E C H N I Q U E S F O R T E S T D ATA A N O N Y M I Z AT I O N Tokenization: Replace sensitive data with tokens that require access to a secure database. Synthetic Data Generation: Generate fictional data mirroring the characteristics of the original data Data Masking: Replace sensitive data with fake or pseudonymous data. Data Encryption: Convert sensitive data into a scrambled format Data Subset Selection: Use a subset of non-sensitive data for testing.
  • 5. 10/19/2023 B E N E F I T S O F T E S T D ATA A N O N Y M I Z AT I O N Risk Mitigation: Minimizes the risk of exposing sensitive data during testing Effective Testing: Allows thorough testing without compromising data privacy. Privacy Protection: Safeguards individuals' sensitive information. Privacy Protection: Safeguards individuals' sensitive information.
  • 6. S O L U T I O N F O R D ATA M A S K I N G A Python script was created to mask data in acceptable form. The script takes an input CSV file with column headers and prompts the user to choose the output file type from CSV, XML, EXCEL, JSON, or SQL. Based on the user's chosen data type for each column, the script generates mock data and writes the output to the selected file type in the output folder. The output file is named as <”input_CSV__file_name" + "mock_data" + {timestap}>.
  • 7. U T I L I T Y A R C H I T E C T U R E
  • 8. T E C H N O L O G Y U S E D A N D A D VA N TA G E S  Faker Library: A Python library for generating fake data with various customizable data types.  YAML configuration file: A human-readable data serialization format used to specify the script's input and output file locations.  pandas: A Python library used for data manipulation and analysis.  Element Tree: A Python library for working with XML documents, which are a popular format for storing structured data. The solution can be used across various environments such as for Load Testing , Performance Testing , User-acceptance Testing, Pre-production and Production. And it offers the capability to generate output data in multiple formats like CSV, XML, Excel, JSON, and SQL
  • 9. T H A N K Y O U F O R D I V I N G I N T H E T E S T D ATA A N O N Y M I Z AT I O N . . P R A T E E K . G U P T A @ T H E P S I . C O M P R E S E N T E D B Y: