A human-in-the-loop, analytics-driven approach for anonymization & redaction of clinical data submissions
Ganes Kesari
August 23, 2022
Ganes Kesari
Co-founder & Chief Decision Scientist
“Simplify Data Science for all”
100+ Clients
Solve business problems using insights
and stories on a low-code platform
@kesaritweets
/gkesari
Anonymizing Clinical Study Reports (CSRs) with the right balance of privacy and transparency is challenging
• New regulatory requirements such as EMA Policy 0070 and Health Canada PRCI (Public Release of Clinical Information)
• These set specific data privacy guidelines for all reports published in the public domain
• The regulations also call for a level of data transparency
• They prescribe sufficient data granularity to help the scientific community
Balance: Privacy vs. Transparency
Anonymization of CSRs is a three-fold problem
Human Errors
• Time-consuming and cumbersome process with many complex steps
• Error-prone and requires multiple manual reviews
Unstructured Content
• No plug-and-play, off-the-shelf Named Entity Recognition (NER) models
• Requires pharma domain-specific entities
Regulatory Constraints
• Complex and rapidly evolving regulations
• High quality bar with stringent re-identification risk thresholds
1. Regulatory Constraints: Increasing regulations spike compliance costs and the likely penalties for breaches
• Typical annual spend of $2-3 Mn on external vendor costs
• Typical cost of a healthcare breach was $9.2 Mn per incident
• 84% increase in healthcare data breaches, impacting 45 million people
• Tech advances and public data spike re-identification risks
Source: IBM Report, Cost of a Data Breach
2. Human Errors: Clinical teams go through a long and cumbersome process for CSR anonymization
• Time-consuming processes with cycle times of up to 45 days for each summary document
• 25+ complex steps in achieving anonymization using different clinical trial management systems
• Higher potential for error with data flowing across multiple internal systems, databases and emails for reviews and approvals
3. Unstructured Content: Advanced analytics and Natural Language Processing (NLP) are needed to extract PII entities with high accuracy
Anonymization
Data anonymization is the process of transforming information by removing or encrypting sensitive data (PII or PHI) in order to protect data subjects’ privacy and confidentiality
Anonymization Techniques:
• Character masking
• Pseudonymisation
• Generalization
• Swapping
• Data perturbation, etc.
NLP
The process of transforming and understanding human language to identify meaningful patterns and new insights
NLP Techniques:
• Information retrieval
• Natural language processing
• Information extraction (NER: Named Entity Recognition)
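Three of the anonymization techniques listed above can be sketched in a few lines of Python. This is a minimal illustration, not the solution described in the deck: the function names, the `SUBJ-` prefix, the secret key and the 10-year age band are all assumptions made for the example.

```python
import hashlib
import hmac


def mask_characters(value: str, keep_last: int = 2, mask_char: str = "X") -> str:
    """Character masking: hide all but the last `keep_last` characters."""
    if keep_last <= 0:
        return mask_char * len(value)
    hidden = max(len(value) - keep_last, 0)
    return mask_char * hidden + value[hidden:]


def pseudonymise(subject_id: str, secret_key: bytes, length: int = 10) -> str:
    """Pseudonymisation: replace an identifier with a keyed, repeatable token.

    The same subject_id and key always give the same token, so records for
    one subject stay linkable without exposing the original identifier.
    """
    digest = hmac.new(secret_key, subject_id.encode("utf-8"), hashlib.sha256)
    return "SUBJ-" + digest.hexdigest()[:length]


def generalise_age(age: int, band_width: int = 10) -> str:
    """Generalization: report an age band instead of the exact age."""
    lower = (age // band_width) * band_width
    return f"{lower}-{lower + band_width - 1}"


if __name__ == "__main__":
    key = b"hypothetical-secret-key"  # assumption: managed outside the code
    print(mask_characters("1001-0042"))
    print(pseudonymise("1001-0042", key))
    print(generalise_age(47))
```

Masking and generalization lose information by design, while keyed pseudonymisation keeps records linkable for anyone holding the key, so the key would need the same protection as the original identifiers.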
The Solution: A measured approach which balances human validation and judgement with analytics and automation
• User-centered solution design
• Collaborative workflows with user feedback
• Leveraged open-source tech
• Custom algorithm training for better domain understanding
• Domain experts helped tailor algorithms for unstructured data
• Strong and scalable solution capabilities based on past experience
• Regulatory and research community helped clarify the required quality thresholds
• Iterative optimization until the desired EMA and Health Canada controls were met
Human-in-the-Loop | Advanced Analytics | Regulatory Compliance | Unstructured Data Transformation
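The idea of iterating until a risk control is met can be sketched as a loop that coarsens a quasi-identifier until the worst-case re-identification risk falls under a threshold. This is a simplified sketch under stated assumptions, not the deck's algorithm: risk is taken as 1 divided by the size of the smallest group of records sharing the same (age band, sex) values, the 0.09 threshold is an illustrative value, and the record fields and function names are hypothetical. It also ignores the reference population, which a real assessment would take into account.

```python
from collections import Counter

RISK_THRESHOLD = 0.09  # assumption: illustrative value, check the applicable guidance


def age_band(age: int, width: int) -> str:
    """Generalize an exact age into a band of the given width."""
    if width <= 1:
        return str(age)
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def max_risk(records: list, width: int) -> float:
    """Worst-case risk: 1 / size of the smallest (age band, sex) group."""
    if not records:
        raise ValueError("records must not be empty")
    groups = Counter((age_band(r["age"], width), r["sex"]) for r in records)
    return 1.0 / min(groups.values())


def choose_band_width(records, widths=(1, 5, 10, 20, 50), threshold=RISK_THRESHOLD):
    """Try progressively coarser age bands; stop at the first that meets the threshold.

    Returns (width, risk), or (None, last_risk) if no width is coarse enough.
    """
    risk = 1.0
    for width in widths:
        risk = max_risk(records, width)
        if risk <= threshold:
            return width, risk
    return None, risk


if __name__ == "__main__":
    # Synthetic data: 3 subjects per age (40-59) and sex.
    demo = [{"age": a, "sex": s} for s in ("F", "M") for a in range(40, 60) for _ in range(3)]
    print(choose_band_width(demo))
```

Stopping at the first width that passes keeps as much granularity as the threshold allows, which is the privacy versus transparency trade-off the deck describes.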
The Anonymization Solution handles structured and unstructured data with iterative risk scoring to ensure compliance
Inputs:
• CSR documents
• Reference population (data on similar trials)
• User input
Steps (in slide order):
• Parsing CSR docs
• Entity recognition
• Sampling for users to validate
• Recall calculation
• Structured data transformation
• Iterative risk scoring and optimization algorithm
Output:
• Final risk-adjusted CSR document
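The sampling and recall steps can be illustrated as follows. This is a hedged sketch rather than the deck's method: it assumes reviewers check a random sample of pages and count the entities the model found (true positives) and the ones it missed (false negatives), and it adds a Wilson score interval to show how uncertain a recall estimate from a sample is. The function names and counts are hypothetical.

```python
import math
import random


def sample_pages_for_review(page_ids, sample_size, seed=0):
    """Draw a reproducible random sample of pages for human validation."""
    rng = random.Random(seed)
    pages = list(page_ids)
    return rng.sample(pages, min(sample_size, len(pages)))


def recall_with_interval(true_positives, false_negatives, z=1.96):
    """Recall = TP / (TP + FN), with a Wilson score interval (z=1.96 is about 95%)."""
    n = true_positives + false_negatives
    if n == 0:
        raise ValueError("no reviewed entities to compute recall from")
    p = true_positives / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    pages = sample_pages_for_review(range(1, 501), sample_size=25)
    print(sorted(pages))
    # Hypothetical review outcome: 95 entities detected, 5 missed.
    recall, low, high = recall_with_interval(95, 5)
    print(f"recall={recall:.3f}, interval=({low:.3f}, {high:.3f})")
```

If the lower bound of the interval sits below the required quality threshold, a larger sample or further model tuning would be needed before the document moves to the next step.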
What did we learn from implementing such solutions for clients?
• Be prepared to tackle a variety of input data sources in terms of document structure, style, and entities
• Typical CSRs contain 100+ tables and figures, which need to be treated as independent problems
• There is a paucity of research on the risk of re-identification and patient privacy in the pharma clinical space
Where are we headed? Solutions must be geared for more attacks, tightening regulations, and regional variations
• Data breaches have become easier
• Regulations worldwide are evolving and norms are being tightened
• Region-specific variations of regulations are emerging
Please share your session feedback!
@kesaritweets
/gkesari
Ganes Kesari
www.gramener.com
Thank You!

How AI Can Help Anonymize Clinical Trial Data

  • 1. A human-in-loop analytics driven approach for anonymization & redaction of clinical data submissions. Ganes Kesari, August 23, '22
  • 2. Ganes Kesari, Co-founder & Chief Decision Scientist. “Simplify Data Science for all”. 100+ clients. Solve business problems using insights and stories on a low-code platform. @kesaritweets, /gkesari
  • 3. Anonymizing Clinical Study Reports (CSR) with the right balance of Privacy and Transparency is challenging. Privacy: new regulatory requirements such as EMA 0070 and Health Canada PRCI; specific data privacy guidelines for all reports published in the public domain. Transparency: these regulations also call for a level of data transparency; they prescribe sufficient data granularity to help the scientific community.
  • 4. Anonymization of CSRs is a three-fold problem. Human Errors: time-consuming and cumbersome process with many complex steps; error-prone and requires multiple manual reviews. Unstructured Content: no plug-and-play, off-the-shelf Named Entity Recognition models; requires pharma domain-specific entities. Regulatory Constraints: complex and rapidly evolving regulations; high quality thresholds with stringent re-identification thresholds.
  • 5. 1. Regulatory Constraints: Increasing regulations spike compliance costs and the likely penalties for breaches. Typical annual spend of $2-3 Mn on external vendor costs. Typical cost of a healthcare breach was $9.2 Mn per incident. 84% increase in healthcare data breaches, impacting 45 million people. Tech advances and public data spike re-identification risks. (IBM Report: Cost of a Data Breach)
  • 6. 2. Human Errors: Clinical teams go through a long and cumbersome process for CSR anonymization. Time-consuming processes with cycle times of up to 45 days for each summary document. 25+ complex steps in achieving anonymization using different clinical trial management systems. Higher potential for error with data flowing across multiple internal systems, databases and emails for reviews and approvals.
  • 7. 3. Unstructured Content: Advanced analytics and Natural Language Processing are needed to extract PII entities with high accuracy. Anonymization: data anonymization is the process of transforming information by removing or encrypting sensitive data (PII or PHI) in order to protect data subjects’ privacy and confidentiality. NLP: the process of transforming and understanding human language to identify meaningful patterns and new insights. Anonymization techniques: character masking, pseudonymisation, generalization, swapping, data perturbation, etc. NLP techniques: information retrieval, natural language processing, information extraction (NER, named entity recognition).
  • 8. The Solution: A measured approach which balances human validation and judgement with analytics and automation. Three pillars: Human-in-the-Loop, Advanced Analytics, Regulatory Compliance. User-centered solution design; collaborative workflows with user feedback; leveraged open-source tech; custom algorithm training for better domain understanding; domain experts helped tailor algorithms for unstructured data; strong and scalable solution capabilities based on past experience; the regulatory and research community helped clarify the required quality thresholds; iterative optimization until the desired EMA and Health Canada controls were met.
  • 9. The Anonymization Solution handles structured and unstructured data with iterative risk scoring to ensure compliance. Inputs: CSR documents; reference population (data on similar trials). Unstructured data transformation: parsing CSR docs; entity recognition; sampling for users to validate; recall calculation. User input. Structured data transformation. Iterative risk scoring and optimization algorithm. Output: final risk-adjusted CSR document.
  • 10. What did we learn from implementing such solutions for clients? Be prepared to tackle a variety of input data sources in terms of document structure, style, and entities. Typical CSRs contain 100+ tables and figures, which need to be treated as independent problems. There is a paucity of research on the risk of re-identification and patient privacy in the pharma clinical space.
  • 11. Where are we headed? Solutions must be geared for more attacks, tightening regulations, and regional variations. Data breaches have become easier. Regulations worldwide are evolving and norms are being tightened. Region-specific variations of regulations are emerging.
  • 12. Please share your session feedback! @kesaritweets /gkesari Ganes Kesari www.gramener.com Thank You!
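
Three of the anonymization techniques named on slide 7 (character masking, pseudonymisation, generalization) can be illustrated with a minimal Python sketch. This is not the presenter's implementation: the function names, the `SUBJ-` prefix, the sample subject ID, and the 10-year age bands are all assumptions chosen for illustration.

```python
import hashlib
import hmac


def mask_characters(value, keep_last=0, mask_char="*"):
    """Character masking: hide every character except the last `keep_last`."""
    if keep_last <= 0:
        return mask_char * len(value)
    hidden = max(len(value) - keep_last, 0)
    return mask_char * hidden + value[hidden:]


def pseudonymise(value, secret_key, prefix="SUBJ-"):
    """Pseudonymisation: replace an identifier with a keyed, repeatable token.

    The same input and key always give the same token, so records for one
    subject stay linked without exposing the original identifier.
    The prefix and token length are hypothetical choices.
    """
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return prefix + digest[:10]


def generalise_age(age, band_width=10):
    """Generalization: report an age band instead of the exact age."""
    lower = (age // band_width) * band_width
    return f"{lower}-{lower + band_width - 1}"


# Hypothetical values, for illustration only.
key = b"example-key-not-for-production"
masked_id = mask_characters("1002-0457", keep_last=4)
token = pseudonymise("1002-0457", key)
age_band = generalise_age(47)
```

Here `masked_id` is `*****0457` and `age_band` is `40-49`. Which technique fits a given field depends on how much analytical detail must survive; that choice is the privacy and transparency trade-off described on slide 3.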
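
The "sampling for users to validate" and "recall calculation" steps on slide 9 could look roughly like the sketch below. It assumes entities are represented as `(doc_id, start, end)` spans and that reviewers mark every true entity in the sampled pages; the slides do not specify the representation or the sampling scheme, so both are assumptions.

```python
import random


def sample_for_review(doc_ids, sample_size, seed=0):
    """Draw a reproducible random sample of documents for human validation."""
    rng = random.Random(seed)
    doc_ids = list(doc_ids)
    return rng.sample(doc_ids, min(sample_size, len(doc_ids)))


def recall(detected, validated):
    """Recall = entities found by the model / entities confirmed by reviewers.

    `detected` and `validated` are sets of (doc_id, start, end) spans
    (a hypothetical representation). Spans must match exactly to count.
    """
    if not validated:
        raise ValueError("no validated entities to compare against")
    true_positives = len(detected & validated)
    return true_positives / len(validated)


# Hypothetical spans: reviewers confirmed four entities, the model found
# three of them plus one span the reviewers did not confirm.
validated_spans = {("csr-1", 10, 18), ("csr-1", 40, 52),
                   ("csr-2", 5, 9), ("csr-2", 77, 90)}
detected_spans = {("csr-1", 10, 18), ("csr-1", 40, 52),
                  ("csr-2", 5, 9), ("csr-2", 120, 131)}

review_batch = sample_for_review(["csr-1", "csr-2", "csr-3"], sample_size=2)
estimated_recall = recall(detected_spans, validated_spans)
```

With these example spans `estimated_recall` is 0.75. In an iterative workflow, a recall below the required threshold would send the documents back for another round of entity recognition and review.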