SlideShare a Scribd company logo
1 of 8
Download to read offline
Cognizant 20-20 Insights
September 2020
Accelerating Medical Labeling for
Diagnostic AI: Add Non-Clinicians
to Labeling Teams
Healthcare organizations can create a large repository of high-quality labeled
medical images at speed and at scale for training AI algorithms by following
best practices and by leveraging a data-labeling network of clinicians and
non-clinicians.
Executive Summary
Artificial intelligence (AI)-driven diagnostic models based
on medical images from X-rays, MRIs, and CT scans
promise to improve outcomes and reduce costs with
earlier detection and diagnosis.Their promise is such
that AI-driven imaging and diagnostic start-ups were the
largest group of all AI and machine learning (ML) start-ups
in healthcare in 2019.1
These AI/ML models learn how to
diagnose by reviewing and recognizing recurring patterns
associated with a vast amount of diverse and accurately
labeled medical images.
Achieving this large set of labeled medical images is
challenging to organizations for a variety of reasons. To
comply with privacy regulations, either patients must
Cognizant 20-20 Insights
2 / Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams
consent to their data being used for research
purposes or organizations must de-identify the
data. Data set creators must also obtain data from
a wide range of patient populations so the data set
is diverse and accurately reflects representative
disease states and outcomes. Finally, many
healthcare organizations assume that all medical
images for data sets must be reviewed and labeled
by specially trained physicians. This expensive and
inefficient route to labeling data sets contributes
to the current shortage of AI medical training data
available to organizations today.
Many images may be processed by well-trained, non-
clinician labelers accompanied by physician oversight.
By using a combination of both clinically trained
and non-clinician labelers, healthcare organizations
can more quickly and more inexpensively create
comprehensive, accurately labeled medical image
data sets and hasten the use of AI for detection and
diagnosis support.The keys to successfully using
a diverse mix of labelers include understanding
the problem that the AI model is meant to solve,
evaluating the images against pre-defined
requirements, and following a set of best practices.
Diverse labelers for diverse data
AI/ML diagnostic models must learn from highly
accurate training data sets containing a diverse set of
diseases and outcomes. Faults in learning data sets
may lead to faulty AI diagnostic models. Healthcare
organizations often have tapped medical experts to
review and label data to guard against inaccurate
labels and ensure that a data set is meaningful.
However, not all images need that level of expertise
to be properly labeled. To determine what expertise
is required, healthcare organizations must articulate
the complexity of the question that the AI/ML model
is meant to solve using these three criteria:
Cognizant 20-20 Insights
3 / Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams
	❙ Annotation requirements. Factors to consider
include the grade of disease state, how many
components need to be identified, and the
level of detail required in the annotation, which
influence the annotation method (e.g., semantic
segmentation, bounding boxes).
	❙ Imaging modality. These include X-rays, MRI
images and CT scans. Some of these are easier to
“read”than others.Within a modality, some images
also may be of higher resolution than others.
	❙ Presentation of symptoms.Some disease states
presentwith distinctfeatures,whereas others
present more ambiguously.The answers that the
healthcare organization requires for its AI/ML
model determine how much clinical expertise
that the labeling team needs to process the
corresponding data set. Disease complexity is
proportional to the clinical expertise needed
for accurate data labeling (see Figure 1). A
diagnostic model designed to merely detect
potential tumors in a breast scan will need
different data than a model that is meant to
precisely stage the cancer. In the former case,
non-clinicians trained to spot abnormalities may
label images as normal or suspect. Clinicians
can then review and stage the subset of suspect
images.
Disease complexity: Proportional to clinical expertise for accurate
data labeling
Example of a high-complexity case:
Invasive Ductal Carcinoma
Example of a low-complexity case:
Pulmonary Nodules
Current process that a pathologist undertakes to
diagnose Invasive Ductal Carcinoma (IDC):
Current process to diagnose a malignant pulmonary nodule:
Identify tumor as part of a
routine mammogram
Perform biopsy of
the tumor
Analyze the
biopsy
Perform imaging
exam (MRI, CT, or
PET) for tumor
details
1
2
3
4
Gland (acinus) formation
❙	 Evaluate the percentage
of the tumor that is
made up of acini
Size and shape of cells
❙	 Compare diseased cell
nuclei to benign breast
epithelium
❙	 Evaluate the shape of the
diseased cells compared
to the benign breast tissue
Mitosis rate
❙	 Count the number of
cells undergoing mitosis
Perform imaging exam
(X-ray and CT)
Discuss medical history
(smoking history & family
history of cancer)
Appearance of nodule
❙	 Inhomogeneous versus
homogeneous density
❙	 Thick versus thin walls
Location of nodule
❙	 Upper lung versus other
locations in the lung
Shape of nodule
❙	 Margin shape – Irregular
or spiculated margins
Perform follow-up imaging
exam (X-ray and CT)
Perform biopsy to
determine if malignant
1
2
3
4
Given the expertise required to recognize, annotate,
and ultimately diagnose IDC, only a specialist would be
able to provide high-quality annotations.
Given the relatively straightforward characteristics that are
evaluated, the level of domain expertise needed to annotate
the nodules is relatively low.
Figure 1
Cognizant 20-20 Insights
4 / Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams
Figure 2
Differing considerations for low-, medium-, and high-complexity cases
By starting with a small foundational team,
organizations can perfect their labeler selection and
quality assurance (QA) processes before scaling to
larger projects. The foundational team should consist
of physicians, other clinicians and non-physicians, so
that it can adapt and flex to label any initial disease
state. For a labeling team made up of primarily non-
clinical labelers, QA will be built into the processes
along with enhanced training and frequent audits
of labelers’ work. For an AI/ML model that requires
labeling by highly trained physicians, QA likely will
involve fewer audits but require more time spent
establishing the “ground truth” data (see Figure 2).
High-complexity cases Medium-complexity cases Low-complexity cases
Description
Considerations
Labeler mix
Complex cases require
advanced clinical
expertise and often have
substantial variation
between cases.
1.	 Complex data labeling
annotation types
(e.g., grading level of
diagnoses, semantic
segmentation)
2.	 Advanced imaging
modalities (e.g., MRIs,
mammography)
3.	 Ambiguous symptoms
of the disease that are
not consistent across
cases
4.	 Minimal or no clinical
training needed; likely
fewer audits from QA
perspective
5.	 Some training on
annotation best
practices
potentially needed
1.	 Straightforward data
labeling annotation
types (e.g., bounding
boxes, locating centroids,
polygon outlines)
2.	 Easier imaging modality
(e.g., black and white,
defined shapes) 
3.	 Distinct patterns of
symptoms
4.	 Some training on
annotation best practices
potentially needed
5.	 Likely fewer audits
for clinical accuracy
1.	 Straightforward data
labeling annotation
types (e.g., bounding
boxes, locating centroids,
polygon outlines)
2.	 Easier imaging modality
(black and white, defined
shapes, etc.)
3.	 Distinct patterns of
symptoms
4.	 Minimal or no training
needed on annotation
best practices
5.	 Substantial clinical
training and subsequent
quality assurance and
audits necessary
Physicians
With physician oversight,
other clinicians and non-
clinicians can be trained to
identify the disease.
Mix of physicians, other
clinicians, and non-clinicians
Non-clinicians can be
trained to locate easily
identifiable diseases
with training and
physician oversight.
Non-clinicians with some
physician oversight
Cognizant 20-20 Insights
5 / Accelerating Medical Labeling for Diagnostic AI: Add Non-clinicians to Labeling Teams
Figure 3
Ensuring high-quality results from non-clinical labelers
Basing the labeling team’s required clinical
expertise QA oversight level on the needs of the AI/
ML model provides organizations with a scalable,
and time- and cost-effective approach. It allows
organizations to evaluate and label multiple disease
states concurrently by reducing reliance on already-
busy physicians. This accelerates the creation of an
ImageNet repository of medical data.
Best practices
Healthcare organizations must support their diverse
labeling teams with best practices in data pre-
processing, training and quality control (see Figure
3). This helps ensure that the resulting labeled data
sets meet expectations and are appropriate for AI/
ML diagnostic model training.
Data preprocessing. Data preprocessing sets
up the project for success by ensuring that the
data is properly prepared for the data labelers
and by establishing quality parameters and the
requirements for the training set. The size and depth
of the training set needs to be selected based on
statistical significance and the complexity of the
proposed problem. An experienced statistician or
data scientist is critical to this step. They should
advise on the validity of the labeled data and
help maintain the project metrics as it scales. In
preprocessing, it is also important to ensure that
the images are in similar formats with adequate
resolution. This preparatory step helps increase
efficiency and improves the quality and accuracy
of labels by allowing labelers to make precise
observations and more consistent annotations.
Ground truth. The ground truth data set consists
of a subset of data points identified and labeled by
an expert. This data set becomes the gold standard
benchmark to which all other labelers’ work is
compared to monitor the quality of labels and help
ensure the project’s ultimate success.
Once defined, organizations may use the ground
truth data set to identify high-quality labelers
and to continuously monitor their performance
throughout the project’s lifecycle. Error-seeding, in
which random errors are distributed in the image
stream, can be based off the ground truth data set to
measure how well labelers compare in quality to the
domain expert.
Training. Regardless of whether the team is
composed entirely of physicians or includes
non-clinicians, extensive training at the project
onset prepares all labelers, sets expectations and
safeguards the integrity of the labels. In the initial
training, the medical images and labels are broken
down to a granular, easy-to-spot level and the
labeling requirements are described in a series of
simple, clear steps.
Ongoing training helps ensure that labelers
consistently understand and apply requirements.
Industry best practices for creating training materials
include leveraging deep domain expertise to
define clear labeling criteria. Additionally, cultural
competency is a key training component because
Develop
distinct label
parameters
Establish
ground truth
Train annotators
to level set
expectations
Integrate error-seeding
exercise to ensure
consistency
Monitor quality of
labels using recall
and precision rates
Cognizant 20-20 Insights
6 / Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams
diagnoses or levels of severity may differ among
countries. For example, a mild case of eczema may
be defined differently in the U.S. compared to how it
is commonly defined in Thailand. By clearly defining
disease and diagnosis requirements during
training, organizations can avoid many cultural
biases in the labels.
Quality. In addition to leveraging ground truth
data in error-seeding exercises to monitor quality
performance, other metrics can be employed to
monitor quality.“Recall” refers to the measure
of sensitivity and completeness, measuring the
percentage of total relevant results that are classified
by the machine learning algorithm.“Precision”
refers to the measure of exactness, measuring
the percentage of results that are relevant.
Recall and precision rates cannot be maximized
simultaneously; however, the F-1 score provides
a mean of precision and recall (see Figure 4).
This metric is often used for instances when both
precision and recall are important to the problem.
To measure whether annotators are labeling with a
consistent understanding of the defined labels, inter-
rater reliability can also be measured. Inter-rater
reliability refers to the test-retest method to measure
that annotators are labeling with a consistent
understanding of the defined labels. These
quantifiable metrics for measuring the accuracy of
the data labels provide a distinct way to measure the
success of the labels and the model.
AI helping AI
It is essential that organizations leverage best
practices and emerging AI tools to effectively
accelerate the creation of a labeled medical image
repository at speed and scale. Best practices around
data preprocessing, ground truth, training, and
quality combined with emerging AI trends and
technologies reduce the burdens associated with
labeling a high volume of complex images. For
example, federated learning is one opportunity that
combats the need to host highly sensitive medical
data in a single location to train an ML model. In
federated learning, each healthcare system may
train their respective model based on local training
data. Then, the models across healthcare systems
are merged to create a master AI/ML model, thereby
reducing the compliance risk of sharing data across
health systems. Leveraging federated learning and
other techniques decreases the time spent on the
end-to-end data labeling and training process,
thereby providing more efficient and scalable
diagnostic model development.
Approaches to labeling will vary across teams. The
constant should be ensuring that the complexity
of the diagnostic question defines the labeler
requirements and best practices around labeler
selection and model development and training.
Careful consideration of these requirements and
following best practices are pivotal to accelerating
the creation of an accurate and valid repository of
labeled medical images.
Figure 4
Calculating the F-1 score to capture a mean of precision and recall
Recall =
True Positive
True Positive + False Negative Precision =
True Positive
True Positive + False Positive
F-1 score = 2
Precision * Recall
Precision + Recall
Endnote
1	 From Drug R&D To Diagnostics: 90+ Artificial Intelligence Startups In Healthcare, CB Insights research brief, September 12,
2019, https://www.cbinsights.com/research/artificial-intelligence-startups-healthcare/.
Cognizant 20-20 Insights
7 / Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams
About the authors
Sashi Padarthy
Assistant Vice President, Cognizant Consulting’s Healthcare Practice
Sashi Padarthy leads Cognizant’s Digital Strategy and Transformation service line. For more than 20 years,
Sashi has helped healthcare organizations in the areas of digital strategy, digital transformation, innovation,
technology-enabled strategy, new product development, value-based care and operational improvement. He
is a Sloan Fellow from the London Business School. Sashi can be reached at Sashi.Padarthy@cognizant.com |
www.linkedin.com/in/sashipadarthy/.
Kristin Knudson
Consulting Manager, Cognizant Consulting’s Healthcare Practice
Kristin Knudson is a Manager within Cognizant Consulting’s Healthcare Practice. She has experience in
implementing digital transformation solutions and integrating machine intelligence into care delivery.
Kristin holds an MBA in business strategy and a master’s degree in biomedical engineering, allowing her to
provide a unique perspective on the advantages of leveraging technology in the healthcare industry. Kristin’s
current interest is exploring applications of machine intelligence to advance digital insights and improve the
consumer’s overall experience. Kristin can be reached at Kristin.Knudson@cognizant.com | www.linkedin.
com/in/kristin-knudson-4266b22a/.
Jacqueline Parzivand (Zelener)
Senior Consultant, Cognizant Consulting’s Healthcare Practice
Jacqueline Parzivand (Zelener) is a Senior Consultant in Cognizant Consulting’s Healthcare Practice. She has
valuable experience in payer operational processes, process optimization, business readiness, and medical
management.Jacqueline holds a master in health administration and has worked in operations and clinical
research, which provides her with a dynamic outlook on the intersection of business outcomes, technology,
and healthcare.Jacqueline can be reached at Jacqueline.Parzivand@cognizant.com | www.linkedin.com/in/
jacquelinezelener/.
Mary Helen Turnage
Senior Consultant, Cognizant Consulting’s Healthcare Practice
Mary Helen Turnage is a Senior Consultant within Cognizant Consulting’s Healthcare Practice. She has
experience in payer operational readiness as well as process optimization. Mary Helen has an MBA in strategy
and finance and has worked in clinical recruitment and strategic execution. She is especially interested in
how technology is changing the workforce landscape in healthcare. Mary Helen can be reached at Mary.
Turnage@cognizant.com | www.linkedin.com/in/maryhelenturnage/.
World Headquarters
500 Frank W. Burr Blvd.
Teaneck, NJ 07666 USA
Phone: +1 201 801 0233
Fax: +1 201 801 0243
Toll Free: +1 888 937 3277
European Headquarters
1 Kingdom Street
Paddington Central
London W2 6BD England
Phone: +44 (0) 20 7297 7600
Fax: +44 (0) 20 7121 0102
India Operations Headquarters
#5/535 Old Mahabalipuram Road
Okkiyam Pettai, Thoraipakkam
Chennai, 600 096 India
Phone: +91 (0) 44 4209 6000
Fax: +91 (0) 44 4209 6060
APAC Headquarters
1 Changi Business Park Crescent,
Plaza 8@CBP # 07-04/05/06,
Tower A, Singapore 486025
Phone: + 65 6812 4051
Fax: + 65 6324 4051
About Cognizant Healthcare
Cognizant’s Healthcare business unit works with healthcare organizations to provide collaborative, innovative solutions that address the industry’s most
pressing IT and business challenges — from rethinking new business models, to optimizing operations and enabling technology innovation. As a global
leader in healthcare, our industry-specific services and solutions support leading payers, providers and pharmacy benefit managers worldwide. For more
information, visit www.cognizant.com/healthcare.
About Cognizant Digital Business
We help clients build digital businesses and innovate products that create new value — by using sensing, insights, software and experience to deliver on what
customers demand in the digital age. Through IoT we connect the digital and physical worlds to make smart, efficient and safe products, operations and
enterprises. Leveraging data, analytics and AI we drive intelligent decisions and anticipate where markets and customers are going next. Then we use those
insights, combining design and software to deliver the experiences that consumers expect of their brands. Learn more about how we’re engineering the
modern enterprise at www.cognizant.com/digitalbusiness.
About Cognizant
Cognizant (Nasdaq-100: CTSH) is one of the world’s leading professional services companies, transforming clients’ business, operating and technology
models for the digital era. Our unique industry-based, consultative approach helps clients envision, build and run more innovative and efficient businesses.
Headquartered in the U.S., Cognizant is ranked 194 on the Fortune 500 and is consistently listed among the most admired companies in the world. Learn
how Cognizant helps clients lead with digital at www.cognizant.com or follow us @Cognizant.
© Copyright 2020, Cognizant. All rights reserved. No part of this document may be reproduced, stored in a retrieval system, transmitted in any form or by any means, electronic, mechanical,
photocopying, recording, or otherwise, without the express written permission from Cognizant. The information contained herein is subject to change without notice. All other trademarks mentioned
herein are the property of their respective owners.
Codex 5914

More Related Content

What's hot

EY Medtech Pulse of the Industry 2018
EY Medtech Pulse of the Industry 2018EY Medtech Pulse of the Industry 2018
EY Medtech Pulse of the Industry 2018Revital (Tali) Hirsch
 
Regulating AI & ML in Medicine
Regulating AI & ML in MedicineRegulating AI & ML in Medicine
Regulating AI & ML in Medicineshyamal_patel
 
Quality management system model
Quality management system modelQuality management system model
Quality management system modelselinasimpson2601
 
The Future of Healthcare in Consumerism World
The Future of Healthcare in Consumerism WorldThe Future of Healthcare in Consumerism World
The Future of Healthcare in Consumerism WorldCitiusTech
 
mHealth Israel_E-Health to Digital Health_Jeremy Brody_Kantar Health
mHealth Israel_E-Health to Digital Health_Jeremy Brody_Kantar HealthmHealth Israel_E-Health to Digital Health_Jeremy Brody_Kantar Health
mHealth Israel_E-Health to Digital Health_Jeremy Brody_Kantar HealthLevi Shapiro
 
Medical Devices: Equipped for the Future?
Medical Devices: Equipped for the Future?Medical Devices: Equipped for the Future?
Medical Devices: Equipped for the Future?Revital (Tali) Hirsch
 
High Flyer Health IT Investments and Health IT Investment Trends
High Flyer Health IT Investments and Health IT Investment TrendsHigh Flyer Health IT Investments and Health IT Investment Trends
High Flyer Health IT Investments and Health IT Investment TrendsPlatform Houston
 
Clinical information system
Clinical information systemClinical information system
Clinical information systemNUR3563Team1
 
LABELING CUSTOMERS USING DISCOVERED KNOWLEDGE CASE STUDY: AUTOMOBILE INSURAN...
LABELING CUSTOMERS USING DISCOVERED KNOWLEDGE  CASE STUDY: AUTOMOBILE INSURAN...LABELING CUSTOMERS USING DISCOVERED KNOWLEDGE  CASE STUDY: AUTOMOBILE INSURAN...
LABELING CUSTOMERS USING DISCOVERED KNOWLEDGE CASE STUDY: AUTOMOBILE INSURAN...ijmvsc
 
Improving pharmaceutical marketing using big data solutions
Improving pharmaceutical marketing using big data solutionsImproving pharmaceutical marketing using big data solutions
Improving pharmaceutical marketing using big data solutionsPaul Grant
 
IRJET - E-Health Chain and Anticipation of Future Disease
IRJET - E-Health Chain and Anticipation of Future DiseaseIRJET - E-Health Chain and Anticipation of Future Disease
IRJET - E-Health Chain and Anticipation of Future DiseaseIRJET Journal
 
Data Mining - Health Insurance - Jabran Noor
Data Mining - Health Insurance - Jabran NoorData Mining - Health Insurance - Jabran Noor
Data Mining - Health Insurance - Jabran NoorJabran Noor
 
Ahima data quality management model
Ahima data quality management modelAhima data quality management model
Ahima data quality management modelselinasimpson2301
 
Next Generation Analytics: The Backbone of the High Performing Health System
Next Generation Analytics: The Backbone of the High Performing Health SystemNext Generation Analytics: The Backbone of the High Performing Health System
Next Generation Analytics: The Backbone of the High Performing Health SystemInvestnet
 
National Association of Healthcare Access Management Presentation
National Association of Healthcare Access Management PresentationNational Association of Healthcare Access Management Presentation
National Association of Healthcare Access Management Presentationmikemike09
 
PATIENT MANAGEMENT SYSTEM project
PATIENT MANAGEMENT SYSTEM projectPATIENT MANAGEMENT SYSTEM project
PATIENT MANAGEMENT SYSTEM projectLaud Randy Amofah
 
Ehr Testing Challenge
Ehr Testing ChallengeEhr Testing Challenge
Ehr Testing ChallengeQASymphony
 

What's hot (20)

EY Medtech Pulse of the Industry 2018
EY Medtech Pulse of the Industry 2018EY Medtech Pulse of the Industry 2018
EY Medtech Pulse of the Industry 2018
 
Regulating AI & ML in Medicine
Regulating AI & ML in MedicineRegulating AI & ML in Medicine
Regulating AI & ML in Medicine
 
Quality management system model
Quality management system modelQuality management system model
Quality management system model
 
Quality management model
Quality management modelQuality management model
Quality management model
 
The Future of Healthcare in Consumerism World
The Future of Healthcare in Consumerism WorldThe Future of Healthcare in Consumerism World
The Future of Healthcare in Consumerism World
 
mHealth Israel_E-Health to Digital Health_Jeremy Brody_Kantar Health
mHealth Israel_E-Health to Digital Health_Jeremy Brody_Kantar HealthmHealth Israel_E-Health to Digital Health_Jeremy Brody_Kantar Health
mHealth Israel_E-Health to Digital Health_Jeremy Brody_Kantar Health
 
Medical Devices: Equipped for the Future?
Medical Devices: Equipped for the Future?Medical Devices: Equipped for the Future?
Medical Devices: Equipped for the Future?
 
High Flyer Health IT Investments and Health IT Investment Trends
High Flyer Health IT Investments and Health IT Investment TrendsHigh Flyer Health IT Investments and Health IT Investment Trends
High Flyer Health IT Investments and Health IT Investment Trends
 
The Road to Value-Based Care
The Road to Value-Based CareThe Road to Value-Based Care
The Road to Value-Based Care
 
Clinical information system
Clinical information systemClinical information system
Clinical information system
 
LABELING CUSTOMERS USING DISCOVERED KNOWLEDGE CASE STUDY: AUTOMOBILE INSURAN...
LABELING CUSTOMERS USING DISCOVERED KNOWLEDGE  CASE STUDY: AUTOMOBILE INSURAN...LABELING CUSTOMERS USING DISCOVERED KNOWLEDGE  CASE STUDY: AUTOMOBILE INSURAN...
LABELING CUSTOMERS USING DISCOVERED KNOWLEDGE CASE STUDY: AUTOMOBILE INSURAN...
 
Improving pharmaceutical marketing using big data solutions
Improving pharmaceutical marketing using big data solutionsImproving pharmaceutical marketing using big data solutions
Improving pharmaceutical marketing using big data solutions
 
IRJET - E-Health Chain and Anticipation of Future Disease
IRJET - E-Health Chain and Anticipation of Future DiseaseIRJET - E-Health Chain and Anticipation of Future Disease
IRJET - E-Health Chain and Anticipation of Future Disease
 
Data Mining - Health Insurance - Jabran Noor
Data Mining - Health Insurance - Jabran NoorData Mining - Health Insurance - Jabran Noor
Data Mining - Health Insurance - Jabran Noor
 
Ahima data quality management model
Ahima data quality management modelAhima data quality management model
Ahima data quality management model
 
Next Generation Analytics: The Backbone of the High Performing Health System
Next Generation Analytics: The Backbone of the High Performing Health SystemNext Generation Analytics: The Backbone of the High Performing Health System
Next Generation Analytics: The Backbone of the High Performing Health System
 
DTx Development
DTx DevelopmentDTx Development
DTx Development
 
National Association of Healthcare Access Management Presentation
National Association of Healthcare Access Management PresentationNational Association of Healthcare Access Management Presentation
National Association of Healthcare Access Management Presentation
 
PATIENT MANAGEMENT SYSTEM project
PATIENT MANAGEMENT SYSTEM projectPATIENT MANAGEMENT SYSTEM project
PATIENT MANAGEMENT SYSTEM project
 
Ehr Testing Challenge
Ehr Testing ChallengeEhr Testing Challenge
Ehr Testing Challenge
 

Similar to Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams

Automated clinical documentation improvement
Automated clinical documentation improvementAutomated clinical documentation improvement
Automated clinical documentation improvementNehal (Neil) Shah
 
Employing AI to Diagnose Cancer
Employing AI to Diagnose CancerEmploying AI to Diagnose Cancer
Employing AI to Diagnose CancerEMMAIntl
 
AI Regulation Is Coming to Life Sciences: Three Steps to Take Now
AI Regulation Is Coming to Life Sciences: Three Steps to Take NowAI Regulation Is Coming to Life Sciences: Three Steps to Take Now
AI Regulation Is Coming to Life Sciences: Three Steps to Take NowCognizant
 
New IBM IBV Study - Cognitive in Pharmacovigilence
New IBM IBV Study - Cognitive in PharmacovigilenceNew IBM IBV Study - Cognitive in Pharmacovigilence
New IBM IBV Study - Cognitive in PharmacovigilenceSarah Freemantle
 
Predictive Analytics: It's The Intervention That Matters
Predictive Analytics: It's The Intervention That MattersPredictive Analytics: It's The Intervention That Matters
Predictive Analytics: It's The Intervention That MattersHealth Catalyst
 
The AI Advantage- Complete, Accurate, and Compliant Medical Coding.pdf
The AI Advantage- Complete, Accurate, and Compliant Medical Coding.pdfThe AI Advantage- Complete, Accurate, and Compliant Medical Coding.pdf
The AI Advantage- Complete, Accurate, and Compliant Medical Coding.pdfAGSHealth2
 
The AI Advantage: Complete, Accurate, and Compliant Medical Coding | AGS Heal...
The AI Advantage: Complete, Accurate, and Compliant Medical Coding | AGS Heal...The AI Advantage: Complete, Accurate, and Compliant Medical Coding | AGS Heal...
The AI Advantage: Complete, Accurate, and Compliant Medical Coding | AGS Heal...AGSHealth1
 
The AI Advantage - Complete, Accurate, and Compliant Medical Coding.pdf
The AI Advantage - Complete, Accurate, and Compliant Medical Coding.pdfThe AI Advantage - Complete, Accurate, and Compliant Medical Coding.pdf
The AI Advantage - Complete, Accurate, and Compliant Medical Coding.pdfAGSHealth2
 
Data science in healthcare-Assignment 2.pptx
Data science in healthcare-Assignment 2.pptxData science in healthcare-Assignment 2.pptx
Data science in healthcare-Assignment 2.pptxArpitaDebnath20
 
List out the challenges of ml ai for delivering clinical impact - Pubrica
List out the challenges of ml ai for delivering clinical impact -  PubricaList out the challenges of ml ai for delivering clinical impact -  Pubrica
List out the challenges of ml ai for delivering clinical impact - PubricaPubrica
 
mHealth Israel_Reimbursement Bootcamp_Jan 2020
mHealth Israel_Reimbursement Bootcamp_Jan 2020mHealth Israel_Reimbursement Bootcamp_Jan 2020
mHealth Israel_Reimbursement Bootcamp_Jan 2020Levi Shapiro
 
Practical and Succinct Solutions to Coding - Select Data, Inc.
Practical and Succinct Solutions to Coding - Select Data, Inc. Practical and Succinct Solutions to Coding - Select Data, Inc.
Practical and Succinct Solutions to Coding - Select Data, Inc. RachelBuckleySelect
 
Practicalandsuccinctsolutionstocoding 140519132624-phpapp02
Practicalandsuccinctsolutionstocoding 140519132624-phpapp02Practicalandsuccinctsolutionstocoding 140519132624-phpapp02
Practicalandsuccinctsolutionstocoding 140519132624-phpapp02SelectData
 
How to establish and evaluate clinical prediction models - Statswork
How to establish and evaluate clinical prediction models - StatsworkHow to establish and evaluate clinical prediction models - Statswork
How to establish and evaluate clinical prediction models - StatsworkStats Statswork
 
Gleecus Whitepaper : Applications of Artificial Intelligence in Healthcare
Gleecus Whitepaper : Applications of Artificial Intelligence in HealthcareGleecus Whitepaper : Applications of Artificial Intelligence in Healthcare
Gleecus Whitepaper : Applications of Artificial Intelligence in HealthcareSuprit Patra
 
Producing Better and Affordable Healthcare Services Using Computational Intel...
Producing Better and Affordable Healthcare Services Using Computational Intel...Producing Better and Affordable Healthcare Services Using Computational Intel...
Producing Better and Affordable Healthcare Services Using Computational Intel...EMMAIntl
 

Similar to Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams (20)

Data mining applications
Data mining applicationsData mining applications
Data mining applications
 
Automated clinical documentation improvement
Automated clinical documentation improvementAutomated clinical documentation improvement
Automated clinical documentation improvement
 
Employing AI to Diagnose Cancer
Employing AI to Diagnose CancerEmploying AI to Diagnose Cancer
Employing AI to Diagnose Cancer
 
AI Regulation Is Coming to Life Sciences: Three Steps to Take Now
AI Regulation Is Coming to Life Sciences: Three Steps to Take NowAI Regulation Is Coming to Life Sciences: Three Steps to Take Now
AI Regulation Is Coming to Life Sciences: Three Steps to Take Now
 
New IBM IBV Study - Cognitive in Pharmacovigilence
New IBM IBV Study - Cognitive in PharmacovigilenceNew IBM IBV Study - Cognitive in Pharmacovigilence
New IBM IBV Study - Cognitive in Pharmacovigilence
 
Medicaid Analytics eBook
Medicaid Analytics eBookMedicaid Analytics eBook
Medicaid Analytics eBook
 
Predictive Analytics: It's The Intervention That Matters
Predictive Analytics: It's The Intervention That MattersPredictive Analytics: It's The Intervention That Matters
Predictive Analytics: It's The Intervention That Matters
 
The AI Advantage- Complete, Accurate, and Compliant Medical Coding.pdf
The AI Advantage- Complete, Accurate, and Compliant Medical Coding.pdfThe AI Advantage- Complete, Accurate, and Compliant Medical Coding.pdf
The AI Advantage- Complete, Accurate, and Compliant Medical Coding.pdf
 
The AI Advantage: Complete, Accurate, and Compliant Medical Coding | AGS Heal...
The AI Advantage: Complete, Accurate, and Compliant Medical Coding | AGS Heal...The AI Advantage: Complete, Accurate, and Compliant Medical Coding | AGS Heal...
The AI Advantage: Complete, Accurate, and Compliant Medical Coding | AGS Heal...
 
The AI Advantage - Complete, Accurate, and Compliant Medical Coding.pdf
The AI Advantage - Complete, Accurate, and Compliant Medical Coding.pdfThe AI Advantage - Complete, Accurate, and Compliant Medical Coding.pdf
The AI Advantage - Complete, Accurate, and Compliant Medical Coding.pdf
 
Data science in healthcare-Assignment 2.pptx
Data science in healthcare-Assignment 2.pptxData science in healthcare-Assignment 2.pptx
Data science in healthcare-Assignment 2.pptx
 
page 3
page 3page 3
page 3
 
List out the challenges of ml ai for delivering clinical impact - Pubrica
List out the challenges of ml ai for delivering clinical impact -  PubricaList out the challenges of ml ai for delivering clinical impact -  Pubrica
List out the challenges of ml ai for delivering clinical impact - Pubrica
 
mHealth Israel_Reimbursement Bootcamp_Jan 2020
mHealth Israel_Reimbursement Bootcamp_Jan 2020mHealth Israel_Reimbursement Bootcamp_Jan 2020
mHealth Israel_Reimbursement Bootcamp_Jan 2020
 
Practical and Succinct Solutions to Coding - Select Data, Inc.
Practical and Succinct Solutions to Coding - Select Data, Inc. Practical and Succinct Solutions to Coding - Select Data, Inc.
Practical and Succinct Solutions to Coding - Select Data, Inc.
 
Practicalandsuccinctsolutionstocoding 140519132624-phpapp02
Practicalandsuccinctsolutionstocoding 140519132624-phpapp02Practicalandsuccinctsolutionstocoding 140519132624-phpapp02
Practicalandsuccinctsolutionstocoding 140519132624-phpapp02
 
How to establish and evaluate clinical prediction models - Statswork
How to establish and evaluate clinical prediction models - StatsworkHow to establish and evaluate clinical prediction models - Statswork
How to establish and evaluate clinical prediction models - Statswork
 
Artificial intelligence and Medicine - Copy.pptx
Artificial intelligence and Medicine - Copy.pptxArtificial intelligence and Medicine - Copy.pptx
Artificial intelligence and Medicine - Copy.pptx
 
Gleecus Whitepaper : Applications of Artificial Intelligence in Healthcare
Gleecus Whitepaper : Applications of Artificial Intelligence in HealthcareGleecus Whitepaper : Applications of Artificial Intelligence in Healthcare
Gleecus Whitepaper : Applications of Artificial Intelligence in Healthcare
 
Producing Better and Affordable Healthcare Services Using Computational Intel...
Producing Better and Affordable Healthcare Services Using Computational Intel...Producing Better and Affordable Healthcare Services Using Computational Intel...
Producing Better and Affordable Healthcare Services Using Computational Intel...
 

More from Cognizant

Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...
Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...
Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...Cognizant
 
Data Modernization: Breaking the AI Vicious Cycle for Superior Decision-making
Data Modernization: Breaking the AI Vicious Cycle for Superior Decision-makingData Modernization: Breaking the AI Vicious Cycle for Superior Decision-making
Data Modernization: Breaking the AI Vicious Cycle for Superior Decision-makingCognizant
 
It Takes an Ecosystem: How Technology Companies Deliver Exceptional Experiences
It Takes an Ecosystem: How Technology Companies Deliver Exceptional ExperiencesIt Takes an Ecosystem: How Technology Companies Deliver Exceptional Experiences
It Takes an Ecosystem: How Technology Companies Deliver Exceptional ExperiencesCognizant
 
Intuition Engineered
Intuition EngineeredIntuition Engineered
Intuition EngineeredCognizant
 
The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...
The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...
The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...Cognizant
 
Enhancing Desirability: Five Considerations for Winning Digital Initiatives
Enhancing Desirability: Five Considerations for Winning Digital InitiativesEnhancing Desirability: Five Considerations for Winning Digital Initiatives
Enhancing Desirability: Five Considerations for Winning Digital InitiativesCognizant
 
The Work Ahead in Manufacturing: Fulfilling the Agility Mandate
The Work Ahead in Manufacturing: Fulfilling the Agility MandateThe Work Ahead in Manufacturing: Fulfilling the Agility Mandate
The Work Ahead in Manufacturing: Fulfilling the Agility MandateCognizant
 
The Work Ahead in Higher Education: Repaving the Road for the Employees of To...
The Work Ahead in Higher Education: Repaving the Road for the Employees of To...The Work Ahead in Higher Education: Repaving the Road for the Employees of To...
The Work Ahead in Higher Education: Repaving the Road for the Employees of To...Cognizant
 
Engineering the Next-Gen Digital Claims Organisation for Australian General I...
Engineering the Next-Gen Digital Claims Organisation for Australian General I...Engineering the Next-Gen Digital Claims Organisation for Australian General I...
Engineering the Next-Gen Digital Claims Organisation for Australian General I...Cognizant
 
Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...
Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...
Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...Cognizant
 
Green Rush: The Economic Imperative for Sustainability
Green Rush: The Economic Imperative for SustainabilityGreen Rush: The Economic Imperative for Sustainability
Green Rush: The Economic Imperative for SustainabilityCognizant
 
Policy Administration Modernization: Four Paths for Insurers
Policy Administration Modernization: Four Paths for InsurersPolicy Administration Modernization: Four Paths for Insurers
Policy Administration Modernization: Four Paths for InsurersCognizant
 
The Work Ahead in Utilities: Powering a Sustainable Future with Digital
The Work Ahead in Utilities: Powering a Sustainable Future with DigitalThe Work Ahead in Utilities: Powering a Sustainable Future with Digital
The Work Ahead in Utilities: Powering a Sustainable Future with DigitalCognizant
 
AI in Media & Entertainment: Starting the Journey to Value
AI in Media & Entertainment: Starting the Journey to ValueAI in Media & Entertainment: Starting the Journey to Value
AI in Media & Entertainment: Starting the Journey to ValueCognizant
 
Operations Workforce Management: A Data-Informed, Digital-First Approach
Operations Workforce Management: A Data-Informed, Digital-First ApproachOperations Workforce Management: A Data-Informed, Digital-First Approach
Operations Workforce Management: A Data-Informed, Digital-First ApproachCognizant
 
Five Priorities for Quality Engineering When Taking Banking to the Cloud
Five Priorities for Quality Engineering When Taking Banking to the CloudFive Priorities for Quality Engineering When Taking Banking to the Cloud
Five Priorities for Quality Engineering When Taking Banking to the CloudCognizant
 
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining Focused
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining FocusedGetting Ahead With AI: How APAC Companies Replicate Success by Remaining Focused
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining FocusedCognizant
 
Crafting the Utility of the Future
Crafting the Utility of the FutureCrafting the Utility of the Future
Crafting the Utility of the FutureCognizant
 
Utilities Can Ramp Up CX with a Customer Data Platform
Utilities Can Ramp Up CX with a Customer Data PlatformUtilities Can Ramp Up CX with a Customer Data Platform
Utilities Can Ramp Up CX with a Customer Data PlatformCognizant
 
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...Cognizant
 

More from Cognizant (20)

Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...
Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...
Using Adaptive Scrum to Tame Process Reverse Engineering in Data Analytics Pr...
 
Data Modernization: Breaking the AI Vicious Cycle for Superior Decision-making
Data Modernization: Breaking the AI Vicious Cycle for Superior Decision-makingData Modernization: Breaking the AI Vicious Cycle for Superior Decision-making
Data Modernization: Breaking the AI Vicious Cycle for Superior Decision-making
 
It Takes an Ecosystem: How Technology Companies Deliver Exceptional Experiences
It Takes an Ecosystem: How Technology Companies Deliver Exceptional ExperiencesIt Takes an Ecosystem: How Technology Companies Deliver Exceptional Experiences
It Takes an Ecosystem: How Technology Companies Deliver Exceptional Experiences
 
Intuition Engineered
Intuition EngineeredIntuition Engineered
Intuition Engineered
 
The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...
The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...
The Work Ahead: Transportation and Logistics Delivering on the Digital-Physic...
 
Enhancing Desirability: Five Considerations for Winning Digital Initiatives
Enhancing Desirability: Five Considerations for Winning Digital InitiativesEnhancing Desirability: Five Considerations for Winning Digital Initiatives
Enhancing Desirability: Five Considerations for Winning Digital Initiatives
 
The Work Ahead in Manufacturing: Fulfilling the Agility Mandate
The Work Ahead in Manufacturing: Fulfilling the Agility MandateThe Work Ahead in Manufacturing: Fulfilling the Agility Mandate
The Work Ahead in Manufacturing: Fulfilling the Agility Mandate
 
The Work Ahead in Higher Education: Repaving the Road for the Employees of To...
The Work Ahead in Higher Education: Repaving the Road for the Employees of To...The Work Ahead in Higher Education: Repaving the Road for the Employees of To...
The Work Ahead in Higher Education: Repaving the Road for the Employees of To...
 
Engineering the Next-Gen Digital Claims Organisation for Australian General I...
Engineering the Next-Gen Digital Claims Organisation for Australian General I...Engineering the Next-Gen Digital Claims Organisation for Australian General I...
Engineering the Next-Gen Digital Claims Organisation for Australian General I...
 
Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...
Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...
Profitability in the Direct-to-Consumer Marketplace: A Playbook for Media and...
 
Green Rush: The Economic Imperative for Sustainability
Green Rush: The Economic Imperative for SustainabilityGreen Rush: The Economic Imperative for Sustainability
Green Rush: The Economic Imperative for Sustainability
 
Policy Administration Modernization: Four Paths for Insurers
Policy Administration Modernization: Four Paths for InsurersPolicy Administration Modernization: Four Paths for Insurers
Policy Administration Modernization: Four Paths for Insurers
 
The Work Ahead in Utilities: Powering a Sustainable Future with Digital
The Work Ahead in Utilities: Powering a Sustainable Future with DigitalThe Work Ahead in Utilities: Powering a Sustainable Future with Digital
The Work Ahead in Utilities: Powering a Sustainable Future with Digital
 
AI in Media & Entertainment: Starting the Journey to Value
AI in Media & Entertainment: Starting the Journey to ValueAI in Media & Entertainment: Starting the Journey to Value
AI in Media & Entertainment: Starting the Journey to Value
 
Operations Workforce Management: A Data-Informed, Digital-First Approach
Operations Workforce Management: A Data-Informed, Digital-First ApproachOperations Workforce Management: A Data-Informed, Digital-First Approach
Operations Workforce Management: A Data-Informed, Digital-First Approach
 
Five Priorities for Quality Engineering When Taking Banking to the Cloud
Five Priorities for Quality Engineering When Taking Banking to the CloudFive Priorities for Quality Engineering When Taking Banking to the Cloud
Five Priorities for Quality Engineering When Taking Banking to the Cloud
 
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining Focused
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining FocusedGetting Ahead With AI: How APAC Companies Replicate Success by Remaining Focused
Getting Ahead With AI: How APAC Companies Replicate Success by Remaining Focused
 
Crafting the Utility of the Future
Crafting the Utility of the FutureCrafting the Utility of the Future
Crafting the Utility of the Future
 
Utilities Can Ramp Up CX with a Customer Data Platform
Utilities Can Ramp Up CX with a Customer Data PlatformUtilities Can Ramp Up CX with a Customer Data Platform
Utilities Can Ramp Up CX with a Customer Data Platform
 
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
The Work Ahead in Intelligent Automation: Coping with Complexity in a Post-Pa...
 

Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams

  • 1. Cognizant 20-20 Insights September 2020 Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams Healthcare organizations can create a large repository of high-quality labeled medical images at speed and at scale for training AI algorithms by following best practices and by leveraging a data-labeling network of clinicians and non-clinicians. Executive Summary Artificial intelligence (AI)-driven diagnostic models based on medical images from X-rays, MRIs, and CT scans promise to improve outcomes and reduce costs with earlier detection and diagnosis.Their promise is such that AI-driven imaging and diagnostic start-ups were the largest group of all AI and machine learning (ML) start-ups in healthcare in 2019.1 These AI/ML models learn how to diagnose by reviewing and recognizing recurring patterns associated with a vast amount of diverse and accurately labeled medical images. Achieving this large set of labeled medical images is challenging to organizations for a variety of reasons. To comply with privacy regulations, either patients must
  • 2. Cognizant 20-20 Insights 2 / Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams consent to their data being used for research purposes or organizations must de-identify the data. Data set creators must also obtain data from a wide range of patient populations so the data set is diverse and accurately reflects representative disease states and outcomes. Finally, many healthcare organizations assume that all medical images for data sets must be reviewed and labeled by specially trained physicians. This expensive and inefficient route to labeling data sets contributes to the current shortage of AI medical training data available to organizations today. Many images may be processed by well-trained, non- clinician labelers accompanied by physician oversight. By using a combination of both clinically trained and non-clinician labelers, healthcare organizations can more quickly and more inexpensively create comprehensive, accurately labeled medical image data sets and hasten the use of AI for detection and diagnosis support.The keys to successfully using a diverse mix of labelers include understanding the problem that the AI model is meant to solve, evaluating the images against pre-defined requirements, and following a set of best practices. Diverse labelers for diverse data AI/ML diagnostic models must learn from highly accurate training data sets containing a diverse set of diseases and outcomes. Faults in learning data sets may lead to faulty AI diagnostic models. Healthcare organizations often have tapped medical experts to review and label data to guard against inaccurate labels and ensure that a data set is meaningful. However, not all images need that level of expertise to be properly labeled. To determine what expertise is required, healthcare organizations must articulate the complexity of the question that the AI/ML model is meant to solve using these three criteria:
  • 3. Cognizant 20-20 Insights 3 / Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams ❙ Annotation requirements. Factors to consider include the grade of disease state, how many components need to be identified, and the level of detail required in the annotation, which influence the annotation method (e.g., semantic segmentation, bounding boxes). ❙ Imaging modality. These include X-rays, MRI images and CT scans. Some of these are easier to “read”than others.Within a modality, some images also may be of higher resolution than others. ❙ Presentation of symptoms.Some disease states presentwith distinctfeatures,whereas others present more ambiguously.The answers that the healthcare organization requires for its AI/ML model determine how much clinical expertise that the labeling team needs to process the corresponding data set. Disease complexity is proportional to the clinical expertise needed for accurate data labeling (see Figure 1). A diagnostic model designed to merely detect potential tumors in a breast scan will need different data than a model that is meant to precisely stage the cancer. In the former case, non-clinicians trained to spot abnormalities may label images as normal or suspect. Clinicians can then review and stage the subset of suspect images. Disease complexity: Proportional to clinical expertise for accurate data labeling Example of a high-complexity case: Invasive Ductal Carcinoma Example of a low-complexity case: Pulmonary Nodules Current process that a pathologist undertakes to diagnose Invasive Ductal Carcinoma (IDC): Current process to diagnose a malignant pulmonary nodule: Identify tumor as part of a routine mammogram Perform biopsy of the tumor Analyze the biopsy Perform imaging exam (MRI, CT, or PET) for tumor details 1 2 3 4 Gland (acinus) formation ❙ Evaluate the percentage of the tumor that is made up of acini Size and shape of cells ❙ Compare diseased cell nuclei to benign breast epithelium ❙ Evaluate the shape of the diseased cells compared to the benign breast tissue Mitosis rate ❙ Count the number of cells undergoing mitosis Perform imaging exam (X-ray and CT) Discuss medical history (smoking history & family history of cancer) Appearance of nodule ❙ Inhomogeneous versus homogeneous density ❙ Thick versus thin walls Location of nodule ❙ Upper lung versus other locations in the lung Shape of nodule ❙ Margin shape – Irregular or spiculated margins Perform follow-up imaging exam (X-ray and CT) Perform biopsy to determine if malignant 1 2 3 4 Given the expertise required to recognize, annotate, and ultimately diagnose IDC, only a specialist would be able to provide high-quality annotations. Given the relatively straightforward characteristics that are evaluated, the level of domain expertise needed to annotate the nodules is relatively low. Figure 1
  • 4. Cognizant 20-20 Insights 4 / Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams Figure 2 Differing considerations for low-, medium-, and high-complexity cases By starting with a small foundational team, organizations can perfect their labeler selection and quality assurance (QA) processes before scaling to larger projects. The foundational team should consist of physicians, other clinicians and non-physicians, so that it can adapt and flex to label any initial disease state. For a labeling team made up of primarily non- clinical labelers, QA will be built into the processes along with enhanced training and frequent audits of labelers’ work. For an AI/ML model that requires labeling by highly trained physicians, QA likely will involve fewer audits but require more time spent establishing the “ground truth” data (see Figure 2). High-complexity cases Medium-complexity cases Low-complexity cases Description Considerations Labeler mix Complex cases require advanced clinical expertise and often have substantial variation between cases. 1. Complex data labeling annotation types (e.g., grading level of diagnoses, semantic segmentation) 2. Advanced imaging modalities (e.g., MRIs, mammography) 3. Ambiguous symptoms of the disease that are not consistent across cases 4. Minimal or no clinical training needed; likely fewer audits from QA perspective 5. Some training on annotation best practices potentially needed 1. Straightforward data labeling annotation types (e.g., bounding boxes, locating centroids, polygon outlines) 2. Easier imaging modality (e.g., black and white, defined shapes)  3. Distinct patterns of symptoms 4. Some training on annotation best practices potentially needed 5. Likely fewer audits for clinical accuracy 1. Straightforward data labeling annotation types (e.g., bounding boxes, locating centroids, polygon outlines) 2. Easier imaging modality (black and white, defined shapes, etc.) 3. Distinct patterns of symptoms 4. Minimal or no training needed on annotation best practices 5. Substantial clinical training and subsequent quality assurance and audits necessary Physicians With physician oversight, other clinicians and non- clinicians can be trained to identify the disease. Mix of physicians, other clinicians, and non-clinicians Non-clinicians can be trained to locate easily identifiable diseases with training and physician oversight. Non-clinicians with some physician oversight
  • 5. Cognizant 20-20 Insights 5 / Accelerating Medical Labeling for Diagnostic AI: Add Non-clinicians to Labeling Teams Figure 3 Ensuring high-quality results from non-clinical labelers Basing the labeling team’s required clinical expertise QA oversight level on the needs of the AI/ ML model provides organizations with a scalable, and time- and cost-effective approach. It allows organizations to evaluate and label multiple disease states concurrently by reducing reliance on already- busy physicians. This accelerates the creation of an ImageNet repository of medical data. Best practices Healthcare organizations must support their diverse labeling teams with best practices in data pre- processing, training and quality control (see Figure 3). This helps ensure that the resulting labeled data sets meet expectations and are appropriate for AI/ ML diagnostic model training. Data preprocessing. Data preprocessing sets up the project for success by ensuring that the data is properly prepared for the data labelers and by establishing quality parameters and the requirements for the training set. The size and depth of the training set needs to be selected based on statistical significance and the complexity of the proposed problem. An experienced statistician or data scientist is critical to this step. They should advise on the validity of the labeled data and help maintain the project metrics as it scales. In preprocessing, it is also important to ensure that the images are in similar formats with adequate resolution. This preparatory step helps increase efficiency and improves the quality and accuracy of labels by allowing labelers to make precise observations and more consistent annotations. Ground truth. The ground truth data set consists of a subset of data points identified and labeled by an expert. This data set becomes the gold standard benchmark to which all other labelers’ work is compared to monitor the quality of labels and help ensure the project’s ultimate success. Once defined, organizations may use the ground truth data set to identify high-quality labelers and to continuously monitor their performance throughout the project’s lifecycle. Error-seeding, in which random errors are distributed in the image stream, can be based off the ground truth data set to measure how well labelers compare in quality to the domain expert. Training. Regardless of whether the team is composed entirely of physicians or includes non-clinicians, extensive training at the project onset prepares all labelers, sets expectations and safeguards the integrity of the labels. In the initial training, the medical images and labels are broken down to a granular, easy-to-spot level and the labeling requirements are described in a series of simple, clear steps. Ongoing training helps ensure that labelers consistently understand and apply requirements. Industry best practices for creating training materials include leveraging deep domain expertise to define clear labeling criteria. Additionally, cultural competency is a key training component because Develop distinct label parameters Establish ground truth Train annotators to level set expectations Integrate error-seeding exercise to ensure consistency Monitor quality of labels using recall and precision rates
  • 6. Cognizant 20-20 Insights 6 / Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams diagnoses or levels of severity may differ among countries. For example, a mild case of eczema may be defined differently in the U.S. compared to how it is commonly defined in Thailand. By clearly defining disease and diagnosis requirements during training, organizations can avoid many cultural biases in the labels. Quality. In addition to leveraging ground truth data in error-seeding exercises to monitor quality performance, other metrics can be employed to monitor quality.“Recall” refers to the measure of sensitivity and completeness, measuring the percentage of total relevant results that are classified by the machine learning algorithm.“Precision” refers to the measure of exactness, measuring the percentage of results that are relevant. Recall and precision rates cannot be maximized simultaneously; however, the F-1 score provides a mean of precision and recall (see Figure 4). This metric is often used for instances when both precision and recall are important to the problem. To measure whether annotators are labeling with a consistent understanding of the defined labels, inter- rater reliability can also be measured. Inter-rater reliability refers to the test-retest method to measure that annotators are labeling with a consistent understanding of the defined labels. These quantifiable metrics for measuring the accuracy of the data labels provide a distinct way to measure the success of the labels and the model. AI helping AI It is essential that organizations leverage best practices and emerging AI tools to effectively accelerate the creation of a labeled medical image repository at speed and scale. Best practices around data preprocessing, ground truth, training, and quality combined with emerging AI trends and technologies reduce the burdens associated with labeling a high volume of complex images. For example, federated learning is one opportunity that combats the need to host highly sensitive medical data in a single location to train an ML model. In federated learning, each healthcare system may train their respective model based on local training data. Then, the models across healthcare systems are merged to create a master AI/ML model, thereby reducing the compliance risk of sharing data across health systems. Leveraging federated learning and other techniques decreases the time spent on the end-to-end data labeling and training process, thereby providing more efficient and scalable diagnostic model development. Approaches to labeling will vary across teams. The constant should be ensuring that the complexity of the diagnostic question defines the labeler requirements and best practices around labeler selection and model development and training. Careful consideration of these requirements and following best practices are pivotal to accelerating the creation of an accurate and valid repository of labeled medical images. Figure 4 Calculating the F-1 score to capture a mean of precision and recall Recall = True Positive True Positive + False Negative Precision = True Positive True Positive + False Positive F-1 score = 2 Precision * Recall Precision + Recall Endnote 1 From Drug R&D To Diagnostics: 90+ Artificial Intelligence Startups In Healthcare, CB Insights research brief, September 12, 2019, https://www.cbinsights.com/research/artificial-intelligence-startups-healthcare/.
  • 7. Cognizant 20-20 Insights 7 / Accelerating Medical Labeling for Diagnostic AI: Add Non-Clinicians to Labeling Teams About the authors Sashi Padarthy Assistant Vice President, Cognizant Consulting’s Healthcare Practice Sashi Padarthy leads Cognizant’s Digital Strategy and Transformation service line. For more than 20 years, Sashi has helped healthcare organizations in the areas of digital strategy, digital transformation, innovation, technology-enabled strategy, new product development, value-based care and operational improvement. He is a Sloan Fellow from the London Business School. Sashi can be reached at Sashi.Padarthy@cognizant.com | www.linkedin.com/in/sashipadarthy/. Kristin Knudson Consulting Manager, Cognizant Consulting’s Healthcare Practice Kristin Knudson is a Manager within Cognizant Consulting’s Healthcare Practice. She has experience in implementing digital transformation solutions and integrating machine intelligence into care delivery. Kristin holds an MBA in business strategy and a master’s degree in biomedical engineering, allowing her to provide a unique perspective on the advantages of leveraging technology in the healthcare industry. Kristin’s current interest is exploring applications of machine intelligence to advance digital insights and improve the consumer’s overall experience. Kristin can be reached at Kristin.Knudson@cognizant.com | www.linkedin. com/in/kristin-knudson-4266b22a/. Jacqueline Parzivand (Zelener) Senior Consultant, Cognizant Consulting’s Healthcare Practice Jacqueline Parzivand (Zelener) is a Senior Consultant in Cognizant Consulting’s Healthcare Practice. She has valuable experience in payer operational processes, process optimization, business readiness, and medical management.Jacqueline holds a master in health administration and has worked in operations and clinical research, which provides her with a dynamic outlook on the intersection of business outcomes, technology, and healthcare.Jacqueline can be reached at Jacqueline.Parzivand@cognizant.com | www.linkedin.com/in/ jacquelinezelener/. Mary Helen Turnage Senior Consultant, Cognizant Consulting’s Healthcare Practice Mary Helen Turnage is a Senior Consultant within Cognizant Consulting’s Healthcare Practice. She has experience in payer operational readiness as well as process optimization. Mary Helen has an MBA in strategy and finance and has worked in clinical recruitment and strategic execution. She is especially interested in how technology is changing the workforce landscape in healthcare. Mary Helen can be reached at Mary. Turnage@cognizant.com | www.linkedin.com/in/maryhelenturnage/.
  • 8. World Headquarters 500 Frank W. Burr Blvd. Teaneck, NJ 07666 USA Phone: +1 201 801 0233 Fax: +1 201 801 0243 Toll Free: +1 888 937 3277 European Headquarters 1 Kingdom Street Paddington Central London W2 6BD England Phone: +44 (0) 20 7297 7600 Fax: +44 (0) 20 7121 0102 India Operations Headquarters #5/535 Old Mahabalipuram Road Okkiyam Pettai, Thoraipakkam Chennai, 600 096 India Phone: +91 (0) 44 4209 6000 Fax: +91 (0) 44 4209 6060 APAC Headquarters 1 Changi Business Park Crescent, Plaza 8@CBP # 07-04/05/06, Tower A, Singapore 486025 Phone: + 65 6812 4051 Fax: + 65 6324 4051 About Cognizant Healthcare Cognizant’s Healthcare business unit works with healthcare organizations to provide collaborative, innovative solutions that address the industry’s most pressing IT and business challenges — from rethinking new business models, to optimizing operations and enabling technology innovation. As a global leader in healthcare, our industry-specific services and solutions support leading payers, providers and pharmacy benefit managers worldwide. For more information, visit www.cognizant.com/healthcare. About Cognizant Digital Business We help clients build digital businesses and innovate products that create new value — by using sensing, insights, software and experience to deliver on what customers demand in the digital age. Through IoT we connect the digital and physical worlds to make smart, efficient and safe products, operations and enterprises. Leveraging data, analytics and AI we drive intelligent decisions and anticipate where markets and customers are going next. Then we use those insights, combining design and software to deliver the experiences that consumers expect of their brands. Learn more about how we’re engineering the modern enterprise at www.cognizant.com/digitalbusiness. About Cognizant Cognizant (Nasdaq-100: CTSH) is one of the world’s leading professional services companies, transforming clients’ business, operating and technology models for the digital era. Our unique industry-based, consultative approach helps clients envision, build and run more innovative and efficient businesses. Headquartered in the U.S., Cognizant is ranked 194 on the Fortune 500 and is consistently listed among the most admired companies in the world. Learn how Cognizant helps clients lead with digital at www.cognizant.com or follow us @Cognizant. © Copyright 2020, Cognizant. All rights reserved. No part of this document may be reproduced, stored in a retrieval system, transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the express written permission from Cognizant. The information contained herein is subject to change without notice. All other trademarks mentioned herein are the property of their respective owners. Codex 5914