Screening for Cancer Using a Learning Internet Advertising System
Publication date: March 2020.
ACM Journals: ACM Transactions on Computing for Healthcare
Durr-e-Nayab
MPhil (02072213017)
ABSTRACT
Using search engine queries and advertising systems to identify individuals likely to have suspected cancer improves the effectiveness of cancer diagnosis and healthcare utilization.
INTRODUCTION
Search engine queries have proven to be valuable data for understanding real-world experiences, including medical concerns. They have been used for:
o tracking infectious diseases,
o exploring the link between diet and chronic pain,
o identifying early indicators of diseases.
Medical data collection often relies on anonymous sources, so users' medical status must be inferred.
Earlier approaches used self-identified users (SIU) and geographic variability to identify potential medical conditions in users:
o SIU disclose their condition in their search queries; this group consists of more females and younger people.
o While these methods allowed for a larger and more diverse cohort, they lacked clinical information about the users, limiting the certainty of disease inference.
The current work demonstrates for the first time that an ad-serving platform can be used to target at-risk populations for early diagnosis.
CONTRIBUTION
o Clinically verified questionnaires were used to estimate the likelihood of a suspected cancer diagnosis.
o Questionnaire scores correlated with past search engine queries, demonstrating that suspected cancer can potentially be predicted from queries.
o The learning capabilities of advertising systems were used to identify more individuals likely to have suspected cancer.
METHOD
FOCUSED ON THREE TYPES OF CANCER: Lung, Breast, Colon
Users are recruited through targeted ads related to a specific cancer diagnosis.
They click on the ads and are redirected to a dedicated website.
Clinically validated questionnaires are administered there.
Completed questionnaire scores are collected.
First Study
o Bing ads system
o Privileged access to search system data
o Budget: $15 per day
Second Study
o Google ads system
o No privileged access to search system data
o Budget: $15 per day
RECRUITMENT
Users were recruited through ads displayed using the respective ads system.
Recruitment ads were shown when users searched for:
“symptoms of <cancer type>”
“signs of <cancer type>”
“diagnosis <cancer type>”
“questionnaire <cancer type>”
“quiz <cancer type>”
Ads contained one of the following three titles:
“<cancer type> Do you have it?”
“<cancer type> Think you have it?”
“<cancer type> - Worried you have it?”
The text of the ads was “Click here to check if you should see a doctor” (or physician).
All ads were shown with equal probability.
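The "equal probability" rule can be illustrated with a short Python sketch. This is only an illustration and not the ad platform's actual mechanism; the names `AD_TITLES`, `AD_TEXT`, and `pick_ad` are hypothetical.

```python
import random

# Title templates as listed on the slide; "{cancer}" stands for <cancer type>.
AD_TITLES = [
    "{cancer} Do you have it?",
    "{cancer} Think you have it?",
    "{cancer} - Worried you have it?",
]
AD_TEXT = "Click here to check if you should see a doctor"


def pick_ad(cancer_type, rng=random):
    """Pick one of the three titles uniformly at random (hypothetical helper)."""
    title = rng.choice(AD_TITLES).format(cancer=cancer_type)
    return {"title": title, "text": AD_TEXT}


if __name__ == "__main__":
    print(pick_ad("Lung cancer"))
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible.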
Questionnaires
▣ People who clicked on these ads were referred to a specially designed website.
▣ The questionnaire was developed based on the guidelines of the UK National Institute for Health and Care Excellence (NICE).
ALGORITHM 1: A simplified flow chart of the cancer questionnaire, based on the NICE guideline
_______________________________________________________________________________
Result: Suspected cancer score (SCS)
if you are 30 years or over then
    if you have <symptoms according to the specific cancer, i.e., lung, colon, or breast cancer> then
        SCS = High;
    else
        SCS = Low;
    end
end
if you are 50 years or over then
    if you have <symptoms according to the specific cancer, i.e., lung, colon, or breast cancer> then
        SCS = High;
    else
        SCS = Low;
    end
end
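The flow chart above can be transcribed line by line into Python, as a minimal sketch. The slide does not list the actual symptoms for each age tier, so the two symptom sets are passed in as parameters and the values in the usage example are placeholders, not clinical criteria. The function name `suspected_cancer_score` is hypothetical, and returning `None` for users under 30 is an assumption, since the slide does not cover that case.

```python
def suspected_cancer_score(age, reported, symptoms_30_plus, symptoms_50_plus):
    """Literal transcription of Algorithm 1 (hypothetical helper).

    reported, symptoms_30_plus and symptoms_50_plus are sets of symptom names.
    """
    scs = None  # assumption: no score is assigned below age 30
    if age >= 30:
        scs = "High" if reported & symptoms_30_plus else "Low"
    if age >= 50:
        # As in the flow chart, the second block runs after the first,
        # so its result replaces the earlier one for users aged 50 or over.
        scs = "High" if reported & symptoms_50_plus else "Low"
    return scs


if __name__ == "__main__":
    # Placeholder symptom names, for illustration only.
    tier_30 = {"symptom_a"}
    tier_50 = {"symptom_b"}
    print(suspected_cancer_score(42, {"symptom_a"}, tier_30, tier_50))
```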
SCS RATE
▣ High SCS: advised to consult an oncologist within 2 weeks.
▣ Low SCS: advised that their symptoms were not commonly associated with cancer, but that they should see a medical doctor if the symptoms were persistent or worrying.
STUDY 1
Campaign performance:
Recruitment ads were shown 159,170 times and clicked 2,899 times.
1,285 questionnaires were started and 681 were completed (53% completion rate).
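The Study 1 campaign rates follow directly from the reported counts. The Python sketch below is plain arithmetic on the figures from the slide; the helper name `rate` is hypothetical.

```python
def rate(numerator, denominator):
    """Return numerator / denominator, rejecting an empty denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# Counts reported for Study 1.
impressions = 159_170
clicks = 2_899
started = 1_285
completed = 681

ctr = rate(clicks, impressions)            # overall clickthrough rate
completion_rate = rate(completed, started)  # share of started questionnaires finished

if __name__ == "__main__":
    print(f"Clickthrough rate: {ctr:.1%}")
    print(f"Completion rate:   {completion_rate:.0%}")
```

The overall clickthrough rate comes out at about 1.8%, and the completion rate matches the 53% reported for the study.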
Fig. 1. Clickthrough rates on ads by cancer type and age group.
It took an average of 126 seconds to complete the questionnaires.
Total data analyzed: 288 people (185 lung, 81 colon, 22 breast).
Prediction of questionnaire outcome
A combined model was initially used to screen for all cancers together. The ROC curve showed that the area under the curve (AUC) was 0.66, indicating that it is possible to identify individuals who are highly likely to have the suspected cancer.
AUCs for the different cancers when trained separately: colon 0.74, lung 0.56, breast 0.50.
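For readers unfamiliar with the metric, the AUC can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, so 0.5 corresponds to random ranking. The Python sketch below computes it by pairwise comparison. It is not the model or code used in the study; the function name `roc_auc` is hypothetical, and the scores and labels in the usage example are made up for illustration.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative scores.

    labels are 1 (e.g. High SCS) or 0 (Low SCS); ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up scores and labels, for illustration only.
    print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```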
STUDY 2
Campaign performance:
Recruitment ads were shown 70,586 times and clicked 6,484 times.
Clickthrough Rates
2,917 people began the questionnaire and 1,049 completed it.
The study tracked the conversion rate over time, defined as the percentage of individuals who clicked on the ads and were found to have a high SCS.
Approximately 1 out of 9 individuals who clicked on the ads was suspected to have cancer.
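Tracking the conversion rate over time amounts to dividing, for each day, the number of high-SCS outcomes by the number of ad clicks. The Python sketch below assumes one record per click, given as `(day, high_scs)`; this record layout and the function name `daily_conversion_rates` are hypothetical, and the sample events are made up for illustration rather than taken from the study.

```python
from collections import defaultdict


def daily_conversion_rates(events):
    """Map each day to (high-SCS clicks) / (all clicks) for that day.

    events: iterable of (day, high_scs) pairs, one per ad click.
    """
    clicks = defaultdict(int)
    conversions = defaultdict(int)
    for day, high_scs in events:
        clicks[day] += 1
        if high_scs:
            conversions[day] += 1
    return {day: conversions[day] / clicks[day] for day in sorted(clicks)}


if __name__ == "__main__":
    # Made-up example: 9 clicks on day 1 (1 high SCS), 4 clicks on day 2 (1 high SCS).
    sample = [(1, i == 0) for i in range(9)] + [(2, i == 0) for i in range(4)]
    for day, cr in daily_conversion_rates(sample).items():
        print(f"day {day}: {cr:.1%}")
```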
The campaign analyzed the keywords that triggered the ads and the demographic information of the users.
Average conversion rates: breast 11%, colon 9%, and lung 9%.
Study 2 demonstrated that the advertising system
successfully learned to identify individuals with suspected
cancer and optimized its performance through keyword
selection and demographic targeting.
The suggested approach has the potential to screen for
severe medical conditions in marginalized populations
and alleviate concerns for individuals who do not have
suspected cancer but are anxious about their symptoms.
Nevertheless, additional research is required to validate
its effectiveness and assess its cost-efficiency.


Editor's Notes

  • #23 Campaign performance. Recruitment ads were shown 159,170 times and clicked 2,899 times during this experiment. Clickthrough rates for different conditions were similar, ranging from 1.2% (breast cancer) to 4.8% (colon cancer). Females and males were similarly likely to click on the ads for colon and lung cancer, but females were 2.0 times more likely to click on ads for breast cancer. Clickthrough rates on ads, by cancer type and age group, are shown in Figure 1. As the figure shows, although the range of clickthrough rates is similar, older people tended to click more on ads, with the exception of breast cancer ads, which were also clicked by younger people. Throughout the experiment, 1,285 questionnaires were started and 681 were completed (53% completion rate). It took an average of 126 seconds to complete the questionnaires. After excluding people who did not consent to participate in the study and people who did not have a query history of at least 14 days, the data from the 288 remaining people (185 lung, 81 colon, and 22 breast) were analyzed.
  • #24 The model's performance was evaluated using receiver operating characteristic (ROC) curve analysis, specifically for the detection of the three cancers in question.
  • #27 The conversion rate, which indicates the percentage of people who took the desired action (in this case, clicking on an ad and then being found to have a high SCS), was tracked over time.
  • #30 Previous studies have demonstrated the potential of using search engine queries to screen for different types of cancer, but this study uses clinically verified questionnaires to identify people with suspected cancer and correlate their queries with the questionnaire outcomes.