This session was recorded in San Francisco on February 9th, 2019 and can be viewed here: https://youtu.be/dE2ntPX9WeQ
In this talk, Oleksii Barash PhD, IVF Laboratory Research Director at the Reproductive Science Center of the San Francisco Bay Area, will discuss his team’s approach to applying machine learning for decision making during infertility treatment. Oleksii will also give a quick overview of how he uses Driverless AI to build models for predicting IVF outcomes.
Bio: Oleksii believes that evidence-based clinical decisions will greatly improve the efficiency and safety of medicine. He received his Master's degree in Clinical Embryology from the University of Leeds (UK) and a PhD in Cell Biology. The ultimate goal of his work is to transform medical records into medical knowledge.
Oleksii Barash, Reproductive Science Center of Bay Area - Machine Learning in Reproductive Science
1. ML in Reproductive Science: human embryo selection and beyond
Oleksii Barash, PhD
IVF Laboratory Research Director
Reproductive Science Center of SF Bay Area
@oleksii.barash
#H2OWORLD
2. What is infertility?
Scope of the problem
• Infertility affects 12% of the reproductive-age population in the US (≈12 million people)
• Infertility affects men and women equally
• More than 50% of infertility patients will have a baby with IVF (In Vitro Fertilization) treatment
• Over 1.5M IVF cycles per year worldwide (≈ 200,000 in USA) in 2014
• Cost of one IVF cycle in US: $10K – $100K
• Global IVF market $30-40bn
3. IVF is essentially manufacturing
• Complex multidimensional process;
• Constant intake flow of the patients;
• Cutting edge labor and equipment;
• Hundreds of contributing factors (Lab + Clinical);
• Every patient is unique – limited standardization
The ultimate goal is a single healthy baby
5. Data is too large to handle manually
• Wide Electronic Medical Records adoption (2004 - 2015);
• IoT devices – sensors, incubators, microscopes, lasers
• Morpho-kinetics (time-lapse)
• Preimplantation Genetic Testing
• “Omics” era is coming
6. Life in vitro – up to 6-7 days
• From 0 to 30+ embryos per IVF cycle (≈15 000 embryos per year at RSC)
• Many features per embryo
• Critical choice – no second chance
7. Non-invasive imaging and predictions
• Xtend algorithm:
• over 1,000 combinations of potential parameters
• includes egg age, cell count and Post P3 analysis – which measures cell activity after the four-cell stage
• Post P3 is the result of a proprietary analysis based on 74 computer-based attributes that are combined into one parameter
• each embryo gets a developmental potential score ranging from 1 (highest) to 5 (lowest).
• 84% specificity vs 52% by traditional assessment
• The odds ratio of predicting blastocyst formation is 2.57 vs 1.67 by traditional assessment
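The specificity and odds-ratio figures above are both derived from a 2x2 table of predicted versus observed outcomes. A minimal sketch of the arithmetic follows; the counts are hypothetical and chosen only for illustration (they are not taken from the Xtend study), and it assumes the odds ratio quoted on the slide is the usual diagnostic odds ratio.

```python
def specificity(tn, fp):
    """Share of true negatives among all actual negatives."""
    return tn / (tn + fp)


def odds_ratio(tp, fp, fn, tn):
    """Diagnostic odds ratio: (TP / FN) divided by (FP / TN)."""
    return (tp * tn) / (fp * fn)


# Hypothetical counts, for illustration only (not data from the talk).
tp, fp, fn, tn = 60, 16, 40, 84

print(f"specificity = {specificity(tn, fp):.2f}")
print(f"odds ratio  = {odds_ratio(tp, fp, fn, tn):.3f}")
```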
10. Live birth rate
[Table header showing the dataset columns: Embryo_Age, Blastulation_rate, Donor_eggs, Euploidy_rate, Number_of_normal, d5_to_total_ratio, Total_day_5_bx, Total_day_6_bx, Total_for_biosy, Bx_Day, Emb_Expansion, ICM, TE, Gender, Best_Embryo_For_ET, Elective_SET, Cycle_number, Number_of_Follicles, Zygotes, Fert_rate, Unfert, M2, M1, GV, ATR, Multi_PN, PN_1, Degenerated, Cleaved, Cleavage_rate, Number_ext_culture, Good_ext_culture, Number_to_bl, Number_Cryo, Good_d3_rate, TVA_MD, Number_of_tarnsfers_to_delivery, Semen_Source, Fresh_Frosen_sp, BMI, PATIENTTYPETEXT, NO_OF_DAYS, SUMSTIM, ASPIRATED_OOCYTES, HCG_DRUG, TOTAL2PN, GRAVIDITY, PREM, TERM, SAB, BIOCHEMICAL, LIFETIME_SMOKED, PRIORIVF, PRIORFET, PRIORIUI, HEIGHT, WEIGHT, PRIMARYDIAGNOSIS, SEMENSOURCE, FSHLEVEL, NEAREST_AMH, MED1, Peak_E2, TOTALIUS, FOLLICLES_BIGGER_THAN_14, ASPIRATED_OOCYTES, NO_FROZEN, NO_VIT, INITIALCONSULT_PREM, INITIALCONSULT_GRAVIDITY, INITIALCONSULT_SAB, INITIALCONSULT_TERM, INITIALCONSULT_BIOCHEMICAL, Stim protocol]
Factors affecting clinical outcomes
More factors?
Bias?
Reproducibility?
Live birth rate:
• Maternal age
• Number of embryos for biopsy
• Morphology of the embryos
• SET vs eSET
• D5 vs D6 biopsy
• Total gonadotropin dosage
• Number of previous failed cycles
• Number of normal embryos per cycle
• Number of eggs
• Euploidy rate
Presented by RSC team: ASRM 2016, 2015, 2014; ESHRE 2015, 2016; PCRS 2014, 2015, 2016; PGDIS 2015, 2017
11. What if we can evaluate ALL available factors?
12. What if we can assess ALL available factors?
20 factors: 20² = 400 plots (20 x 20)
381 factors: 381² = 145,161 plots
Machine Learning
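The plot counts on this slide come from squaring the number of factors, i.e. one cell per ordered pair in an n x n scatter-plot matrix. A small sketch of that arithmetic is below; the second function, counting only distinct unordered pairs, is an addition for comparison and does not appear on the slide.

```python
def pairwise_plot_count(n_factors):
    """Cells in a full n x n scatter-plot matrix (every ordered pair)."""
    return n_factors ** 2


def unique_pair_count(n_factors):
    """Distinct unordered pairs of different factors (comparison only)."""
    return n_factors * (n_factors - 1) // 2


for n in (20, 381):
    print(f"{n} factors: {pairwise_plot_count(n):,} plots, "
          f"{unique_pair_count(n):,} unique pairs")
```

Even when only unique pairs are counted, hundreds of factors yield tens of thousands of plots, which is too many to review by eye.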
13. Lab + Clinical factors, 11k embryos, >2000 patients
[Chart: Pregnant, % vs Non-Pregnant, %, shown as % of total SETs]
Presented by RSC team at ASRM 2017
IVF lab
Embryo_Age
Blastulation_rate
Donor_eggs
Euploidy_rate
Number_of_normal
d5_to_total_ratio
Total_day_5_bx
Total_day_6_bx
Total_for_biopsy
Bx_Day
Embryo_Morphology
Expansion
ICM
TE
Gender
Clinical_Outcome
BEST_ EMBRYO_FOR_ET
ELECTIVE_SET
Number_of_tarnsfers_to_delivery
Biopsy tech
CYCLE #
PEAK E2
TVA MD
TVA TECH
# Follicles >12 mm
# EGGS
# INSEM
# 2PN
% FERT
# UNFERT
#M2 or mature
# INT
# IMM
# ATR
# > 2PN
# 1PN
# DEG
FERT CK TECH
ICSI TECH
SEMEN SOURCE
FRESH/FROZEN SP
CLEAVED
% CLEAVED
HATCH TECH
# EXT CULTURE
# GOOD EXT CULT
# TO BLAST
# CRYO
% OF GOOD QUALITY EMBRYOS
…
clinical
BMI
PRIMARY_DX
PATIENTTYPETEXT
LUPRON
STIM
GNRHA
MED1
SUMSTIM
TRANSFER_DATE
HCG_DRUG
GRAVIDITY
PREM
TERM
SAB
BIOCHEMICAL
PATIENTRACE
LIFETIME_SMOKED
SMOKING_FREQ
PRIORIVF
PRIORFET
PRIORIUI
HEIGHT
WEIGHT
STIMPROTOCOL
LUPRONPROTOCOL
PRIMARYDIAGNOSIS
SECONDARYDIAGNOSIS
TERTIARYDIAGNOSIS
SEMENSOURCE
PATIENTTYPE
FSHLEVEL
E2LEVEL
NEAREST_AMH
AFC
MED1
MED2
MED3
MED4
MAX_E2
TOTALIUS
FERT_METHOD_ICSI
FERT_METHOD_IVF
INITIALCONSULT_PREM
INITIALCONSULT_GRAVIDITY
INITIALCONSULT_SAB
INITIALCONSULT_TERM
INITIALCONSULT_BIOCHEMICAL
Stim protocol
…
320 variables per patient
15. Building the model to predict IVF outcome
Only weak predictors are present
Relatively small sample size
A lot of features (>300)
Accuracy of predictions = 0.8412
AUC = 0.8236
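For readers unfamiliar with the two metrics quoted above, here is a minimal, dependency-free sketch of how accuracy and AUC can be computed from predicted probabilities. The labels and scores are made up for illustration, the 0.5 threshold is an assumption, and this is not the pipeline used in the talk (which relied on Driverless AI).

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of cases where the thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == t) for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = positive outcome) and model scores.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

print("accuracy:", accuracy(y_true, y_prob))
print("AUC:", roc_auc(y_true, y_prob))
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two are often reported together.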
16. Building the model to predict IVF outcome
• Benchmark AUC – Starting point
• Feature engineering
• Feature importance
• Feature transformations
• Non-important features
• Model interpretation
• Time series
17. ReproScore (the probability of a positive outcome)
Patient Name   Embryo Morphology   Genetics                   ReproScore   FET date
Patient A      3AA                 Euploid                    0.692727     12/17/2017
               3AB                 45, XX; Monosomy 7         0.692415
               5B-B-               47, XY; Tri/polysomy 16    0.648626
               5BB                 Euploid                    0.674588     6/4/2015
               2B-B-               47, XY; Tri/polysomy 9     0.647992
               5B-B                47, XY; Tri/polysomy 6     0.666277
Patient B      2BB                 Euploid                    0.407558     5/18/2018
               5AA                 47, XY; Tri/polysomy 16    0.372037
               5AB                 Euploid                    0.364438
               3AB                 Euploid                    0.364438     6/6/2017
[Chart: Number of patients (left axis, 0 to 600) and Actual clinical PR, % (right axis, 0% to 100%) by Predicted probability of Positive outcome, in bins from 0.1-0.2 to 0.8-0.9]
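A chart of this kind can be produced by grouping cases into bins of predicted probability and computing the observed positive rate in each bin. The sketch below shows one way to do that; the data are hypothetical, the function name is illustrative, and it assumes "PR" stands for pregnancy rate.

```python
def calibration_table(y_true, y_prob, edges):
    """Per bin [lo, hi): number of cases and observed positive rate."""
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        outcomes = [t for t, p in zip(y_true, y_prob) if lo <= p < hi]
        n = len(outcomes)
        rate = sum(outcomes) / n if n else None  # None marks an empty bin
        rows.append((lo, hi, n, rate))
    return rows


# Bin edges 0.1, 0.2, ..., 0.9, matching the bins on the chart.
edges = [i / 10 for i in range(1, 10)]

# Hypothetical predictions and outcomes (1 = clinical pregnancy).
y_prob = [0.15, 0.35, 0.36, 0.65, 0.67, 0.69, 0.85]
y_true = [0, 0, 1, 1, 0, 1, 1]

table = calibration_table(y_true, y_prob, edges)
for lo, hi, n, rate in table:
    shown = "n/a" if rate is None else f"{rate:.0%}"
    print(f"{lo:.1f}-{hi:.1f}: n={n}, actual rate={shown}")
```

If the model is well calibrated, the observed rate in each bin should rise roughly in step with the predicted probability.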
18. What lies beyond?
Personalized decisions
• Where I am:
• Can I have a baby (age, medical history, genetic profile)?
• What are my chances?
• Can I afford it?
• How to choose treatment plan:
• Hormonal Stimulation protocol / dosage / duration
• Luteal support, etc.
• How many embryos to transfer (1, 2 or 3)
• Which embryo to transfer:
• Morphological screening
• Genetic screening
• Gender
20. Conclusion
1. Machine learning is not yet widely used in clinical practice
2. Augmented decision making with machine learning
3. AutoML for rapid experimentation and knowledge discovery
4. Transition from knowledge driven to data driven care
5. This is a personal revolution as much as an analytical one