3. INTRODUCTION & MOTIVATION
▪ Spread of Coronavirus: possibility of a 2nd wave as the country attempts to fully unlock
▪ Forecasting COVID-19 is difficult, given unreasonable modelling assumptions and a lack of good-quality data
▪ Developed a real-time monitoring dashboard with risk metrics to provide actionable insights
▪ Difference in trends for the risk metrics across states
4. A GLIMPSE OF THE DASHBOARD
Source: https://covid-isical.tech/
7. METHODOLOGY (1/2)
▪ Some states/ regions were more affected compared to others
▪ Motivation to identify underlying structures within COVID-19 cases
▪ Performed vanilla k-means clustering with the Euclidean distance on the 20 most affected states
▪ Chose the value of k iteratively, by experimenting and checking the standard silhouette score and elbow metric
▪ Performed hierarchical clustering on the cases using Ward's linkage
1 Cluster Analysis
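The choice of k described above can be sketched as follows. This is a minimal, standard-library illustration rather than the authors' pipeline: the two-dimensional feature vectors are made-up toy points (not state-level case data), `farthest_first` is a hypothetical deterministic seeding chosen only for reproducibility, and the elbow metric and the Ward-linkage step are not shown.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding (an assumption, not from the slides)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means with Euclidean distance; returns one label per point."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all points; singletons score 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy feature vectors forming three well-separated groups (hypothetical data).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
          (9.0, 1.0), (9.1, 1.2), (8.9, 0.9)]

# Try several k and keep the one with the highest mean silhouette score.
best_k = max(range(2, 6), key=lambda k: silhouette(points, kmeans(points, k)))
print(best_k)
```

In practice the silhouette score would be read alongside the elbow plot, since either metric alone can favour a k that is not meaningful for the data.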
8. METHODOLOGY (2/2)
▪ Used the testing date and the hospital discharge date of individual patients in Karnataka to calculate the length of stay in hospitals
▪ Estimated how the survival probability progresses over days using the Kaplan-Meier estimator of the survival function
▪ Carried out cohort selection with Age and Gender as the variates
▪ Calculated the Gender and Age stratified Kaplan-Meier estimate
▪ Performed the standard non-parametric log-rank test to statistically compare the survival probabilities of the various strata
2 Survival Analysis
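The length-of-stay and Kaplan-Meier steps above can be sketched as follows. This is a hedged, standard-library illustration: the function names, the dates and the toy durations are assumptions made for the example, and the event indicator stands for whatever outcome the study treats as the event (1 = observed, 0 = censored). The median uses the common convention of the first time at which the estimated survival probability drops to 0.5 or below.

```python
from datetime import date

def length_of_stay(tested, discharged):
    """Days between the testing date and the discharge date."""
    return (discharged - tested).days

def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct event time.

    S(t) is the running product of (1 - d_t / n_t), where d_t is the number
    of events at time t and n_t the number of patients still at risk.
    """
    event_times = sorted({t for t, e in zip(durations, events) if e})
    surv, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        d_t = sum(1 for d, e in zip(durations, events) if d == t and e)
        surv *= 1 - d_t / at_risk
        curve.append((t, surv))
    return curve

def median_survival(curve):
    """First time with S(t) <= 0.5, or None if the curve never reaches 0.5."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical patient: tested on 1 July 2020, discharged on 9 July 2020.
stay = length_of_stay(date(2020, 7, 1), date(2020, 7, 9))

# Toy cohort (made-up numbers, not the Karnataka data).
durations = [2, 3, 3, 5, 8, 8, 10, 12]
events = [1, 1, 0, 1, 1, 0, 1, 0]
curve = kaplan_meier(durations, events)
print(stay, median_survival(curve))
```

A stratified estimate is the same computation run separately on each stratum (for example, one call per gender or age band). A median of None corresponds to the NA that survival software reports when too few events occur for the curve to reach 0.5.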
9. RESULTS AND DISCUSSIONS
1 Cluster Analysis
▪ MH is the worst-affected state, while KL, DL, UP and WB are among the second worst-hit states.
▪ Such clustering analysis, performed across the last 7-8 months, gives a better view of how the states shift between clusters
Cluster Analysis based on Active Cases
10. RESULTS AND DISCUSSIONS
2 Survival Analysis
▪ In our cohort, the mean age is 45 years and the mean/median time to recovery is 8 days.
Ages & Stay distribution of patients
11. RESULTS AND DISCUSSIONS
2 Survival Analysis
▪ The vanilla KM estimate on the entire cohort gives a median survival time of 16 days.
Survival Analysis on the entire cohort
12. RESULTS AND DISCUSSIONS
2 Survival Analysis
▪ We stratified the cohort by gender and found that the median survival time differs by 7 days between males and females
Survival Analysis on the Gender stratified cohort
           n  events  median  0.95LCL  0.95UCL
Sex=F   9290    3531      21       19       23
Sex=M  17451    7979      14       14       15
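A comparison between two strata like the one above is what the log-rank test formalises. The sketch below is a minimal two-group version using only the standard library; the function name and the toy durations are assumptions made for illustration, not the cohort data. At each event time it compares the observed events in group A with the number expected if both groups shared one survival curve, then refers the resulting statistic to a chi-square distribution with 1 degree of freedom.

```python
import math

def logrank_test(dur_a, ev_a, dur_b, ev_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    times = sorted({t for t, e in zip(dur_a + dur_b, ev_a + ev_b) if e})
    observed_a = expected_a = variance = 0.0
    for t in times:
        n_a = sum(1 for d in dur_a if d >= t)  # at risk in group A
        n_b = sum(1 for d in dur_b if d >= t)  # at risk in group B
        d_a = sum(1 for d, e in zip(dur_a, ev_a) if d == t and e)
        d_b = sum(1 for d, e in zip(dur_b, ev_b) if d == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance of the events in group A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Toy strata (made-up): group A has events early, group B late.
chi2, p_value = logrank_test([1, 2, 3, 4, 5], [1, 1, 1, 1, 1],
                             [6, 7, 8, 9, 10], [1, 1, 1, 1, 1])
print(chi2, p_value)
```

With more than two strata (such as the three age bands) the statistic generalises to a chi-square with one fewer degrees of freedom than the number of groups, which this sketch does not cover.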
13. RESULTS AND DISCUSSIONS
2 Survival Analysis
▪ We stratified the cohort by age and found that older people (>60 years) have a much shorter median survival time of 4 days, while adults (18-60 years) have a much longer median survival time of 33 days.
Survival Analysis on the Age stratified cohort
              n  events  median  0.95LCL  0.95UCL
<18 yr     2295      37      NA       NA       NA
18-60 yr  16380    4571      33       31       37
>60 yr     8066    6902       4        4        4
15. RESULTS AND DISCUSSIONS
2 Survival Analysis
Cohort Characteristics of Gender and Age stratified KM estimate
▪ For older people (>60 years), the median survival time differs little across gender, but for the adult cohort (18-60 years) it differs by 4 days across the gender strata.
16. RESULTS AND DISCUSSIONS
2 Survival Analysis
Cox Proportional Hazard
▪ Fitted a semiparametric Cox proportional hazards model to quantify the effect of age on mortality
▪ The Cox PH model gives a hazard ratio, which is interpreted as the ratio of hazards between two groups (Male & Female) at any particular point in time.
▪ The estimated hazard ratio came out to be 1.26 with a 95% confidence interval, which implies that male gender is associated with a 1.26 times higher hazard, i.e. decreased survival.
▪ The male population in our cohort is more vulnerable; the reason could be traced to higher comorbidity and greater exposure to the virus through relatively more contact with riskier environments.
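To make the hazard-ratio reading concrete, the sketch below fits a Cox model with a single binary covariate by Newton-Raphson on the partial likelihood and reports exp(beta) with a Wald 95% interval. It is an illustration under stated assumptions, not the fitted model behind the 1.26 figure: the function name, the 0/1 coding and the toy durations are made up, and ties are handled with the Breslow approximation.

```python
import math

def cox_single_covariate(durations, events, x, iters=25):
    """Fit beta for one covariate; returns (beta, standard error)."""
    beta = 0.0
    for _ in range(iters):
        grad = info = 0.0
        for t_i, e_i, x_i in zip(durations, events, x):
            if not e_i:
                continue  # censored rows only contribute through risk sets
            risk = [x_j for t_j, x_j in zip(durations, x) if t_j >= t_i]
            w = [math.exp(beta * x_j) for x_j in risk]
            s0 = sum(w)
            s1 = sum(w_j * x_j for w_j, x_j in zip(w, risk))
            s2 = sum(w_j * x_j * x_j for w_j, x_j in zip(w, risk))
            grad += x_i - s1 / s0                # score contribution
            info += s2 / s0 - (s1 / s0) ** 2     # observed information
        step = grad / info
        beta += step
        if abs(step) < 1e-9:
            break
    return beta, 1 / math.sqrt(info)

# Toy data (made-up): x = 1 marks the group with slightly earlier events.
durations = [3, 5, 7, 9, 2, 4, 6, 8]
events = [1, 1, 1, 1, 1, 1, 1, 1]
x = [0, 0, 0, 0, 1, 1, 1, 1]

beta, se = cox_single_covariate(durations, events, x)
hazard_ratio = math.exp(beta)
ci_low, ci_high = math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)
print(hazard_ratio, ci_low, ci_high)
```

A hazard ratio above 1 means the group coded 1 has the higher instantaneous event rate at any time point; an interval that contains 1 would mean the data do not clearly separate the two groups.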