SlideShare a Scribd company logo
1 of 7
Download to read offline
EXL EXCELLENCE
QUOTIENT 2021
TEAM- THE GIMMICKS, BIRLA INSTITUTE OF
TECHNOLOGY, MESRA
Understanding Problem Statement
■ Problem statement: The problem statement by EXL Excellence Quotient 2021 required us to develop a tool for a
US public health client to predict daily Covid-19 cases at a county level.
■ Establishing its need for a solution: As COVID-19 outbreak evolves, accurate forecasting plays an extremely
important role in informing policy decisions. Forecasts can help us to contribute to the distribution of medical
supplies in the worst effected areas across the country by determining the medical supply need for individual
hospitals. We hope that our forecasts and data repository can help improve hospitality and vaccination strategy,
guide necessary county-specific decision making and help countries prepare for their continued fight against
COVID-19.
■ The desired outcomes and impact on society: Analysis of the data, predicts the evolution trend of the existing
pandemic data. The predicted forecast will enable an investigation into the connection between
socioeconomic and demographic information with health resource data (number of ICU beds, medical staff )
that will help in gaining a focused understanding of the severity of the pandemic thus, provide guidance about its
prevention and control at the county level. This will overcome one of the greatest challenges of equitable and
efficient allocation of scarce resources and also predict future medical and public health workforce
requirements according to the Identified geographical areas in which staff shortages are most likely to develop.
■ Approach for the solution: The data given with information updated to January 2021 is a time series as it was
collected over a period of time and over a series of constant, regular intervals which are the key characteristics.
We have used deep learning technique to apply ARIMA model of Time series forecasting. It is best suited over
all other methods as it predicts a given time series based on its own past values, utilizing grid search it
recognizes a lot of boundaries that delivers the best-fit model for our time series data. It can be used for any
nonseasonal series of numbers that exhibits patterns and is not a series of random events.
■ Constraints: The dataset with which we were provided with contains many NaN values. We dropped these rows
which resulted in the loss of data. During the training we ignored this point
Solution Design
Forecasting/Modelling Tool Development Process
• The provided dataset with information updated to January 2021 to predict daily Covid-19 cases, is a Time series. Time-series
forecasting is widely used for such dataset as models are capable to predict future values based
on previously observed values.
• Taking the stationarity test by Augmented Dickey-Fuller test, the time series exhibited trending behaviour or non-stationarity
in the mean. This step is very important because the whole results of the regression might be fabricated.  
• Long short-term memory (LSTM) is RNN architecture, used in the field of deep learning. While using an LSTM model we are
free and able to decide what information will be stored and what discarded using ‘gates’. These models are able to store
information over a period of time which is extremely useful when we deal with Time-Series Data. Hence, we have built
a multi-layer LSTM recurrent neural network to predict the predict daily Covid-19 cases from a sequence of values. The
modules used are : Keras, TensorFlow, Pandas, Scikit-Learn & NumPy.
• To boost the performance, the next step in the process was to normalize the data before model fitting. We did this by
implementing Min-Max scaling in Python using scikit-learn and then split the data into training and test sets to
avoid overfitting and to be able to investigate the generalization ability of our model.
• We next built a neural network architecture, a RNN-LSTM architecture was trained on our training data. After the model was
trained, we assigned 1 neuron in the output layer for predicting the the number of confirmed cases on county level. To achieve
good results fast, we have used MSE loss function and the Adam stochastic gradient descent optimizer as it overcomes
redundancy by performing a parameter update for each training by one update at a time.
Data collection
(influenza cases,
COVID-19 cases) and
pre-processing is
carried out.
Apply ARIMA model
of Time series
forecasting.
Build and learn LSTM
model.
Conduct performance
evaluation of
prediction algorithms
on test data and
Covid-19 data.
Forecast Covid
trends in US for
Future.
Validation Results of
Forecasting/Modeling
Tool
ARIMA model: We have used ARIMA
(Auto-Regressive Integrated Moving Average), a class
of models that has the principal objective to correctly
recognize the stochastic mechanism of the time series
i.e., its own lags and the lagged forecast errors, so
that equation can be used to forecast future values.
Performance Evaluation Criteria – Mean Square
Root Error
We have used Root Mean Square Error (RMSE) as a
standard statistical metric to measure model
performance by calculating difference between the
value predicted by the model and the value observed
in the actual environment.
Fig. Methodology to apply the ARIMA model and LSTM algorithm for
Top drivers of the model
RNN Modules: We developed an
ensemble framework that combines
multiple RNN-based deep learning
models using LSTM algorithms that
show high performance in time series
forecasts and are capable to learn
long term dependencies by replacing
the hidden layers of RNN with
memory cells.
Validation Results: The mean squared error (MSE) is
close to zero.
1. To Identify top 5 States expected to be worst affected by COVID-19 pandemic by 30th April 2021
considering the population demographics.
Additional Analysis
▪ Data Source : We have used data of COVID-19 cases in the United States as of March 11, 2021, by state available publicly at
https://www.statista.com/statistics/1102807/coronavirus-covid19-cases-number-us-americans-by-stat
▪ Statistical models used : We consider the exponential model, the logistic model, and the Susceptible Infectious Susceptible (SIS)
model for COVID-19 pandemic prediction at the state level. These models have already been used to predict epidemics like COVID-19
around the world, including China, Ebola outbreak in 2014
The exponential model will estimate the upper bound of the total number of infected people by 30th
April.
The logistic model-based prediction will capture the effect of preventive measures that have already been taken by the respective
state government.
The SIS model will reflect the effect of the major preventive measure like recent crowd gatherings for protests.
▪ State-wise Analysis and Prediction Report: Dependent on inputs from the exponential, logistic, and SIS model along with daily
infection-rates for each state, we interpret the results from different models jointly to conclude :
The top 5 states worst affected by April,30 are:
1. California: As of today California has 4,740 confirmed cases on average in one week. The growth rate is -5%.
2. Texas: As of today Texas has 6,190 confirmed cases on average in this week. The growth rate is -5%.
3. Florida: As of today Florida has 5,136 confirmed cases on average in this week. The growth rate is -5%.
4. New-York: As of today New-York has 7,279 confirmed cases on average in this week. The growth rate is +5%.
5. Illinois: As of today Illionis has 6,190 confirmed cases on average in this week. The growth rate is -5%.
▪ Discussion:The number of confirmed cases alone is not a good metric for estimating risk of exposure because it doesn’t take
population density into account whereas Incidence Rate controls for population density to give number of confirmed cases per
100,000 people.
Drawbacks for predicting the trend of COVID-19 disease and analysing state’s conditions to explain the situation, there is lack of
information about social and political measurements and reactions to illness spread, state categorized data is not available for the
public completely, many algorithms do not work in this restricted dataset due to its time-series nature.
Vaccination Strategy to optimize overall cost of
transport/distribution to minimize the number of new
cases
Aim of the strategy is to save lives and reduce the chain of transmission with AI enabled prioritization for a
fast and effective vaccine delivery. The solution prioritizes the population, considering the limited availability of
vaccines initially and at the same time, limiting the spread of COVID-19.
Expected Results
Prioritization of vaccine delivery for human wellbeing
will help various sectors of community like:
► Health workers at high or very high risk
► Employment categories unable to physically
distance
► Groups living in dense urban neighbourhoods
► Groups living in multigenerational households
It will help to reduce societal and economic
disruption:
► School-aged children to minimize disruption of
education and socioemotional development
► Workers in non-essential but economically critical
sectors, particularly in occupations that do not
permit remote work or physical distancing while
working
Additional variables that could have be used or
explored to improve the model prediction accuracy
■ Bus operators in the UK and Europe have announced a halt to
services, while other operators worldwide are running reduced
services as a result of the latest wave of COVID-19.
■ Many of the London bus drivers who died from coronavirus were
suffering with underlying health conditions particularly high blood
pressure.
■ Since, the population diversity is related to the number of
confirmed cases. For eg: Europeans and American are more prone
to COVID-19 than Asians. The coronavirus (COVID-19) pandemic
has revealed deep-seated inequities in health care for communities
of color and amplifies social and economic factors that contribute
to poor health outcomes. Recent news reports indicate that the
pandemic disproportionately impacts communities of color,
compounding longstanding racial disparities.
■ FirstGroup, the UK’s largest bus company and operator of rail
franchises, has suggested it may not be able to continue trading
because of the impact of COVID-19 on its passenger numbers.
Relaxing the social distance rule from two meters to one meter
allows an underground train to accommodate more people, it was
still limited to 208 people, said Ron Kalifa, board member of
Transport for London.
Through the above facts and reports ,it's evident that transport and
mobility data and population diversity is very much correlated to the
number of COVID-19 confirmed cases. Therefore, transport and
mobility data can be used to make our model perform better.

More Related Content

Similar to EXL Analytics

Leveraging Analytics to Identify High Risk Patients
Leveraging Analytics to Identify High Risk PatientsLeveraging Analytics to Identify High Risk Patients
Leveraging Analytics to Identify High Risk PatientsCitiusTech
 
PARR-combined-predictive-model-final-report-dec06
PARR-combined-predictive-model-final-report-dec06PARR-combined-predictive-model-final-report-dec06
PARR-combined-predictive-model-final-report-dec06Nadya Filipova
 
PREDICTION OF COVID-19 USING MACHINE LEARNING APPROACHES
PREDICTION OF COVID-19 USING MACHINE LEARNING APPROACHESPREDICTION OF COVID-19 USING MACHINE LEARNING APPROACHES
PREDICTION OF COVID-19 USING MACHINE LEARNING APPROACHESIRJET Journal
 
Learning from the Care Quality Commission
Learning from the Care Quality CommissionLearning from the Care Quality Commission
Learning from the Care Quality CommissionNuffield Trust
 
How Carle Health Effectively Integrated Augmented Intelligence
How Carle Health Effectively Integrated Augmented IntelligenceHow Carle Health Effectively Integrated Augmented Intelligence
How Carle Health Effectively Integrated Augmented IntelligenceHealth Catalyst
 
Machine Learning Final presentation
Machine Learning Final presentation Machine Learning Final presentation
Machine Learning Final presentation AyanaRukasar
 
COVID-19 FUTURE FORECASTING USING SUPERVISED MACHINE LEARNING MODELS
COVID-19 FUTURE FORECASTING USING SUPERVISED MACHINE LEARNING MODELSCOVID-19 FUTURE FORECASTING USING SUPERVISED MACHINE LEARNING MODELS
COVID-19 FUTURE FORECASTING USING SUPERVISED MACHINE LEARNING MODELSIRJET Journal
 
2016 indicator reference guide priority pop prevention services
2016 indicator reference guide   priority pop prevention services2016 indicator reference guide   priority pop prevention services
2016 indicator reference guide priority pop prevention services#GOMOJO, INC.
 
ONE BG 9307554036.pdf
ONE BG 9307554036.pdfONE BG 9307554036.pdf
ONE BG 9307554036.pdftanvirali26
 
ONE BG 9307554036.pdf
ONE BG 9307554036.pdfONE BG 9307554036.pdf
ONE BG 9307554036.pdftanvirali26
 
THE REACTION DATA ANALYSIS OFCOVID-19 VACCINATIONS
THE REACTION DATA ANALYSIS OFCOVID-19 VACCINATIONSTHE REACTION DATA ANALYSIS OFCOVID-19 VACCINATIONS
THE REACTION DATA ANALYSIS OFCOVID-19 VACCINATIONSManishReddy706923
 
Worcestershire Office of Data Analytics pilot use case overview
Worcestershire Office of Data Analytics pilot use case overviewWorcestershire Office of Data Analytics pilot use case overview
Worcestershire Office of Data Analytics pilot use case overviewNeill Crump
 
DATI, AI E ROBOTICA @POLITO
DATI, AI E ROBOTICA @POLITODATI, AI E ROBOTICA @POLITO
DATI, AI E ROBOTICA @POLITOMarcoMellia
 
OS20 - The use of foot-and-mouth disease models in disease free and endemic s...
OS20 - The use of foot-and-mouth disease models in disease free and endemic s...OS20 - The use of foot-and-mouth disease models in disease free and endemic s...
OS20 - The use of foot-and-mouth disease models in disease free and endemic s...EuFMD
 
Proposed requirements of the management dashboard at the national level
Proposed requirements of the management dashboard at the national levelProposed requirements of the management dashboard at the national level
Proposed requirements of the management dashboard at the national levelarmabadi
 
[20240401_LabSeminar_Huy]TransferGNN.pptx
[20240401_LabSeminar_Huy]TransferGNN.pptx[20240401_LabSeminar_Huy]TransferGNN.pptx
[20240401_LabSeminar_Huy]TransferGNN.pptxthanhdowork
 
Covid-19 Data Analysis and Visualization
Covid-19 Data Analysis and VisualizationCovid-19 Data Analysis and Visualization
Covid-19 Data Analysis and VisualizationIRJET Journal
 

Similar to EXL Analytics (20)

Leveraging Analytics to Identify High Risk Patients
Leveraging Analytics to Identify High Risk PatientsLeveraging Analytics to Identify High Risk Patients
Leveraging Analytics to Identify High Risk Patients
 
PARR-combined-predictive-model-final-report-dec06
PARR-combined-predictive-model-final-report-dec06PARR-combined-predictive-model-final-report-dec06
PARR-combined-predictive-model-final-report-dec06
 
PREDICTION OF COVID-19 USING MACHINE LEARNING APPROACHES
PREDICTION OF COVID-19 USING MACHINE LEARNING APPROACHESPREDICTION OF COVID-19 USING MACHINE LEARNING APPROACHES
PREDICTION OF COVID-19 USING MACHINE LEARNING APPROACHES
 
Learning from the Care Quality Commission
Learning from the Care Quality CommissionLearning from the Care Quality Commission
Learning from the Care Quality Commission
 
How Carle Health Effectively Integrated Augmented Intelligence
How Carle Health Effectively Integrated Augmented IntelligenceHow Carle Health Effectively Integrated Augmented Intelligence
How Carle Health Effectively Integrated Augmented Intelligence
 
Machine Learning Final presentation
Machine Learning Final presentation Machine Learning Final presentation
Machine Learning Final presentation
 
COVID-19 FUTURE FORECASTING USING SUPERVISED MACHINE LEARNING MODELS
COVID-19 FUTURE FORECASTING USING SUPERVISED MACHINE LEARNING MODELSCOVID-19 FUTURE FORECASTING USING SUPERVISED MACHINE LEARNING MODELS
COVID-19 FUTURE FORECASTING USING SUPERVISED MACHINE LEARNING MODELS
 
Covics-19
Covics-19Covics-19
Covics-19
 
2016 indicator reference guide priority pop prevention services
2016 indicator reference guide   priority pop prevention services2016 indicator reference guide   priority pop prevention services
2016 indicator reference guide priority pop prevention services
 
Covics 19 final
Covics 19 finalCovics 19 final
Covics 19 final
 
ONE BG 9307554036.pdf
ONE BG 9307554036.pdfONE BG 9307554036.pdf
ONE BG 9307554036.pdf
 
ONE BG 9307554036.pdf
ONE BG 9307554036.pdfONE BG 9307554036.pdf
ONE BG 9307554036.pdf
 
THE REACTION DATA ANALYSIS OFCOVID-19 VACCINATIONS
THE REACTION DATA ANALYSIS OFCOVID-19 VACCINATIONSTHE REACTION DATA ANALYSIS OFCOVID-19 VACCINATIONS
THE REACTION DATA ANALYSIS OFCOVID-19 VACCINATIONS
 
Worcestershire Office of Data Analytics pilot use case overview
Worcestershire Office of Data Analytics pilot use case overviewWorcestershire Office of Data Analytics pilot use case overview
Worcestershire Office of Data Analytics pilot use case overview
 
DATI, AI E ROBOTICA @POLITO
DATI, AI E ROBOTICA @POLITODATI, AI E ROBOTICA @POLITO
DATI, AI E ROBOTICA @POLITO
 
OS20 - The use of foot-and-mouth disease models in disease free and endemic s...
OS20 - The use of foot-and-mouth disease models in disease free and endemic s...OS20 - The use of foot-and-mouth disease models in disease free and endemic s...
OS20 - The use of foot-and-mouth disease models in disease free and endemic s...
 
Quality system
Quality systemQuality system
Quality system
 
Proposed requirements of the management dashboard at the national level
Proposed requirements of the management dashboard at the national levelProposed requirements of the management dashboard at the national level
Proposed requirements of the management dashboard at the national level
 
[20240401_LabSeminar_Huy]TransferGNN.pptx
[20240401_LabSeminar_Huy]TransferGNN.pptx[20240401_LabSeminar_Huy]TransferGNN.pptx
[20240401_LabSeminar_Huy]TransferGNN.pptx
 
Covid-19 Data Analysis and Visualization
Covid-19 Data Analysis and VisualizationCovid-19 Data Analysis and Visualization
Covid-19 Data Analysis and Visualization
 

Recently uploaded

Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 

Recently uploaded (20)

Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 

EXL Analytics

  • 1. EXL EXCELLENCE QUOTIENT 2021 TEAM- THE GIMMICKS, BIRLA INSTITUTE OF TECHNOLOGY, MESRA
  • 2. Understanding Problem Statement ■ Problem statement: The problem statement by EXL Excellence Quotient 2021 required us to develop a tool for a US public health client to predict daily Covid-19 cases at a county level. ■ Establishing its need for a solution: As COVID-19 outbreak evolves, accurate forecasting plays an extremely important role in informing policy decisions. Forecasts can help us to contribute to the distribution of medical supplies in the worst effected areas across the country by determining the medical supply need for individual hospitals. We hope that our forecasts and data repository can help improve hospitality and vaccination strategy, guide necessary county-specific decision making and help countries prepare for their continued fight against COVID-19. ■ The desired outcomes and impact on society: Analysis of the data, predicts the evolution trend of the existing pandemic data. The predicted forecast will enable an investigation into the connection between socioeconomic and demographic information with health resource data (number of ICU beds, medical staff ) that will help in gaining a focused understanding of the severity of the pandemic thus, provide guidance about its prevention and control at the county level. This will overcome one of the greatest challenges of equitable and efficient allocation of scarce resources and also predict future medical and public health workforce requirements according to the Identified geographical areas in which staff shortages are most likely to develop. ■ Approach for the solution: The data given with information updated to January 2021 is a time series as it was collected over a period of time and over a series of constant, regular intervals which are the key characteristics. We have used deep learning technique to apply ARIMA model of Time series forecasting. It is best suited over all other methods as it predicts a given time series based on its own past values, utilizing grid search it recognizes a lot of boundaries that delivers the best-fit model for our time series data. It can be used for any nonseasonal series of numbers that exhibits patterns and is not a series of random events. ■ Constraints: The dataset with which we were provided with contains many NaN values. We dropped these rows which resulted in the loss of data. During the training we ignored this point
  • 3. Solution Design Forecasting/Modelling Tool Development Process • The provided dataset with information updated to January 2021 to predict daily Covid-19 cases, is a Time series. Time-series forecasting is widely used for such dataset as models are capable to predict future values based on previously observed values. • Taking the stationarity test by Augmented Dickey-Fuller test, the time series exhibited trending behaviour or non-stationarity in the mean. This step is very important because the whole results of the regression might be fabricated.   • Long short-term memory (LSTM) is RNN architecture, used in the field of deep learning. While using an LSTM model we are free and able to decide what information will be stored and what discarded using ‘gates’. These models are able to store information over a period of time which is extremely useful when we deal with Time-Series Data. Hence, we have built a multi-layer LSTM recurrent neural network to predict the predict daily Covid-19 cases from a sequence of values. The modules used are : Keras, TensorFlow, Pandas, Scikit-Learn & NumPy. • To boost the performance, the next step in the process was to normalize the data before model fitting. We did this by implementing Min-Max scaling in Python using scikit-learn and then split the data into training and test sets to avoid overfitting and to be able to investigate the generalization ability of our model. • We next built a neural network architecture, a RNN-LSTM architecture was trained on our training data. After the model was trained, we assigned 1 neuron in the output layer for predicting the the number of confirmed cases on county level. To achieve good results fast, we have used MSE loss function and the Adam stochastic gradient descent optimizer as it overcomes redundancy by performing a parameter update for each training by one update at a time. Data collection (influenza cases, COVID-19 cases) and pre-processing is carried out. Apply ARIMA model of Time series forecasting. Build and learn LSTM model. Conduct performance evaluation of prediction algorithms on test data and Covid-19 data. Forecast Covid trends in US for Future.
  • 4. Validation Results of Forecasting/Modeling Tool ARIMA model: We have used ARIMA (Auto-Regressive Integrated Moving Average), a class of models that has the principal objective to correctly recognize the stochastic mechanism of the time series i.e., its own lags and the lagged forecast errors, so that equation can be used to forecast future values. Performance Evaluation Criteria – Mean Square Root Error We have used Root Mean Square Error (RMSE) as a standard statistical metric to measure model performance by calculating difference between the value predicted by the model and the value observed in the actual environment. Fig. Methodology to apply the ARIMA model and LSTM algorithm for Top drivers of the model RNN Modules: We developed an ensemble framework that combines multiple RNN-based deep learning models using LSTM algorithms that show high performance in time series forecasts and are capable to learn long term dependencies by replacing the hidden layers of RNN with memory cells. Validation Results: The mean squared error (MSE) is close to zero.
  • 5. 1. To Identify top 5 States expected to be worst affected by COVID-19 pandemic by 30th April 2021 considering the population demographics. Additional Analysis ▪ Data Source : We have used data of COVID-19 cases in the United States as of March 11, 2021, by state available publicly at https://www.statista.com/statistics/1102807/coronavirus-covid19-cases-number-us-americans-by-stat ▪ Statistical models used : We consider the exponential model, the logistic model, and the Susceptible Infectious Susceptible (SIS) model for COVID-19 pandemic prediction at the state level. These models have already been used to predict epidemics like COVID-19 around the world, including China, Ebola outbreak in 2014 The exponential model will estimate the upper bound of the total number of infected people by 30th April. The logistic model-based prediction will capture the effect of preventive measures that have already been taken by the respective state government. The SIS model will reflect the effect of the major preventive measure like recent crowd gatherings for protests. ▪ State-wise Analysis and Prediction Report: Dependent on inputs from the exponential, logistic, and SIS model along with daily infection-rates for each state, we interpret the results from different models jointly to conclude : The top 5 states worst affected by April,30 are: 1. California: As of today California has 4,740 confirmed cases on average in one week. The growth rate is -5%. 2. Texas: As of today Texas has 6,190 confirmed cases on average in this week. The growth rate is -5%. 3. Florida: As of today Florida has 5,136 confirmed cases on average in this week. The growth rate is -5%. 4. New-York: As of today New-York has 7,279 confirmed cases on average in this week. The growth rate is +5%. 5. Illinois: As of today Illionis has 6,190 confirmed cases on average in this week. The growth rate is -5%. ▪ Discussion:The number of confirmed cases alone is not a good metric for estimating risk of exposure because it doesn’t take population density into account whereas Incidence Rate controls for population density to give number of confirmed cases per 100,000 people. Drawbacks for predicting the trend of COVID-19 disease and analysing state’s conditions to explain the situation, there is lack of information about social and political measurements and reactions to illness spread, state categorized data is not available for the public completely, many algorithms do not work in this restricted dataset due to its time-series nature.
  • 6. Vaccination Strategy to optimize overall cost of transport/distribution to minimize the number of new cases Aim of the strategy is to save lives and reduce the chain of transmission with AI enabled prioritization for a fast and effective vaccine delivery. The solution prioritizes the population, considering the limited availability of vaccines initially and at the same time, limiting the spread of COVID-19. Expected Results Prioritization of vaccine delivery for human wellbeing will help various sectors of community like: ► Health workers at high or very high risk ► Employment categories unable to physically distance ► Groups living in dense urban neighbourhoods ► Groups living in multigenerational households It will help to reduce societal and economic disruption: ► School-aged children to minimize disruption of education and socioemotional development ► Workers in non-essential but economically critical sectors, particularly in occupations that do not permit remote work or physical distancing while working
  • 7. Additional variables that could have be used or explored to improve the model prediction accuracy ■ Bus operators in the UK and Europe have announced a halt to services, while other operators worldwide are running reduced services as a result of the latest wave of COVID-19. ■ Many of the London bus drivers who died from coronavirus were suffering with underlying health conditions particularly high blood pressure. ■ Since, the population diversity is related to the number of confirmed cases. For eg: Europeans and American are more prone to COVID-19 than Asians. The coronavirus (COVID-19) pandemic has revealed deep-seated inequities in health care for communities of color and amplifies social and economic factors that contribute to poor health outcomes. Recent news reports indicate that the pandemic disproportionately impacts communities of color, compounding longstanding racial disparities. ■ FirstGroup, the UK’s largest bus company and operator of rail franchises, has suggested it may not be able to continue trading because of the impact of COVID-19 on its passenger numbers. Relaxing the social distance rule from two meters to one meter allows an underground train to accommodate more people, it was still limited to 208 people, said Ron Kalifa, board member of Transport for London. Through the above facts and reports ,it's evident that transport and mobility data and population diversity is very much correlated to the number of COVID-19 confirmed cases. Therefore, transport and mobility data can be used to make our model perform better.