DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
CONSTRUCTION PROJECT ASSIG. III.pptx
1. WEL COME TO MY PRESENTATION
SCHOOL OF CIVIL ENGINEERING AND ARCHITECTURE
COTM DEPARTMENT
CONSTRUCTION PROJECT MANAGEMENT ASSIGNMENT III
PREDICTION OF RISK DELAY IN CONSTRUCTION PROJECTS
USING A HYBRID ARTIFICIAL INTELLIGENCE MODEL
NAME: EDRIS ABDELLA
SUBMITTED TO: Dr. MOHAMMADZEN H. (PhD)
DECEMBER 30 / 2021, DIRE DAWA, ETHIOPIA
2. 1. INTRODUCTION
1.1 Research Background
The success of construction sector is measured by time, cost and
quality performance of construction projects.
Project delays are the major problems tackled by the construction
sector owing to the associated complexity and uncertainty in the
construction activities. (Assaf, S.A.; S. Int. J. Proj. Manag).
Artificial Intelligence (AI) models have evidenced their capacity to
solve dynamic, uncertain and complex tasks.
The aim of this current study is to develop a hybrid artificial
intelligence model called integrative Random Forest classifier with
Genetic Algorithm optimization(RF-GA)for delay problem prediction.
Project delay has an adverse effect on the project performance, which
leads to cost overruns and productivity reduction.
3. 1.2 LITERATURE REVIEW
Recently, artificial intelligence (AI) models have been widely applied in
different fields of engineering and science (Desai, V.S.; Joshi, S. NJ, USA, 2010)
Several studies used AI models in their research on construction project
management.
Elazouni (2006) Used an Artificial Neural Network (ANN) during the
prequalification process before awarding a contract.
The model was used to classify contractors into several groups depending on
their performance.
Chao and Chien (2009) developed a neural network model to estimate the S-
curve in a construction project, which represented by polynomial parameter.
The results displayed that the model can be useful for contractors and owners
in the early planning phase of a construction project.
Desai and Joshi (2010) used a decision tree to analyze labor productivity in
construction projects.
Shin (2011) proposed the AdaBoost algorithm for the selection of a framework
system in construction projects (Kvaraciejiene. J. Civ. Eng. Manag. 2015)
4. Evolution of AI in project management (M.Lahmann, AI will
transform Pm,2018)
.
5. 1.2 LITERATURE REVIEWCONT…
The algorithm was compared with an ANN model and the results revealed that
the AdaBoost algorithm performed with better accuracy than the ANN model.
In another study, Chou and Lin (2012) applied an ensemble method for
disputing problems in public–private partnership projects.
The study used four regression and classification trees and two statistical
techniques to compare the performance of the utilized method.
The study revealed that the ensemble models provided better accuracy that the
individual models.
Rudžianskaite et al. (2015) Integrated a Random Forest algorithm with the
modeling of public–private partnerships in infrastructure projects.
The method performed with good accuracy in the prediction process for public
and private projects.
Heravi and Eslamdoost (2015) investigated the potential of an ANN model for
the prediction of labor productivity in construction projects.
6. 1.2 LITERATURE REVIEWCONT…
The results discovered that the ANN model showed better modeling of labor
productivity.
Gerassis et al. (2016) applied Bayesian networks to analyze the causes of accidents in
embankment construction .
By recalling the related literature review studies, AI model application is still a new
methodology in the field of construction management research and delay risk
prediction [Naji, H. Civ. Eng. J. 2018].
Few studies used AI models in risk prediction and classification.
The authors evidenced the capacity of the decision tree has higher accuracy by 79.41%
over the Naive Bayes model, which showed a lower accuracy value of 73.52% , but
93% by AI (Yaseen, Z.;S.Q.; Al-Ansari, 2019 )
Based on the reported studies in the literature, the current research is established with
the aim of providing a reliable methodology for delay risk prediction that will
contribute to the baseline knowledge of construction management.
the current study is adopted based on the integration of a nature-inspired optimization
algorithm called Genetic Algorithm (GA) with a Random Forest (RF) model.
7. 1.3 Research Objectives
To explore and develop an effective tool to predict delay risk problems
based on delay sources using previous construction projects data.
To provide an accurate methodology that can assist in the prediction of
future durations and monitor risk levels, based on these projects.
To achieve this aim, sources and factors of delay risk are extracted from
the literature, and then related data to the delay risk problems are
collected from previous construction projects.
Based on the complex nature of the construction process and the
associated uncertainties of the delay sources,
a hybrid model based on the integration of the Random Forest and
Genetic Algorithm (RF-GA) is developed in order to analyze the data
of completed previous projects.
The performance of the developed model is studied statistically and
discussed comprehensively.
The potential of the proposed RF-GA model is validated against the
classic RF model.
8. 2. RESEARCH METHODOLOGY
2.1 Random Forest Model
The RF model was first developed by Breiman (2001) based on the
combination of decision tree classifiers.
Each tree provides a prediction for the class label and the algorithm
selects the classes that have the most choices.
Random Forest is a very popular tool that uses the bootstrapping
method to train dataset samples and construct multiple random trees.
The algorithm gained significant important because it is invariant under
scaling and it is robust to the inclusion of irrelevant features.
Several studies examined the application of Random Forest in
engineering applications and demonstrated its feasibility in prediction
processes (Breiman, ASan Diego 2012, Mach. Learn).
9. 2.1 Random Forest Model cont…
Under the bootstrapping method, the data during the training phase are
selected randomly
and independently to develop an RF model, and the data that are not
involved in the selection process are named “out-of-bag” .
The capacity of the random forest has been approved by several
engineering problems such as (Liaw, A. 2002, Holland, J.H. ):
During this process, the out-of-bag data are changed and the prediction
error is measured to estimate the importance of input variables.
Random Forests contain several parameters that need to be optimized,
such as number of trees, minimum gain and maximum tree depth.
10. 2.2 The Hybrid RF-GA Model
Genetic algorithm is a popular technique used to optimize problems in
complex systems based on natural selection .
To solve a problem in a GA algorithm, random solutions are generated, and
then the selection of the population is done to develop the model.
GA has been widely applied in different fields, such as image processing,
pattern recognition and controlling systems (Senouci, A.).
The GA model was used in different research in construction management
such as:
resources optimization , project scheduling , optimizing of time and cost in
construction projects and dispute classification(Autom. Constr. 2009).
In the present study, a GA model is presented to optimize parameters of the RF
model, including number of trees, min gain and max tree depth.
11. 2.3 Model Development Procedure
The experts were involved in different parties, which include client, engineer
and the other experts of these projects. (distributed to 300 experts )
The collected projects were divided into two phases:
70% of the total projects (28 projects) were used for the training phase and
30% (12 projects) for the testing phase of the hybrid intelligence model.
The genetic algorithm was applied to optimize the Random Forest classifier,
and the hybrid RF-GA was used to predict the time performance in
construction projects based on their risk levels by dividing the projects into a
class label describing the predicted time delay.
At first, the model describes the data with a set of records and variable values.
The rows represent the individual projects and the columns represent the value
of each project.
The input variables included owner, designer, contractor, project, material,
equipment, labor and external factors.
The hybrid model was used for learning the dataset in order to predict the time
delay at different levels of time overrun. (see figure 2)
13. 2.4 Model Performance Measures
The performance of the predicted model was evaluated by using class
performance and overall performance measures.
Class performance was measured by precision, sensitivity and
specificity.
The overall performance of the predicted model was evaluated by
accuracy, classification error and kappa statistics.
The kappa coefficient (k) was used in statistics to measure the quality
of an item based on inter classifier agreement (Tabassum,GA optimization
solutions. Int. J. Digit, Kappa. Int. J. Remote Sens. 2011).
14. 3. RESULTS AND DISCUSSION
Analysis of collected data based on 40 projects was conducted to identify the
sources of delay problems effectively. Figure 3 and 4.
15. 4. CONCLUSION
Two approaches were adopted in this study. First, delay sources and factors were collected from a
literature review and identified by an expert meeting.
The collected data included historical records of previous projects that were investigated, and in
order to extract the measure of delay risk in construction projects a questionnaire was prepared
and distributed to 300 experts so as to extract the information about delay sources in construction
projects.
An analysis of data results and distribution of delay sources among the collected previous projects
was implemented in order to better understand delay factors in the construction sector.
Secondly, a hybrid RF-GA model was developed to deal with the complex and dynamic nature of
data in the construction sector.
The RF-GA model was evaluated by performance measure indices and compared with the
classical RF model.
Based on the analysis results, RF-GA revealed a better performance than the RF model.
The RF-GA provided values of accuracy, classification error and Kappa were 91.67%, 8.3%, and
87%, respectively.
These results reflect the ability of the model to handle the nonlinearity and complexity of data in
the construction sector.
The results also revealed the capability of the genetic algorithm in solving problems with multiple
solutions.