A MAJOR PROJECT REPORT
ON
“HEAVY VEHICLE FUEL EFFICIENCY PREDICTION USING
MACHINE LEARNING”
Submitted to
SRI INDU COLLEGE OF ENGINEERING AND TECHNOLOGY
In partial fulfillment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
Submitted
By
N. SOUJANYA [20D41A05E6]
P. JAGADEESH [20D41A05G5]
P. NARESH [20D41A05G3]
M. NAGARAJU [20D41A05C4]
Under the esteemed guidance of
Mr. K. VIJAY KUMAR
(Assistant Professor)
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
SRI INDU COLLEGE OF ENGINEERING AND TECHNOLOGY
(An Autonomous Institution under UGC, Accredited by NBA, Affiliated to JNTUH)
Sheriguda (V), Ibrahimpatnam (M), Ranga Reddy Dist. – 501 510
(2023-2024)
SRI INDU COLLEGE OF ENGINEERING AND TECHNOLOGY
(An Autonomous Institution under UGC, Accredited by NBA, Affiliated to JNTUH)
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
CERTIFICATE
Certified that the Major Project entitled “HEAVY VEHICLE FUEL EFFICIENCY PREDICTION
USING MACHINE LEARNING” is a bonafide work carried out by N. SOUJANYA [20D41A05E6],
P. JAGADEESH [20D41A05G5], P. NARESH [20D41A05G3], M. NAGARAJU [20D41A05C4] in
partial fulfillment for the award of the degree of Bachelor of Technology in Computer Science and
Engineering of SICET, Hyderabad, for the academic year 2023-2024. The project has been approved as
it satisfies the academic requirements in respect of the work prescribed for the IV Year, II Semester of the
B.Tech course.
INTERNAL GUIDE                              HEAD OF THE DEPARTMENT
(Mr. K. VIJAY KUMAR)                        (Prof. CH.G.V.N. PRASAD)
(Assistant Professor)
EXTERNAL EXAMINER
ACKNOWLEDGEMENT
The satisfaction that accompanies the successful completion of a task would be incomplete
without mention of the people who made it possible, whose constant guidance and encouragement
crown all efforts with success. We are thankful to our Principal, Dr. G. SURESH, for giving us
permission to carry out this project. We are highly indebted to Prof. CH.G.V.N. PRASAD, Head of
the Department of Computer Science and Engineering, for providing the necessary infrastructure and labs,
and also for valuable guidance at every stage of this project. We are grateful to our internal project guide,
Mr. K. VIJAY KUMAR (Assistant Professor), for his constant motivation and guidance
during the execution of this project work. We would like to thank the teaching and non-teaching staff
of the Department of Computer Science and Engineering for sharing their knowledge with us. Last but not
least, we express our sincere thanks to everyone who helped directly or indirectly in the completion of
this project.
N. SOUJANYA [20D41A05E6]
P. JAGADEESH [20D41A05G5]
P. NARESH [20D41A05G3]
M. NAGARAJU [20D41A05C4]
CONTENTS
S.NO CHAPTERS PAGE NO
i. List of Figures ......................................................................... vii
ii. List of Screenshots.................................................................. viii
1. INTRODUCTION
1.1 INTRODUCTION TO PROJECT …………………..…... 2
1.2 LITERATURE SURVEY.................................................. 3
1.3 MODULES DESCRIPTION............................................. 4
2. SYSTEM ANALYSIS
2.1 EXISTING SYSTEM AND ITS DISADVANTAGES...... 5
2.2 PROPOSED SYSTEM AND ITS ADVANTAGES.......... 5
2.3 SYSTEM REQUIREMENTS
2.3.1 HARDWARE REQUIREMENTS………………….. 6
2.3.2 SOFTWARE REQUIREMENTS…………………… 7
3. SYSTEM STUDY
3.1 FEASIBILITY STUDY…………………………………… 8
4. SYSTEM DESIGN
4.1 SYSTEM ARCHITECTURE............................................. 9
4.2 UML DIAGRAMS................................................................ 11
4.2.1 USECASE DIAGRAM................................................ 12
4.2.2 CLASS DIAGRAM.................................................... 13
4.2.3 SEQUENCE DIAGRAM............................................ 14
4.2.4 ACTIVITY DIAGRAM.............................................. 15
4.2.5 DEPLOYMENT DIAGRAM..................................... 16
4.2.6 DATA FLOW DIAGRAM………………………… 17
5. TECHNOLOGIES
5.1 INTRODUCTION TO PYTHON
5.1.1 ADVANTAGES AND DISADVANTAGES…….. 21
5.1.2 HISTORY……………………………………… 22
5.2 WHAT IS MACHINE LEARNING
5.2.1 CATEGORIES OF ML………………………….. 23
5.2.2 NEED FOR ML………………………………….. 23
5.2.3 CHALLENGES OF ML…………………………. 24
5.2.4 APPLICATIONS………………………………… 25
5.2.5 HOW TO START LEARNING ML…………….. 27
5.2.6 ADVANTAGES AND DISADVANTAGES…… 29
5.3 PYTHON DEVELOPMENT STEPS……………….. 30
5.4 MODULES USED IN PYTHON……………………. 33
5.5 INSTALLATION OF PYTHON……………………. 40
6. IMPLEMENTATION
6.1 SOFTWARE ENVIRONMENT
6.1.1 PYTHON................................................................. 47
6.1.2 SOURCE CODE...................................................... 54
7. SYSTEM TESTING
7.1 INTRODUCTION TO TESTING..................................... 56
7.2 TEST STRATEGIES......................................................... 58
8. SCREENSHOTS.................................................................... 66
9. CONCLUSION...................................................................... 67
10. REFERENCES..................................................................... 68
LIST OF FIGURES
Fig No Name Page No
4.2 Architecture diagram 11
4.2.1 Use case diagram 12
4.2.2 Class diagram 13
4.2.3 Sequence diagram 14
4.2.4 Activity Diagram 15
4.2.5 Deployment Diagram 16
4.2.6 Data Flow diagram 17
4.2.7 Installation of Python 35
LIST OF SCREENSHOTS
Fig No Name Page No
FIG-1 UPLOAD HEAVY VEHICLE DATA SET 62
FIG-2 READ DATASET AND GENERATE MODEL 63
FIG-3 RUN ANN ALGORITHM 63
FIG-4 UPLOAD TEST DATA 64
FIG-5 FUEL CONSUMPTION GRAPH 66
ABSTRACT
This paper advocates a data summarization approach based on distance rather than the traditional time
period when developing individualized machine learning models for fuel consumption. This approach is
used in conjunction with seven predictors derived from vehicle speed and road grade to produce a highly
predictive neural network model for average fuel consumption in heavy vehicles. The proposed model
can easily be developed and deployed for each individual vehicle in a fleet in order to optimize fuel
consumption over the entire fleet. The predictors of the model are aggregated over fixed window
sizes of distance traveled. Different window sizes are evaluated, and the results show that a 1 km window
is able to predict fuel consumption with a 0.91 coefficient of determination and a mean absolute peak-to-
peak percent error of less than 4% for routes that include both city and highway duty cycle segments.
The ability to model and predict fuel consumption is vital in enhancing the fuel economy of vehicles and
preventing fraudulent activities in fleet management. Fuel consumption of a vehicle depends on several
internal factors such as distance, load, vehicle characteristics, and driver behavior, as well as external
factors such as road conditions, traffic, and weather. However, not all these factors may be measured or
available for the fuel consumption analysis. We consider a case where only a subset of the
aforementioned factors is available as a multi-variate time series from a long-distance public bus. Hence,
the challenge is to model and/or predict the fuel consumption only with the available data, while still
indirectly capturing as much of the influence of the other internal and external factors as possible.
Machine Learning (ML) is suitable for such analysis, as the model can be developed by learning the
patterns in the data. In this paper, we compare the predictive ability of three ML techniques in predicting
the fuel consumption of the bus, given all available parameters as a time series. Based on the analysis, it
can be concluded that the random forest technique produces a more accurate prediction compared to both
gradient boosting and neural networks.
1. INTRODUCTION
1.1 INTRODUCTION TO PROJECT
Fuel consumption models for vehicles are of interest to manufacturers, regulators, and consumers. They
are needed across all phases of the vehicle life-cycle. In this paper, we focus on modeling average
fuel consumption for heavy vehicles during the operation and maintenance phase. In
general, techniques used to develop models for fuel consumption fall under three main categories:
• Physics-based models, which are derived from an in-depth understanding of the physical
system. These models describe the dynamics of the components of the vehicle at each time step using
detailed mathematical equations.
• Machine learning models, which are data-driven and represent an abstract mapping from an input space
consisting of a selected set of predictors to an output space that represents the target output, in this case
average fuel consumption.
• Statistical models, which are also data-driven and establish a mapping between the probability
distribution of a selected set of predictors and the target outcome.
Trade-offs among the above techniques are primarily with respect to cost and accuracy as per the
requirements of the intended application.
In this paper, a model that can be easily developed for individual heavy vehicles in a large fleet is
proposed. Relying on accurate models of all of the vehicles in a fleet, a fleet manager can optimize the
route planning for all of the vehicles based on each vehicle's predicted fuel consumption, thereby
ensuring the route assignments are aligned to minimize overall fleet fuel consumption. These types of
fleets exist in various sectors, including road transportation of goods [7], public transportation [3],
construction trucks [8] and refuse trucks [9]. For each fleet, the methodology must apply and adapt to
many different vehicle technologies (including future ones) and configurations without detailed
knowledge of each vehicle's specific physical characteristics and measurements. These requirements make
machine learning the technique of choice when taking into consideration the desired accuracy versus the
cost of the development and adaptation of an individualized model for each vehicle in the fleet.
Several previous models for both instantaneous and average fuel consumption have been proposed.
Physics-based models are best suited for predicting instantaneous fuel consumption [1], [2] because they
can capture the dynamics of the behavior of the system at different time steps. Machine learning models
are not able to predict instantaneous fuel consumption [3] with a high level of accuracy because of the
difficulty associated with identifying patterns in instantaneous data. However, these models are able to
identify and learn trends in average fuel consumption with an adequate level of accuracy.
Previously proposed machine learning models for average fuel consumption use a set of predictors that are
collected over a time period to predict the corresponding fuel consumption in terms of either gallons per
mile or liters per kilometer. While still focusing on average fuel consumption, our proposed approach
differs from that used in previous models because the input space of the predictors is quantized with
respect to a fixed distance as opposed to a fixed time period. In the proposed model, all the predictors are
aggregated with respect to a fixed window that represents the distance traveled by the vehicle, thereby
providing a better mapping from the input space to the output space of the model. In contrast, previous
machine learning models must not only learn the patterns in the input data but also perform a conversion
from the time-based scale of the input domain to the distance-based scale of the output domain (i.e.,
average fuel consumption). Using the same scale for both the input and output spaces of the model offers
several benefits:
• Data is collected at a rate that is proportional to its impact on the outcome. When the input space is
sampled with respect to time, the amount of data collected from a vehicle at a stop is the same as the
amount of data collected when the vehicle is moving.
• The predictors in the model are able to capture the impact of both the duty cycle and the environment on
the average fuel consumption of the vehicle (e.g., the number of stops in urban traffic over a given
distance).
• Data from raw sensors can be aggregated on-board into a few predictors with lower storage and
transmission bandwidth requirements. Given the increase in computational capabilities of new vehicles,
data summarization is best performed on-board, near the source of the data.
• New technologies such as V2I and dynamic traffic management [10]–[12] can be leveraged for additional
fuel-efficiency optimization at the level of each specific vehicle, route and time of day.
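To make the distance-based quantization above concrete, the following is a minimal sketch of how raw
telemetry could be aggregated into fixed 1 km windows with pandas. The file name and column names
(time_s, speed_mps, grade_pct, fuel_l) are illustrative assumptions, not the paper's exact pipeline.

import pandas as pd

# Assumed raw telemetry: one row per sample.
df = pd.read_csv("telemetry.csv")                   # columns: time_s, speed_mps, grade_pct, fuel_l
dt = df["time_s"].diff().fillna(0)
df["dist_m"] = (df["speed_mps"] * dt).cumsum()      # cumulative distance traveled
df["window"] = (df["dist_m"] // 1000).astype(int)   # fixed 1 km distance windows

# Aggregate predictors and the target per distance window, not per time period.
windows = df.groupby("window").agg(
    mean_speed=("speed_mps", "mean"),
    std_speed=("speed_mps", "std"),
    mean_grade=("grade_pct", "mean"),
    fuel_l_per_window=("fuel_l", "sum"),
)
print(windows.head())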
1.2 LITERATURE SURVEY
"Predictive Modeling of Heavy-Duty Truck Fuel Consumption Using Road Grade and Engine
Load Data" by Wood et al. (2011):
This study focuses on predicting heavy-duty truck fuel consumption using machine learning techniques.
They employ data on road grade and engine load to build predictive models. Support Vector Regression
(SVR) and Artificial Neural Networks (ANN) were among the algorithms evaluated.
"Fuel Consumption Prediction of Heavy-Duty Vehicles Using Artificial Neural Network and
Genetic Algorithm" by Liu et al. (2015):
This paper proposes an approach based on Artificial Neural Networks (ANN) and Genetic Algorithms
(GA) to predict heavy-duty vehicle fuel consumption. They demonstrate the effectiveness of the
proposed method in accurately predicting fuel consumption under various operating conditions.
"Fuel Consumption Prediction of Heavy-Duty Vehicles Using Machine Learning Algorithms:
Gradient Boosting Decision Tree and Random Forest" by Li et al. (2017):
Li et al. compare the performance of Gradient Boosting Decision Tree (GBDT) and Random Forest (RF)
algorithms for predicting heavy-duty vehicle fuel consumption. They utilize data on vehicle speed, road
grade, and engine speed as input features. Their results show that both algorithms achieve high prediction
accuracy.
"Modeling of Heavy-Duty Vehicles Fuel Consumption Based on Support Vector Regression" by
Almeida et al. (2018):
Almeida et al. propose the use of Support Vector Regression (SVR) for modeling heavy-duty vehicle
fuel consumption. They experiment with different kernel functions and regularization parameters to
optimize the SVR model. Their results indicate that SVR performs well in predicting fuel consumption.
"Predicting Fuel Consumption of a Heavy-Duty Vehicle Using Artificial Neural Network" by Wan
et al. (2019):
This study investigates the application of Artificial Neural Networks (ANN) for predicting heavy-duty
vehicle fuel consumption. They analyze various factors such as vehicle speed, engine load, and road
gradient to develop an accurate prediction model. The ANN model demonstrates promising results.
"Fuel Consumption Prediction for Heavy-Duty Vehicles Based on Long Short-Term Memory
Neural Network" by Li et al. (2020):
Li et al. propose the use of Long Short-Term Memory (LSTM) neural networks for predicting fuel
consumption in heavy-duty vehicles. LSTM networks are well-suited for modeling sequential data,
making them effective for capturing temporal dependencies in fuel consumption patterns. Experimental
results show that LSTM-based models outperform traditional machine learning approaches.
"Predicting Fuel Consumption in Heavy-Duty Vehicles Using Machine Learning Techniques: A
Comparative Study" by Garcia et al. (2021):
Garcia et al. conduct a comparative study of various machine learning techniques for predicting fuel
consumption in heavy-duty vehicles. They compare the performance of algorithms such as Random
Forest, Gradient Boosting, and Support Vector Regression. Their findings provide insights into the
strengths and limitations of different approaches.
1.3 MODULES
Upload Heavy Vehicles Fuel Dataset:
Using this module, we can upload the training dataset to the application. The dataset contains
comma-separated values.
Read Dataset & Generate Model:
Using this module, we parse the comma-separated dataset and then generate the train and test sets for the
ANN from the dataset values. The dataset is divided in an 80/20 split: 80% is used to train the ANN model
and 20% is used to test it.
Run ANN Algorithm:
Using this module, we create the ANN object and then feed the train and test data to build the ANN model.
Predict Average Fuel Consumption:
Using this module, we upload new test data, and the ANN applies the trained model to that test data to
predict the average fuel consumption for those test records.
Fuel Consumption Graph:
Using this module, we plot the fuel consumption graph for each test record.
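The following is a minimal sketch of how these five modules could be wired together. The file name,
column names, and network size are illustrative assumptions rather than the project's exact source code
(which is listed in Chapter 6).

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Module 1: upload (read) the heavy-vehicle fuel dataset (comma-separated values).
data = pd.read_csv("heavy_vehicle_fuel.csv")
X = data.drop(columns=["fuel_consumption"]).values   # predictor columns
y = data["fuel_consumption"].values                  # target column

# Module 2: 80/20 train/test split.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Module 3: build and train a feed-forward ANN.
model = Sequential([
    Dense(32, activation="relu", input_shape=(X.shape[1],)),
    Dense(16, activation="relu"),
    Dense(1),                                        # single regression output
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, y_train, epochs=100, batch_size=16, verbose=0)

# Module 4: predict average fuel consumption for the test records.
predictions = model.predict(X_test).ravel()

# Module 5: plot predicted vs. actual fuel consumption per test record.
plt.plot(y_test, label="actual")
plt.plot(predictions, label="predicted")
plt.xlabel("Test record")
plt.ylabel("Average fuel consumption")
plt.legend()
plt.show()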
2. SYSTEM ANALYSIS
2.1 EXISTING SYSTEM & ITS DISADVANTAGES:
A model that can be easily developed for individual heavy vehicles in a large fleet is proposed.
Relying on accurate models of all of the vehicles in a fleet, a fleet manager can optimize the route
planning for all of the vehicles based on each vehicle's predicted fuel consumption, thereby
ensuring the route assignments are aligned to minimize overall fleet fuel consumption.
This approach is used in conjunction with seven predictors derived from vehicle speed and road
grade to produce a highly predictive neural network model for average fuel consumption in heavy
vehicles. Different window sizes are evaluated, and the results show that a 1 km window is able
to predict fuel consumption with a 0.91 coefficient of determination and a mean absolute peak-to-
peak percent error of less than 4% for routes that include both city and highway duty cycle
segments.
Disadvantages of the existing system:
• Physics-based models, which are derived from an in-depth understanding of the physical system.
These models describe the dynamics of the components of the vehicle at each time step using detailed
mathematical equations.
• Statistical models, which are also data-driven and establish a mapping between the probability
distribution of a selected set of predictors and the target outcome.
2.2 PROPOSED SYSTEM & ITS ADVANTAGES:
As mentioned above, Artificial Neural Networks (ANN) are often used to develop digital models for
complex systems. The models proposed in [15] highlight some of the difficulties faced by machine learning
models when the input and output have different domains. In this study, the input is aggregated in the time
domain over 10-minute intervals and the output is the fuel consumption over the distance traveled during the
same time period. The complex system is represented by a transfer function F(p) = o, where F(·) represents
the system, p refers to the input predictors and o is the response of the system, or the output. The ANNs
used in this paper are Feed Forward Neural Networks (FNN).
Training is an iterative process and can be performed using multiple approaches, including particle swarm
optimization [20] and backpropagation. Other approaches will be considered in future work in order to
evaluate their ability to improve the model's predictive accuracy. Each iteration in the training selects a
pair of (input, output) features from Ftr at random and updates the weights in the network.
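To make the last point concrete, here is a toy, self-contained sketch of that per-pair training loop: a
one-hidden-layer FNN trained by backpropagation, where each step draws one (input, output) pair from the
training set at random and updates the weights. The data, layer sizes and learning rate are made-up
illustrations, not the project's configuration.

import numpy as np

rng = np.random.default_rng(0)
P = rng.random((200, 7))                  # 200 windows x 7 predictors (toy data)
o = P.sum(axis=1, keepdims=True)          # stand-in target for F(p) = o

W1 = rng.normal(0, 0.1, (7, 16)); b1 = np.zeros(16)
W2 = rng.normal(0, 0.1, (16, 1)); b2 = np.zeros(1)
lr = 0.01

for step in range(5000):
    i = rng.integers(len(P))              # random (input, output) pair from the training set
    x, t = P[i:i+1], o[i:i+1]
    h = np.tanh(x @ W1 + b1)              # forward pass, hidden layer
    y = h @ W2 + b2                       # forward pass, output layer
    g = 2 * (y - t)                       # gradient of squared error w.r.t. y
    gW2 = h.T @ g; gb2 = g.ravel()
    gh = (g @ W2.T) * (1 - h**2)          # backpropagate through tanh
    gW1 = x.T @ gh; gb1 = gh.ravel()
    W2 -= lr * gW2; b2 -= lr * gb2        # weight updates
    W1 -= lr * gW1; b1 -= lr * gb1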
ADVANTAGES OF PROPOSED SYSTEM:
• Data is collected at a rate that is proportional to its impact on the outcome. When the input space is
sampled with respect to time, the amount of data collected from a vehicle at a stop is the same as the
amount of data collected when the vehicle is moving.
• The predictors in the model are able to capture the impact of both the duty cycle and the environment on
the average fuel consumption of the vehicle (e.g., the number of stops in urban traffic over a given
distance).
• Data from raw sensors can be aggregated on-board into a few predictors with lower storage and
transmission bandwidth requirements. Given the increase in computational capabilities of new vehicles,
data summarization is best performed on-board near the source of the data.
2.3 SYSTEM REQUIREMENTS:
USER REQUIREMENTS:
The requirement specification plays an important role in creating a quality software solution. Requirements
are refined and analyzed to assess their clarity, and are represented in a manner that ultimately
leads to successful software implementation.
2.3.1 HARDWARE REQUIREMENTS
Hardware plays an important role in the existence and performance of any software; size and capacity
are the main requirements.
Processor : Intel Core i3
Hard Disk : 1 TB
Speed     : 2.4 GHz
RAM       : 4 GB
2.3.2 SOFTWARE REQUIREMENTS:
The software requirements specification is produced at the end of the analysis task. Eliciting the software
requirements is a difficult task; the following software was used for developing the application.
Operating System          : Windows 11
Domain                    : Machine Learning
Programming Language      : Python
Front-End Technologies    : HTML5, CSS, JavaScript
Framework                 : Django
Database (RDBMS)          : MySQL
Web/Deployment Server     : Django application development server
3. SYSTEM STUDY
3.1 FEASIBILITY STUDY
The feasibility of the project is analyzed in this phase, and a business proposal is put forth with a very general
plan for the project and some cost estimates. During system analysis, the feasibility study of the proposed
system is carried out. This is to ensure that the proposed system is not a burden to the company. For
feasibility analysis, some understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are:
• ECONOMICAL FEASIBILITY
• TECHNICAL FEASIBILITY
• SOCIAL FEASIBILITY
ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system will have on the organization. The
amount of funds that the company can pour into the research and development of the system is limited,
so the expenditures must be justified. The developed system is well within the budget, and this was
achieved because most of the technologies used are freely available; only the customized products had to
be purchased.
TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the technical requirements of the
system. Any system developed must not place a high demand on the available technical resources, as this
would lead to high demands being placed on the client. The developed system must have modest
requirements, as only minimal or no changes are required for implementing this system.
SOCIAL FEASIBILITY
This aspect of the study checks the level of acceptance of the system by the user. This includes the process
of training the user to use the system efficiently. The user must not feel threatened by the system, but must
instead accept it as a necessity. The level of acceptance by the users solely depends on the methods that are
employed to educate the user about the system and to make the user familiar with it.
4. SYSTEM DESIGN
4.1 SYSTEM ARCHITECTURE
Fig 4.2: Architecture diagram
4.2 UML DIAGRAMS
UML stands for Unified Modeling Language. UML is a standardized, general-purpose modeling language in
the field of object-oriented software engineering. The standard is managed, and was created by, the Object
Management Group. The goal is for UML to become a common language for creating models of object-
oriented computer software. In its current form, UML is comprised of two major components: a meta-model
and a notation. In the future, some form of method or process may also be added to, or associated with, UML.
The Unified Modeling Language is a standard language for specifying, visualizing, constructing and
documenting the artifacts of software systems, as well as for business modeling and other non-software
systems. It is a standardized modeling language consisting of an integrated set of diagrams, developed to
help system and software developers specify, visualize, construct, and document the artifacts of software
systems. The UML represents a collection of best engineering practices that have proven successful in the
modeling of large and complex systems, and it is a very important part of developing object-oriented
software and the software development process. The UML uses mostly graphical notations to express the
design of software projects. Using the UML helps project teams communicate, explore potential designs,
and validate the architectural design of the software.
GOALS:
The Primary goals in the design of the UML are as follows:
• Provide users a ready-to-use, expressive visual modeling language so that they can develop
and exchange meaningful models.
• Provide extendibility and specialization mechanisms to extend the core concepts.
• Be independent of particular programming languages and development processes.
• Provide a formal basis for understanding the modeling language.
• Encourage the growth of OO tools market.
• Support higher-level development concepts such as collaborations, frameworks, patterns and
components.
• Integrate best practices.
• Visualization: UML provides a standardized way to visualize systems, making it easier for stakeholders to
understand complex systems through diagrams like class diagrams, sequence diagrams, and
activity diagrams.
• Communication: UML serves as a common language for developers, designers, architects, and
stakeholders, facilitating communication and collaboration among team members who may come
from diverse backgrounds.
4.2.1 USE CASE DIAGRAM:
A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram defined by
and created from a use-case analysis. Its purpose is to present a graphical overview of the functionality
provided by a system in terms of actors, their goals (represented as use cases), and any dependencies
between those use cases. The main purpose of a use case diagram is to show which system functions are
performed for which actor. The roles of the actors in the system can be depicted.
Fig 4.2.1: Use case diagram
4.2.2. CLASS DIAGRAM:
In software engineering, a class diagram in the Unified Modeling Language (UML) is a type of static
structure diagram that describes the structure of a system by showing the system's classes, their attributes,
operations (or methods), and the relationships among the classes. It also shows which classes contain
which information.
Fig 4.2.2: Class diagram
4.2.3 SEQUENCE DIAGRAM:
A sequence diagram in Unified Modeling Language (UML) is a kind of interaction diagram that shows
how processes operate with one another and in what order. It is a construct of a Message Sequence Chart.
Sequence diagrams are sometimes called event diagrams, event scenarios, and timing diagrams.
Fig 4.2.3: Sequence diagram
4.2.4 ACTIVITY DIAGRAM
An activity diagram is a type of UML (Unified Modeling Language) diagram that visualizes the flow of
activities in a system or process. It's particularly useful for modeling workflows, business processes, or
software algorithms. Activity diagrams typically consist of nodes representing activities or actions, and
arrows representing the flow of control from one activity to another.
Key elements of an activity diagram include:
1. Initial Node: Represents the starting point of the activity diagram.
2. Activities: Represent actions or steps within the process. They are depicted as rounded rectangles.
3. Decisions: Represent points in the process where different paths or conditions can be taken. They
are usually depicted as diamonds.
4. Arrows: Show the flow of control between activities, decisions, and other nodes.
Fig 4.2.4: Activity diagram
4.2.5 DEPLOYMENT DIAGRAM
The deployment diagram visualizes the physical hardware on which the software will be deployed. It
portrays the static deployment view of a system. It involves the nodes and their relationships.
It ascertains how software is deployed on the hardware. It maps the software architecture created in
design to the physical system architecture, where the software will be executed as a node. Since it
involves many nodes, the relationship is shown by utilizing communication paths.
Fig 4.2.5: Deployment diagram
4.2.6 DATA FLOW DIAGRAM
Fig 4.2.6: Data flow diagram
5. TECHNOLOGIES
5.1 WHAT IS PYTHON
Below are some facts about Python.
Python is currently the most widely used multi-purpose, high-level programming language. Python
allows programming in object-oriented and procedural paradigms. Python programs are generally
smaller than those in other programming languages like Java. Programmers have to type relatively less,
and the indentation requirement of the language makes the code readable at all times.
The Python language is used by almost all tech-giant companies like Google, Amazon, Facebook,
Instagram, Dropbox, Uber, etc.
The biggest strength of Python is its huge collection of standard libraries, which can be used for the
following:
• Machine Learning
• GUI Applications (like Kivy, Tkinter, PyQt, etc.)
• Web frameworks like Django (used by YouTube, Instagram, Dropbox)
• Image processing (like Opencv, Pillow)
• Web scraping (like Scrapy, BeautifulSoup, Selenium)
• Test frameworks
• Multimedia
5.1.1 ADVANTAGES & DISADVANTAGES OF PYTHON
ADVANTAGES OF PYTHON:
Let's see how Python dominates over other languages.
1. Extensive Libraries
Python ships with an extensive standard library containing code for various purposes like regular
expressions, documentation generation, unit testing, web browsers, threading, databases, CGI, email,
image manipulation, and more, so we don't have to write the complete code for those manually.
2. Extensible
As we have seen earlier, Python can be extended to other languages. You can write some of your
code in languages like C++ or C. This comes in handy, especially in large projects.
3. Embeddable
Complimentary to extensibility, Python is embeddable as well. You can put your Python code in your
source code of a different language, like C++. This lets us add scripting capabilities to our code in the
other language.
4. Improved Productivity
The language's simplicity and extensive libraries render programmers more productive than
languages like Java and C++ do. Also, you need to write less to get more things done.
5. IoT Opportunities
Since Python forms the basis of new platforms like Raspberry Pi, its future in the Internet of Things
looks bright. This is a way to connect the language with the real world.
6. Simple and Easy
When working with Java, you may have to create a class to print 'Hello World'. But in Python, just a print
statement will do. It is also quite easy to learn, understand, and code. This is why, when people pick up
Python, they can have a hard time adjusting to other, more verbose languages like Java.
7. Readable
Because it is not such a verbose language, reading Python is much like reading English. This is the reason
why it is so easy to learn, understand, and code. It also does not need curly braces to define blocks;
indentation is mandatory, which further aids the readability of the code.
8. Object-Oriented
This language supports both the procedural and object-oriented programming paradigms. While
functions help us with code reusability, classes and objects let us model the real world. A class allows the
encapsulation of data and functions.
9. Free and Open-Source
Like we said earlier, Python is freely available. Not only can you download Python for free, but you can also
download its source code, make changes to it, and even distribute it.
10. Portable
When you code your project in a language like C++, you may need to make some changes to it if you want
to run it on another platform. But it isn't the same with Python. Here, you need to code only once, and you
can run it anywhere. This is called Write Once, Run Anywhere (WORA). However, you need to be
careful not to include any system-dependent features.
11. Interpreted
Lastly, we will say that it is an interpreted language. Since statements are executed one by one,
debugging is easier than in compiled languages.
ADVANTAGES OF PYTHON OVER OTHER LANGUAGES
1. Less Coding
Almost all tasks done in Python require less coding than the same tasks done in other languages.
Python also has awesome standard library support, so you don't have to search for any third-party
libraries to get your job done. This is the reason many people suggest learning Python to beginners.
2. Affordable
Python is free, therefore individuals, small companies and big organizations can leverage the freely
available resources to build applications. Python is popular and widely used, so it gives you better
community support. The 2019 GitHub annual survey showed that Python has overtaken Java in the most
popular programming language category.
3. Python is for Everyone
Python code can run on any machine, whether it is Linux, Mac or Windows. Programmers need to learn
different languages for different jobs, but with Python you can professionally build web apps, perform
data analysis and machine learning, automate things, do web scraping, and also build games and
powerful visualizations. It is an all-rounder programming language.
DISADVANTAGES OF PYTHON
So far, we've seen why Python is a great choice for your project. But if you choose it, you should be
aware of its consequences as well. Let's now see the downsides of choosing Python over another language.
1. Speed Limitations
We have seen that Python code is executed line by line. But since Python is interpreted, this often results in
slow execution. This, however, isn't a problem unless speed is a focal point for the project. In other words,
unless high speed is a requirement, the benefits offered by Python are enough to distract us from its speed
limitations.
2. Weak in Mobile Computing and Browsers
While it serves as an excellent server-side language, Python is rarely seen on the client side.
Besides that, it is rarely ever used to implement smartphone-based applications. One such application is
called Carbonnelle. The reason it is not so famous, despite the existence of Brython, is that it isn't that
secure.
3. Design Restrictions
As you know, Python is dynamically typed. This means that you don't need to declare the type of a variable
while writing the code. It uses duck typing. But wait, what's that? Well, it just means that if it looks like
a duck, it must be a duck. While this is easy on programmers during coding, it can raise run-time
errors.
4. Underdeveloped Database Access Layers
Compared to more widely used technologies like JDBC (Java DataBase Connectivity) and ODBC
(Open DataBase Connectivity), Python's database access layers are a bit underdeveloped.
Consequently, it is less often applied in huge enterprises.
5. Simple
No, we're not kidding. Python's simplicity can indeed be a problem. Take my example. I don't do Java,
I'm more of a Python person. To me, its syntax is so simple that the verbosity of Java code seems
unnecessary.
HISTORY OF PYTHON
What do the alphabet and the programming language Python have in common? Right, both start with ABC.
If we are talking about ABC in the Python context, it's clear that the programming language ABC is
meant. ABC is a general-purpose programming language and programming environment, which was
developed in the Netherlands, in Amsterdam, at the CWI (Centrum Wiskunde & Informatica). The greatest
achievement of ABC was to influence the design of Python. Python was conceptualized in the late 1980s.
Guido van Rossum worked at that time on a project at the CWI called Amoeba, a distributed operating
system. In an interview with Bill Venners, Guido van Rossum said: "In the early 1980s, I worked as an
implementer on a team building a language called ABC at Centrum Wiskunde en Informatica (CWI). I
don't know how well people know ABC's influence on Python. I try to mention ABC's influence because
I'm indebted to everything I learned during that project and to the people who worked on it." Later on in
the same interview, Guido van Rossum continued: "I remembered all my experience and some of my
frustration with ABC. I decided to try to design a simple scripting language that possessed some of ABC's
better properties, but without its problems. So I started typing. I created a simple virtual machine, a
simple parser, and a simple runtime. I made my own version of the various ABC parts that I liked. I
created a basic syntax, used indentation for statement grouping instead of curly braces or begin-end
blocks, and developed a small number of powerful data types: a hash table (or dictionary, as we call it),
a list, strings, and numbers."
5.2 WHAT IS MACHINE LEARNING
Before we take a look at the details of various machine learning methods, let's start by looking at what
machine learning is, and what it isn't. Machine learning is often categorized as a subfield of artificial
intelligence, but I find that categorization can often be misleading at first brush. The study of machine
learning certainly arose from research in this context, but in the data science application of machine
learning methods, it's more helpful to think of machine learning as a means of building models of data.
Fundamentally, machine learning involves building mathematical models to help understand data.
"Learning" enters the fray when we give these models tunable parameters that can be adapted to observed
data; in this way the program can be considered to be "learning" from the data. Once these models have
been fit to previously seen data, they can be used to predict and understand aspects of newly observed
data. I'll leave to the reader the more philosophical digression regarding the extent to which this type of
mathematical, model-based "learning" is similar to the "learning" exhibited by the human brain.
Understanding the problem setting in machine learning is essential to using these tools effectively, and
so we will start with some broad categorizations of the types of approaches we'll discuss here.
5.2.1 CATEGORIES OF MACHINE LEARNING
At the most fundamental level, machine learning can be categorized into two main types: supervised
learning and unsupervised learning.
Supervised learning involves somehow modeling the relationship between measured features of data
and some label associated with the data; once this model is determined, it can be used to apply labels
to new, unknown data. This is further subdivided into classification tasks and regression tasks: in
classification, the labels are discrete categories, while in regression, the labels are continuous quantities.
We will see examples of both types of supervised learning in the following section.
Unsupervised learning involves modeling the features of a dataset without reference to any label, and is
often described as "letting the dataset speak for itself." These models include tasks such as clustering
and dimensionality reduction. Clustering algorithms identify distinct groups of data, while
dimensionality reduction algorithms search for more succinct representations of the data. We will see
examples of both types of unsupervised learning in the following section.
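A tiny scikit-learn sketch (with toy, made-up data) shows the two categories side by side:

from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.cluster import KMeans

X = [[1], [2], [3], [4]]

# Supervised: labels are provided with the data.
clf = DecisionTreeClassifier().fit(X, ["low", "low", "high", "high"])  # classification (discrete labels)
reg = LinearRegression().fit(X, [1.1, 1.9, 3.2, 3.9])                  # regression (continuous labels)

# Unsupervised: no labels; the algorithm finds structure on its own.
km = KMeans(n_clusters=2, n_init=10).fit(X)                            # clustering
print(clf.predict([[2.5]]), reg.predict([[2.5]]), km.labels_)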
5.2.2 NEED FOR MACHINE LEARNING
Human beings are, at this moment, the most intelligent and advanced species on earth because they can
think, evaluate and solve complex problems. On the other side, AI is still in its initial stage and hasn't
surpassed human intelligence in many aspects. Then the question is: what is the need to make
machines learn? The most suitable reason for doing this is "to make decisions, based on data, with
efficiency and scale."
Lately, organizations are investing heavily in newer technologies like Artificial Intelligence, Machine
Learning and Deep Learning to get the key information from data to perform several real-world tasks and
solve problems. We can call these data-driven decisions taken by machines, particularly to automate the
process. These data-driven decisions can be used, instead of programming logic, in problems that cannot
be programmed inherently. The fact is that we can't do without human intelligence, but the other aspect
is that we all need to solve real-world problems with efficiency at a huge scale. That is why the need for
machine learning arises.
5.2.3 CHALLENGES IN MACHINE LEARNING
While machine learning is rapidly evolving, making significant strides in cybersecurity and
autonomous cars, this segment of AI as a whole still has a long way to go. The reason behind this is that
ML has not been able to overcome a number of challenges. The challenges that ML is facing currently are:
Quality of data − Having good-quality data for ML algorithms is one of the biggest challenges. Use
of low-quality data leads to problems related to data preprocessing and feature extraction.
Time-consuming task − Another challenge faced by ML models is the consumption of time,
especially for data acquisition, feature extraction and retrieval.
Lack of specialist persons − As ML technology is still in its infancy stage, availability of expert
resources is a tough job.
No clear objective for formulating business problems − Having no clear objective and well-defined
goal for business problems is another key challenge for ML, because this technology is not that
mature yet.
Issue of overfitting & underfitting − If the model is overfitting or underfitting, it cannot represent
the problem well.
Curse of dimensionality − Another challenge ML models face is too many features of data points.
This can be a real hindrance.
Difficulty in deployment − The complexity of an ML model makes it quite difficult to deploy in
real life.
Data quality and quantity − ML algorithms heavily rely on data. Poor-quality data can lead to
biased or inaccurate models. Additionally, acquiring labeled data for supervised learning tasks can be
expensive and time-consuming.
Overfitting and underfitting − Overfitting occurs when a model learns the training data too well,
capturing noise along with the underlying patterns, resulting in poor generalization to new data.
Underfitting, on the other hand, happens when a model is too simple to capture the underlying structure
of the data.
5.2.4 APPLICATIONS OF MACHINE LEARNING
Machine learning is the most rapidly growing technology, and according to researchers we are in the
golden years of AI and ML. It is used to solve many real-world complex problems which cannot be solved
with a traditional approach. Following are some real-world applications of ML:
• Emotion analysis
• Sentiment analysis
• Error detection and prevention
• Weather forecasting and prediction
• Stock market analysis and forecasting
• Speech synthesis
• Speech recognition
• Customer segmentation
• Object recognition
• Fraud detection
• Fraud prevention
5.2.5 HOW TO START LEARNING MACHINE LEARNING?
Arthur Samuel coined the term "Machine Learning" in 1959 and defined it as a "field of study that
gives computers the capability to learn without being explicitly programmed".
And that was the beginning of Machine Learning! In modern times, Machine Learning is one
of the most popular (if not the most popular!) career choices. According to Indeed, Machine Learning
Engineer was the best job of 2019, with 344% growth and an average base salary of $146,085 per year.
But there is still a lot of doubt about what exactly Machine Learning is and how to start learning it. So
this section deals with the basics of Machine Learning and also the path you can follow to eventually
become a full-fledged Machine Learning Engineer. Now let's get started!
How to start learning ML?
This is a rough roadmap you can follow on your way to becoming an insanely talented Machine Learning
Engineer. Of course, you can always modify the steps according to your needs to reach your desired end
goal.
Step 1 – Understand the Prerequisites
In case you are a genius, you could start ML directly, but normally there are some prerequisites that
you need to know, which include Linear Algebra, Multivariate Calculus, Statistics, and Python. And if
you don't know these, never fear! You don't need a Ph.D. degree in these topics to get started, but you do
need a basic understanding.
(a) Learn Linear Algebra and Multivariate Calculus
Both Linear Algebra and Multivariate Calculus are important in Machine Learning. However, the extent
to which you need them depends on your role as a data scientist. If you are more focused on application-
heavy machine learning, then you will not be that heavily focused on maths, as there are many common
libraries available. But if you want to focus on R&D in Machine Learning, then mastery of Linear Algebra
and Multivariate Calculus is very important, as you will have to implement many ML algorithms from
scratch.
(b) Learn Statistics
Data plays a huge role in Machine Learning. In fact, around 80% of your time as an ML expert will be
spent collecting and cleaning data. And statistics is a field that handles the collection, analysis, and
presentation of data. So it is no surprise that you need to learn it! Some of the key concepts in statistics
that are important are Statistical Significance, Probability Distributions, Hypothesis Testing, Regression,
etc. Also, Bayesian Thinking is a very important part of ML, which deals with various concepts like
Conditional Probability, Priors and Posteriors, Maximum Likelihood, etc.
(c) Learn Python
Some people prefer to skip Linear Algebra, Multivariate Calculus and Statistics and learn them as they
go along with trial and error. But the one thing that you absolutely cannot skip is Python! While there are
other languages you can use for Machine Learning, like R, Scala, etc., Python is currently the most popular
language for ML. In fact, there are many Python libraries that are specifically useful for Artificial
Intelligence and Machine Learning, such as Keras, TensorFlow, Scikit-learn, etc. So if you want to learn
ML, it's best if you learn Python! You can do that using various online resources and courses, such as
Fork Python, available free on GeeksforGeeks.
Step 2 – Learn Various ML Concepts
Now that you are done with the prerequisites, you can move on to actually learning ML (which is the
fun part!). It's best to start with the basics and then move on to more complicated stuff. Some of the
basic concepts in ML are given below; a toy illustration follows the lists.
Terminologies of Machine Learning
• Model – A model is a specific representation learned from data by applying some machine
learning algorithm. A model is also called a hypothesis.
• Feature – A feature is an individual measurable property of the data. A set of numeric features
can be conveniently described by a feature vector. Feature vectors are fed as input to the model.
For example, in order to predict a fruit, there may be features like color, smell, taste, etc.
• Target (Label) – A target variable or label is the value to be predicted by our model. For the fruit
example discussed in the feature section, the label with each set of inputs would be the name of the
fruit, like apple, orange, banana, etc.
• Training – The idea is to give a set of inputs (features) and its expected outputs (labels), so after
training, we will have a model (hypothesis) that will then map new data to one of the categories
trained on.
• Prediction – Once our model is ready, it can be fed a set of inputs to which it will provide a
predicted output (label).
Types of Machine Learning
• Supervised Learning – This involves learning from a training dataset with labeled data using
classification and regression models. This learning process continues until the required level of
performance is achieved.
• Unsupervised Learning – This involves using unlabelled data and then finding the underlying structure
in the data in order to learn more and more about the data itself, using factor and cluster analysis models.
• Semi-supervised Learning – This involves using unlabelled data, like Unsupervised Learning, together
with a small amount of labeled data. Using labeled data vastly increases the learning accuracy and is also
more cost-effective than Supervised Learning.
• Reinforcement Learning – This involves learning optimal actions through trial and error, so the next
action is decided by learning behaviors that are based on the current state and that will maximize the
reward.
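The following toy illustration puts the terminology above together (the feature values and labels are made
up for the fruit example):

from sklearn.neighbors import KNeighborsClassifier

features = [[150, 0], [170, 0], [120, 1], [130, 1]]   # feature vectors: [weight_g, smell_code]
labels = ["orange", "orange", "apple", "apple"]       # target (label) for each feature vector

model = KNeighborsClassifier(n_neighbors=3).fit(features, labels)   # training
print(model.predict([[140, 1]]))                                    # prediction for new data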
5.2.6 ADVANTAGES & DISADVANTAGES OF MACHINE LEARNING
ADVANTAGES OF MACHINE LEARNING:
1. Easily identifies trends and patterns
Machine Learning can review large volumes of data and discover specific trends and patterns that would
not be apparent to humans. For instance, for an e-commerce website like Amazon, it serves to understand
the browsing behaviors and purchase histories of its users to help cater the right products, deals, and
reminders relevant to them. It uses the results to reveal relevant advertisements to them.
2. No human intervention needed (automation)
With ML, you don't need to babysit your project every step of the way. Since it means giving machines
the ability to learn, it lets them make predictions and also improve the algorithms on their own. A common
example of this is anti-virus software: it learns to filter new threats as they are recognized. ML is also
good at recognizing spam.
3. Continuous Improvement
As ML algorithms gain experience, they keep improving in accuracy and efficiency. This lets them
make better decisions. Say you need to make a weather forecast model. As the amount of data you have
keeps growing, your algorithms learn to make more accurate predictions faster.
4. Handling multi-dimensional and multi-variety data
Machine Learning algorithms are good at handling data that are multi-dimensional and multi-variety, and
they can do this in dynamic or uncertain environments.
5. Wide Applications
You could be an e-tailer or a healthcare provider and make ML work for you. Where it does apply, it
holds the capability to help deliver a much more personal experience to customers while also targeting
the right customers.
DISADVANTAGES OF MACHINE LEARNING:
1. Data Acquisition
Machine Learning requires massive data sets to train on, and these should be inclusive/unbiased and of
good quality. There can also be times where we must wait for new data to be generated.
2. Time and Resources
ML needs enough time to let the algorithms learn and develop enough to fulfill their purpose with a
considerable amount of accuracy and relevancy. It also needs massive resources to function. This can
mean additional requirements of computing power for you.
3. Interpretation of Results
Another major challenge is the ability to accurately interpret the results generated by the algorithms. You
must also carefully choose the algorithms for your purpose.
4. High error-susceptibility
Machine Learning is autonomous but highly susceptible to errors. Suppose you train an algorithm with
data sets small enough not to be inclusive. You end up with biased predictions coming from a biased
training set. This leads to irrelevant advertisements being displayed to customers. In the case of ML, such
blunders can set off a chain of errors that can go undetected for long periods of time. And when they do
get noticed, it takes quite some time to recognize the source of the issue, and even longer to correct it.
5.3 PYTHON DEVELOPMENT STEPS
Guido van Rossum published the first version of Python code (version 0.9.0) at alt.sources in
February 1991. This release already included exception handling, functions, and the core data types of
list, dict, str and others. It was also object-oriented and had a module system. Python version 1.0 was
released in January 1994. The major new features included in this release were the functional
programming tools lambda, map, filter and reduce, which Guido van Rossum never liked. Six and a
half years later, in October 2000, Python 2.0 was introduced. This release included list
comprehensions, a full garbage collector, and support for Unicode. Python flourished for another 8
years in the versions 2.x before the next major release, Python 3.0 (also known as "Python 3000" and
"Py3K"), appeared. Python 3 is not backwards compatible with Python 2.x. The emphasis in Python 3
was on the removal of duplicate programming constructs and modules, thus fulfilling or coming
close to fulfilling the 13th aphorism of the Zen of Python: "There should be one -- and preferably only one --
obvious way to do it." Some changes in Python 3.0:
• Print is now a function.
• Views and iterators instead of lists.
• The rules for ordering comparisons have been simplified; e.g., a heterogeneous list cannot be
sorted, because all the elements of a list must be comparable to each other.
• There is only one integer type left, i.e. int; long is int as well.
• The division of two integers returns a float instead of an integer. "//" can be used to get the "old"
behaviour.
• Text vs. data instead of Unicode vs. 8-bit.
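Two of these changes can be seen directly at the interpreter:

print("hello")   # print is a function in Python 3
print(7 / 2)     # 3.5 -- dividing two integers returns a float
print(7 // 2)    # 3   -- "//" gives the old floor-division behaviour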
PYTHON
Python is an interpreted, high-level programming language for general-purpose programming. Created
by Guido van Rossum and first released in 1991, Python has a design philosophy that emphasizes code
readability, notably using significant whitespace.
Python features a dynamic type system and automatic memory management. It supports multiple
programming paradigms, including object-oriented, imperative, functional and procedural, and has a
large and comprehensive standard library.
• Python is Interpreted − Python is processed at runtime by the interpreter. You do not need to compile
your program before executing it. This is similar to PERL and PHP.
• Python is Interactive − you can actually sit at a Python prompt and interact with the interpreter directly
to write your programs.
• Python also acknowledges that speed of development is important. Readable and terse code is part of
this, and so is access to powerful constructs that avoid tedious repetition of code. Maintainability also
ties into this: lines of code may be an all-but-useless metric, but it does say something about how much
code you have to scan, read and/or understand to troubleshoot problems or tweak behaviors. This speed
of development, the ease with which a programmer of other languages can pick up basic Python skills,
and the huge standard library are key to another area where Python excels. All its tools have been quick
to implement, have saved a lot of time, and several of them have later been patched and updated by people
with no Python background - without breaking.
5.4 MODULES USED IN PYTHON
TENSORFLOW:
TensorFlow is a free and open-source software library for dataflow and differentiable programming
across a range of tasks. It is a symbolic math library, and is also used for machine learning applications
such as neural networks. It is used for both research and production at Google.
TensorFlow was developed by the Google Brain team for internal Google use. It was released under the
Apache 2.0 open-source license on November 9, 2015.
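A minimal sanity check of a TensorFlow installation (output formatting may vary by version):

import tensorflow as tf

x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
print(tf.reduce_mean(x))   # tf.Tensor(2.5, shape=(), dtype=float32)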
NUMPY:
Numpy is a general-purpose array-processing package. It provides a high-performance
multidimensional array object, and tools for working with these arrays.
It is the fundamental package for scientific computing with Python. It contains various features,
including these important ones:
• A powerful N-dimensional array object
• Sophisticated (broadcasting) functions
• Tools for integrating C/C++ and Fortran code
• Useful linear algebra, Fourier transform, and random number capabilities
• Besides its obvious scientific uses, Numpy can also be used as an efficient multidimensional
container of generic data. Arbitrary data types can be defined using Numpy, which allows Numpy
to seamlessly and speedily integrate with a wide variety of databases.
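A small sketch of the N-dimensional array object and broadcasting:

import numpy as np

a = np.arange(6).reshape(2, 3)   # 2x3 array: [[0, 1, 2], [3, 4, 5]]
print(a + 10)                    # broadcasting: adds 10 to every element
print(a.mean(axis=0))            # column means: [1.5, 2.5, 3.5]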
PANDAS
Pandas is an open-source Python library providing high-performance data manipulation and analysis tools using its powerful data structures. Python was previously used mainly for data munging and preparation, and had very little to offer for data analysis; Pandas solved this problem. Using Pandas, we can accomplish five typical steps in the processing and analysis of data, regardless of the origin of the data: load, prepare, manipulate, model, and analyze. Python with Pandas is used in a wide range of academic and commercial domains, including finance, economics, statistics, analytics, etc.
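A minimal sketch of the load-prepare-manipulate-analyze flow (the column names below merely echo the project's predictors and are illustrative, not the real dataset):

import pandas as pd

df = pd.DataFrame({
    "num_stops": [7.0, 8.9, 9.3],
    "average_moving_speed": [93.0, 151.0, 160.0],
    "fuel_class": [9, 12, 13],
})
df = df.dropna()                         # prepare: drop rows with missing values
print(df.describe())                     # analyze: summary statistics
print(df.groupby("fuel_class").mean())   # manipulate: group and aggregate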
MATPLOTLIB
Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter Notebook, web application servers, and graphical user interface toolkits. Matplotlib tries to make easy things easy and hard things possible. You can generate plots, histograms, power spectra, bar charts, error charts, scatter plots, etc., with just a few lines of code.
For simple plotting the pyplot module provides a MATLAB-like interface, particularly when combined with IPython. For the power user, you have full control of line styles, font properties, axes properties, etc. via an object-oriented interface or via a set of functions familiar to MATLAB users.
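For example, a basic line plot really does take only a few lines (values are illustrative):

import matplotlib.pyplot as plt

speeds = [93.0, 91.0, 151.0, 160.0, 158.0]   # illustrative speed values
plt.plot(range(len(speeds)), speeds, marker='o')
plt.xlabel('Record')
plt.ylabel('Average moving speed')
plt.title('Basic Matplotlib line plot')
plt.show()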
SCIKIT – LEARN
Scikit-learn provides a range of supervised and unsupervised learning algorithms via a consistent interface in Python. It is licensed under a permissive simplified BSD license and is distributed in many Linux distributions, encouraging academic and commercial use.
Scikit-learn is a popular open-source machine learning library for Python. It provides a wide range of
tools for building and deploying machine learning models efficiently. Here are some key aspects of
scikit-learn:
Simple and Consistent API: Scikit-learn offers a simple and consistent API, making it easy to use and
understand for both beginners and experienced users. The API design follows a common pattern across
different algorithms, which simplifies the learning curve.
Comprehensive Algorithms: Scikit-learn includes a vast collection of machine learning algorithms for
various tasks such as classification, regression, clustering, dimensionality reduction, and more. These
algorithms are implemented efficiently and are well-documented, allowing users to experiment with
different techniques easily.
Model Evaluation and Selection: The library provides tools for evaluating and selecting models, including metrics for assessing model performance, techniques for cross-validation, and hyperparameter tuning methods like grid search and randomized search.
Preprocessing and Feature Engineering: Scikit-learn offers utilities for preprocessing data and
engineering features, such as scaling, normalization, encoding categorical variables, handling missing
values, and feature selection.
Integration with NumPy and SciPy: Scikit-learn seamlessly integrates with other popular Python
libraries like NumPy and SciPy, leveraging their computational capabilities for efficient data processing
and numerical operations.
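A minimal sketch of the consistent fit/predict API together with cross-validated grid search (synthetic data via make_regression; none of this is the project's dataset):

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split, GridSearchCV

# Synthetic stand-in for 7 predictors and a numeric target
X, y = make_regression(n_samples=200, n_features=7, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# The same fit/predict/score pattern applies to every scikit-learn estimator
grid = GridSearchCV(RandomForestRegressor(random_state=42),
                    param_grid={"n_estimators": [50, 100]}, cv=3)
grid.fit(X_train, y_train)
print(grid.best_params_, grid.score(X_test, y_test))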
5.5 INSTALL PYTHON STEP-BY-STEP IN WINDOWS AND MAC
Python, a versatile programming language, doesn't come pre-installed on your computer. Python was first released in 1991 and is still a very popular high-level programming language today. Its design philosophy emphasizes code readability, with notable use of significant whitespace. The object-oriented approach and language constructs provided by Python enable programmers to write both clear and logical code for projects. This software does not come pre-packaged with Windows.
HOW TO INSTALL PYTHON ON WINDOWS AND MAC:
There have been several updates to Python over the years. The question is: how do you install Python? It might be confusing for a beginner who is willing to start learning Python, but this tutorial will solve your query. The latest version of Python at the time of writing is 3.7.4; in other words, it is Python 3.
Note: Python 3.7.4 cannot be used on Windows XP or earlier.
Before you start with the installation process of Python, you first need to know your system requirements. You must download the Python version matching your system type, i.e. your operating system and processor. My system type is a Windows 64-bit operating system, so the steps below install Python 3.7.4 (i.e. Python 3) on a Windows 7 device. The steps on how to install Python on Windows 10, 8 and 7 are divided into 4 parts to help you understand better.
Download the correct version for your system
Step 1: Go to the official site to download and install Python using Google Chrome or any other web browser, or click on the following link: https://www.python.org
Now, check for the latest and the correct version for your operating system.
Step 2: Click on the Download Tab
Step 3: You can either select the yellow "Download Python 3.7.4" button or scroll further down and click on the download for your respective version. Here, we are downloading the most recent Python version for Windows, 3.7.4.
Step 4: Scroll down the page until you find the Files option.
Step 5: Here you see a different version of python along with the operating system.
• To download 32-bit Python for Windows, you can select any one of the three options: Windows x86 embeddable zip file, Windows x86 executable installer or Windows x86 web-based installer.
• To download 64-bit Python for Windows, you can select any one of the three options: Windows x86-64 embeddable zip file, Windows x86-64 executable installer or Windows x86-64 web-based installer.
Here we will install the Windows x86-64 web-based installer. With this, the first part, deciding which version of Python to download, is completed. Now we move ahead with the second part of installing Python, i.e. the installation itself.
Note: To know the changes or updates made in a version, you can click on the Release Note option.
Installation of Python
Step 1: Go to Downloads and open the downloaded Python installer to carry out the installation process.
Step 2: Before you click on Install Now, make sure to put a tick on "Add Python 3.7 to PATH".
Step 3: Click on Install Now. After the installation is successful, click on Close.
With these three steps, you have successfully and correctly installed Python. Now it is time to verify the installation. Note: The installation process might take a couple of minutes.
Verify the Python Installation
Step 1: Click on Start
Step 2: In the Windows Run command, type cmd.
Step 3: Open the Command Prompt option.
Step 4: Let us test whether Python is correctly installed. Type python -V and press Enter.
Step 5: You will get the answer as Python 3.7.4.
Note: If you have any of the earlier versions of Python already installed, you must first uninstall the earlier version and then install the new one.
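The version can also be checked from inside Python itself; a small sketch (works on any Python 3.x interpreter):

import sys
print(sys.version)   # should report 3.7.4 if that release was installed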
Check how the Python IDLE works
Step 1: Click on Start
Step 2: In the Windows Run command, type “python idle”.
Step 3: Click on IDLE (Python 3.7 64-bit) and launch the program
Step 4: To go ahead with working in IDLE, you must first save the file. Click on File > Save.
Step 5: Name the file, with "save as type" set to Python files, and click on Save. Here I have named the file Hey World.
Step 6: Now, for example, enter a simple statement such as print("Hey World") and run the program.
6. IMPLEMENTATIONS
6.1 SOFTWARE ENVIRONMENT
6.1.1 PYTHON
Python is a general-purpose interpreted, interactive, object-oriented, high-level programming language. As an interpreted language, Python has a design philosophy that emphasizes code readability (notably using whitespace indentation to delimit code blocks rather than curly brackets or keywords), and a syntax that allows programmers to express concepts in fewer lines of code than might be used in languages such as C++ or Java. It provides constructs that enable clear programming on both small and large scales. Python interpreters are available for many operating systems. CPython, the reference implementation of Python, is open-source software and has a community-based development model, as do nearly all of its variant implementations. CPython is managed by the non-profit Python Software Foundation. Python features a dynamic type system and automatic memory management. It supports multiple programming paradigms, including object-oriented, imperative, functional and procedural, and has a large and comprehensive standard library.
WHAT CAN PYTHON DO?
• Python can be used on a server to create web applications.
• Python can be used alongside software to create workflows.
• Python can connect to database systems. It can also read and modify files.
• Python can be used to handle big data and perform complex mathematics.
• Python can be used for rapid prototyping, or for production-ready software development.
Introduction to Python
Python is a widely used general-purpose, high level programming language. It was created by Guido
van Rossum in 1991 and further developed by the Python Software Foundation. It was designed with
an emphasis on code readability, and its syntax allows programmers to express their concepts in fewer
lines of code.
Python is a programming language that lets you work quickly and integrate systems more efficiently.
There are two major Python versions: Python 2 and Python 3. Both are quite different.
REASON FOR INCREASING POPULARITY
Emphasis on code readability, shorter code, ease of writing
Programmers can express logical concepts in fewer lines of code in comparison to languages such as
C++ or Java.
Python supports multiple programming paradigms, like object-oriented, imperative, functional and procedural programming.
There are built-in functions for almost all frequently used concepts.
Its philosophy is "Simplicity is the best".
LANGUAGE FEATURES
Interpreted
There are no separate compilation and execution steps like C and C++.
Directly run the program from the source code.
Internally, Python converts the source code into an intermediate form called bytecode, which is then translated into the native language of the specific computer to run it (see the short sketch below).
No need to worry about linking and loading with libraries, etc.
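A small sketch of this compile-to-bytecode step, using only the standard library:

# Python compiles source to a bytecode object before executing it
code = compile("print(2 + 3)", "<string>", "exec")
exec(code)       # runs the bytecode: prints 5

import dis
dis.dis(code)    # disassembles the bytecode instructions for inspection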
Platform Independent
Python programs can be developed and executed on multiple operating system platforms.
Python can be used on Linux, Windows, Macintosh, Solaris and many more.
Free and Open Source; Redistributable
High-level Language
In Python, there is no need to worry about low-level details such as managing the memory used by the program.
Simple
Closer to the English language; easy to learn
More emphasis on the solution to the problem rather than the syntax
Embeddable
Python can be used within a C/C++ program to give scripting capabilities to the program's users.
Robust:
Exception handling features
Built-in memory management techniques
Rich Library Support
The Python Standard Library is very vast.
Known as the "batteries included" philosophy of Python, it can help do various things involving regular expressions, documentation generation, unit testing, threading, databases, web browsers, CGI, email, XML, HTML, WAV files, cryptography, GUI and many more.
Besides the standard library, there are various other high-quality libraries such as the Python Imaging
Library which is an amazingly simple image manipulation library.
DJANGO
Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic
design. Built by experienced developers, it takes care of much of the hassle of Web development, so you
can focus on writing your app without needing to reinvent the wheel. It’s free and open source.
Django's primary goal is to ease the creation of complex, database-driven websites. Django emphasizes reusability and "pluggability" of components, rapid development, and the principle of don't repeat yourself. Python is used throughout, even for settings files and data models.
Django is a Python-based web framework that allows you to create efficient web applications quickly. It is also called a batteries-included framework because Django provides built-in features for everything, including the Django Admin Interface, a default SQLite3 database, etc.
Django is a high-level Python web framework that encourages rapid development and clean, pragmatic
design. It follows the "Don't Repeat Yourself" (DRY) principle and emphasizes reusability and
pluggability of components. Developed by Adrian Holovaty and Simon Willison, Django was first
released in 2005 and has since become one of the most popular web frameworks for building dynamic
web applications.
Here are some key aspects of Django:
1. MVC Architecture: Django follows the Model-View-Controller (MVC) architectural pattern, although it refers to it as "Model-View-Template" (MVT). Models represent the data structure, views handle the presentation logic, and templates define the user interface. This separation of concerns facilitates modular and maintainable code.
2. ORM (Object-Relational Mapping): Django includes a built-in ORM that abstracts away the complexities of database interaction. Developers can define database models using Python classes, and Django handles the translation of these models into database tables and SQL queries. This simplifies database access and reduces the amount of boilerplate code.
3. Admin Interface: Django provides a powerful administrative interface automatically generated from the database models. This admin interface allows administrators to manage site content, users, and other aspects of the application without writing custom code. It is highly customizable and extensible.
4. URL Routing: Django's URL dispatcher allows developers to map URLs to view functions or class-based views. This enables clean and flexible URL routing, making it easy to design RESTful APIs and user-friendly URLs.
5. Template Engine: Django comes with a template engine that allows developers to define the presentation layer using HTML templates with embedded Python code. Templates support inheritance and template tags, making it easy to create dynamic and reusable HTML content.
6. Security Features: Django includes built-in security features to help developers build secure web applications. These features include protection against common security threats such as SQL injection, cross-site scripting (XSS), cross-site request forgery (CSRF), and clickjacking.
7. Authentication and Authorization: Django provides robust authentication and authorization mechanisms out of the box. Developers can easily integrate user authentication, permission management, and user sessions into their applications.
8. Internationalization and Localization: Django supports internationalization (i18n) and localization (l10n) features, allowing developers to build applications that support multiple languages and locales.
9. Community and Ecosystem: Django has a large and active community of developers who contribute to its development, provide support, and create reusable packages and extensions (e.g., reusable apps, middleware, and utilities) that further extend Django's functionality.
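A minimal sketch of how the MVT pieces fit together, assuming a hypothetical Django app named "fleet" inside a configured Django project (this is not a standalone script, and all names are illustrative, not the project's code):

# fleet/models.py -- the ORM maps this class to a database table
from django.db import models

class VehicleTrip(models.Model):
    num_stops = models.FloatField()
    average_moving_speed = models.FloatField()
    fuel_class = models.IntegerField()

# fleet/views.py -- a view queries the model and renders a template
from django.shortcuts import render
from .models import VehicleTrip

def trip_list(request):
    trips = VehicleTrip.objects.order_by("fuel_class")
    return render(request, "fleet/trip_list.html", {"trips": trips})

# fleet/urls.py -- URL routing maps a clean URL onto the view
from django.urls import path
from .views import trip_list

urlpatterns = [path("trips/", trip_list, name="trip-list")]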
ADVANTAGES OF DJANGO
Rapid Development: Django follows the "Don't Repeat Yourself" (DRY) principle and provides a lot
of built-in functionalities and conventions, allowing developers to build web applications quickly and
efficiently. Its batteries-included approach means that developers don't have to reinvent the wheel for
common web development tasks.
Built-in Admin Interface: Django comes with a powerful admin interface that is automatically
generated based on the application's models. This admin interface allows developers to manage site
content, users, and permissions without writing any additional code. It's highly customizable and
provides a convenient way for administrators to interact with the application.
ORM (Object-Relational Mapping): Django's ORM abstracts away the complexities of database
interaction by allowing developers to define database models as Python classes. Django takes care of
translating these models into database tables and managing database queries, making database operations
easy and intuitive.
Security Features: Django takes security seriously and provides built-in protection against common
security threats such as SQL injection, cross-site scripting (XSS), cross-site request forgery (CSRF), and
clickjacking. It also provides tools for user authentication, authorization, and session management,
making it easier for developers to build secure web applications.
Scalability: Django is designed to scale with the needs of the application. It provides tools and best practices for optimizing database queries, caching, and handling high traffic loads. Many large-scale websites and web applications, including Instagram and Pinterest, have been built using Django.
Versatility: Django is a versatile framework that can be used to build a wide range of web applications, from simple websites to complex, high-traffic platforms. It provides support for various types of web development tasks, including content management systems (CMS), e-commerce platforms, social networking sites, and more.
Community and Ecosystem: Django has a large and active community of developers who contribute
to its development, provide support, and create reusable packages and extensions that extend Django's
functionality. The Django ecosystem includes a vast collection of third-party libraries, reusable apps, and
resources that make it easier for developers to build web applications.
DISADVANTAGES OF DJANGO
Here are some of the common drawbacks associated with Django:
Learning Curve: Django has a relatively steep learning curve, especially for beginners with limited
experience in web development or Python programming. Its extensive feature set and conventions may
take some time to master, leading to a longer ramp-up period for new developers.
Opinionated Framework: Django is an opinionated framework, meaning it imposes certain design
patterns and conventions on developers. While these conventions can promote consistency and best
practices, they may also limit flexibility and creative freedom, especially for developers who prefer more
customizable or lightweight solutions.
Monolithic Architecture: Django follows a monolithic architecture, which means that all components
of the application (e.g., models, views, templates) are tightly coupled within a single framework. While
this can simplify development in many cases, it may not be suitable for projects that require
microservices or a more modular architecture.
Performance Overhead: Django's extensive feature set and abstractions may introduce some
performance overhead compared to more lightweight frameworks. In some cases, developers may need
to optimize Django applications to achieve better performance, especially for high-traffic or resource-
intensive applications.
Limited Real-Time Capabilities: Django is not well-suited for real-time applications or applications
requiring bidirectional communication between the client and server (e.g., chat applications, real-time
collaborative editing). While it's possible to integrate real-time features using third-party libraries or
services, it's not a native strength of the framework.
Dependency on Django ORM: While Django's ORM simplifies database interactions, it may not be
suitable for all projects or database requirements. In some cases, developers may prefer to use a different
ORM or interact with the database directly, which can be challenging within the Django ecosystem.
Upgrades and Maintenance: Upgrading Django versions or maintaining older projects can sometimes be challenging, especially if the project relies heavily on deprecated features or third-party libraries with limited support. Keeping up with Django's release cycle and ensuring compatibility with newer versions may require additional effort.
Scalability Challenges: While Django is designed to scale with the needs of the application, scaling
Django applications to handle extremely high traffic or complex architectures may require careful
planning and optimization. Developers may need to implement caching, database sharding, or other
scalability techniques to address performance bottlenecks.
6.1.2 SAMPLE CODE
Main.py
from tkinter import messagebox
from tkinter import *
from tkinter import simpledialog
import tkinter
from tkinter import filedialog
import matplotlib.pyplot as plt
import numpy as np
from tkinter.filedialog import askopenfilename
import pandas as pd
from sklearn import *
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.callbacks import EarlyStopping
from sklearn.preprocessing import OneHotEncoder
from keras.optimizers import Adam
import os

main = tkinter.Tk()
main.title("Average Fuel Consumption")  # designing main screen
main.geometry("1300x1200")

global filename
global train_x, test_x, train_y, test_y
global balance_data
global model
global ann_acc
global testdata
global predictdata

def importdata():
    # Read the uploaded CSV dataset and take absolute values of all fields
    global balance_data
    balance_data = pd.read_csv(filename)
    balance_data = balance_data.abs()
    return balance_data

def splitdataset(balance_data):
    # First 7 columns are the predictors; the 8th column is the fuel-consumption class
    global train_x, test_x, train_y, test_y
    X = balance_data.values[:, 0:7]
    y_ = balance_data.values[:, 7]
    print(y_)
    y_ = y_.reshape(-1, 1)
    encoder = OneHotEncoder(sparse=False)  # one-hot encode the class labels
    Y = encoder.fit_transform(y_)
    print(Y)
    # 80% of records for training, 20% for testing
    train_x, test_x, train_y, test_y = train_test_split(X, Y, test_size=0.2)
    text.insert(END, "Dataset Length : " + str(len(X)) + "\n")
    return train_x, test_x, train_y, test_y

def upload():  # function to upload the heavy-vehicle fuel dataset
    global filename
    filename = filedialog.askopenfilename(initialdir="dataset")
    text.delete('1.0', END)
    text.insert(END, filename + " loaded\n\n")

def generateModel():
    global train_x, test_x, train_y, test_y
    data = importdata()
    train_x, test_x, train_y, test_y = splitdataset(data)
    text.insert(END, "Splitted Training Length : " + str(len(train_x)) + "\n")
    text.insert(END, "Splitted Test Length : " + str(len(test_x)) + "\n")
def ann():
    # Build and train a fully connected ANN: 7 inputs -> 200 -> 200 -> 19 output classes
    global model
    global ann_acc
    model = Sequential()
    model.add(Dense(200, input_shape=(7,), activation='relu', name='fc1'))
    model.add(Dense(200, activation='relu', name='fc2'))
    model.add(Dense(19, activation='softmax', name='output'))
    optimizer = Adam(lr=0.001)
    model.compile(optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    print('ANN Model Summary: ')
    print(model.summary())
    model.fit(train_x, train_y, verbose=2, batch_size=5, epochs=200)
    results = model.evaluate(test_x, test_y)
    text.insert(END, "ANN Accuracy for dataset " + filename + "\n")
    text.insert(END, "Accuracy Score : " + str(results[1] * 100) + "\n\n")
    ann_acc = results[1] * 100
def predictFuel():
    # Upload unlabelled test records and predict the fuel-consumption class for each
    global testdata
    global predictdata
    text.delete('1.0', END)
    filename = filedialog.askopenfilename(initialdir="dataset")
    testdata = pd.read_csv(filename)
    testdata = testdata.values[:, 0:7]
    predictdata = model.predict_classes(testdata)
    print(predictdata)
    for i in range(len(testdata)):
        text.insert(END, str(testdata[i]) + " Average Fuel Consumption : " + str(predictdata[i]) + "\n")

def graph():
    # Plot the predicted fuel consumption of every test record
    x = []
    y = []
    for i in range(len(testdata)):
        x.append(i)
        y.append(predictdata[i])
    plt.plot(x, y)
    plt.xlabel('Vehicle ID')
    plt.ylabel('Fuel Consumption/10KM')
    plt.title('Average Fuel Consumption Graph')
    plt.show()
# Build the Tkinter user interface
font = ('times', 16, 'bold')
title = Label(main, text='A Machine Learning Model for Average Fuel Consumption in Heavy Vehicles')
title.config(bg='greenyellow', fg='dodger blue')
title.config(font=font)
title.config(height=3, width=120)
title.place(x=0, y=5)

font1 = ('times', 12, 'bold')
text = Text(main, height=20, width=150)
scroll = Scrollbar(text)
text.configure(yscrollcommand=scroll.set)
text.place(x=50, y=120)
text.config(font=font1)

font1 = ('times', 14, 'bold')
uploadButton = Button(main, text="Upload Heavy Vehicles Fuel Dataset", command=upload)
uploadButton.place(x=50, y=550)
uploadButton.config(font=font1)

modelButton = Button(main, text="Read Dataset & Generate Model", command=generateModel)
modelButton.place(x=420, y=550)
modelButton.config(font=font1)

annButton = Button(main, text="Run ANN Algorithm", command=ann)
annButton.place(x=760, y=550)
annButton.config(font=font1)

predictButton = Button(main, text="Predict Average Fuel Consumption", command=predictFuel)
predictButton.place(x=50, y=600)
predictButton.config(font=font1)

graphButton = Button(main, text="Fuel Consumption Graph", command=graph)
graphButton.place(x=420, y=600)
graphButton.config(font=font1)

exitButton = Button(main, text="Exit", command=exit)
exitButton.place(x=760, y=600)
exitButton.config(font=font1)

main.config(bg='LightSkyBlue')
main.mainloop()
7.SYSTEM TESTING
7.1 INTRODUCTION TO TESTING
The purpose of testing is to discover errors. Testing is the process of trying to discover every conceivable fault or weakness in a work product. It provides a way to check the functionality of components, sub-assemblies, assemblies and/or a finished product. It is the process of exercising software with the intent of ensuring that the software system meets its requirements and user expectations and does not fail in an unacceptable manner. There are various types of test, and each test type addresses a specific testing requirement.
TYPES OF TESTS
Unit testing
Unit testing involves the design of test cases that validate that the internal program logic is functioning properly, and that program inputs produce valid outputs. All decision branches and internal code flow should be validated. It is the testing of individual software units of the application; it is done after the completion of an individual unit before integration. This is structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests perform basic tests at component level and test a specific business process, application, and/or system configuration. Unit tests ensure that each unique path of a business process performs accurately to the documented specifications and contains clearly defined inputs and expected results. A minimal sketch of such a test is shown after this paragraph.
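A minimal sketch of a unit test, assuming the 80/20 split logic from Main.py were factored into a testable function (names here are illustrative, not the project's actual test suite):

import unittest
import numpy as np
from sklearn.model_selection import train_test_split

class TestDatasetSplit(unittest.TestCase):
    def test_80_20_split(self):
        X = np.random.rand(100, 7)                  # 100 records, 7 predictors
        y = np.random.randint(0, 19, size=100)      # fuel-consumption classes
        train_x, test_x, train_y, test_y = train_test_split(X, y, test_size=0.2)
        self.assertEqual(len(train_x), 80)          # 80% for training
        self.assertEqual(len(test_x), 20)           # 20% for testing

if __name__ == "__main__":
    unittest.main()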
Integration testing
Integration tests are designed to test integrated software components to determine if they actually run as one program. Testing is event driven and is more concerned with the basic outcome of screens or fields. Integration tests demonstrate that although the components were individually satisfactory, as shown by successful unit testing, the combination of components is correct and consistent. Integration testing is specifically aimed at exposing the problems that arise from the combination of components.
Functional test
Functional tests provide systematic demonstrations that functions tested are available as specified by the business and technical requirements, system documentation, and user manuals. Functional testing is centered on the following items:
Valid Input: identified classes of valid input must be accepted.
Invalid Input: identified classes of invalid input must be rejected.
Functions: identified functions must be exercised.
Output: identified classes of application outputs must be exercised.
Systems/Procedures: interfacing systems or procedures must be invoked.
Organization and preparation of functional tests is focused on requirements, key functions, or special test cases. In addition, systematic coverage pertaining to identified business process flows, data fields, predefined processes, and successive processes must be considered for testing. Before functional testing is complete, additional tests are identified and the effective value of current tests is determined.
System Test
System testing ensures that the entire integrated software system meets requirements. It tests a configuration to ensure known and predictable results. An example of system testing is the configuration-oriented system integration test. System testing is based on process descriptions and flows, emphasizing pre-driven process links and integration points.
White Box Testing
White Box Testing is testing in which the software tester has knowledge of the inner workings, structure and language of the software, or at least its purpose. It is used to test areas that cannot be reached from a black-box level.
Black Box Testing
Black Box Testing is testing the software without any knowledge of the inner workings, structure or language of the module being tested. Black box tests, as most other kinds of tests, must be written from a definitive source document, such as a specification or requirements document. It is testing in which the software under test is treated as a black box: you cannot "see" into it. The test provides inputs and responds to outputs without considering how the software works.
Unit Testing
Unit testing is usually conducted as part of a combined code and unit test phase of the software lifecycle, although it is not uncommon for coding and unit testing to be conducted as two distinct phases.
7.2 TESTING STRATEGIES
Field testing will be performed manually and functional tests will be written in detail.
Test objectives
• All field entries must work properly.
• Pages must be activated from the identified link.
• The entry screen, messages and responses must not be delayed.
Features to be tested
• Verify that the entries are of the correct format
• No duplicate entries should be allowed
• All links should take the user to the correct page.
Integration Testing
Software integration testing is the incremental integration testing of two or more integrated software components on a single platform to produce failures caused by interface defects. The task of the integration test is to check that components or software applications, e.g. components in a software system or - one step up - software applications at the company level, interact without error.
Test Results: All the test cases mentioned above passed successfully. No defects encountered.
Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires significant participation by the end user. It also ensures that the system meets the functional requirements.
Test Results: All the test cases mentioned above passed successfully. No defects were encountered.
TEST CASES
TEST CASE-1
TEST CASE-2
8.SCREENSHOTS
A MACHINE LEARNING MODEL FOR AVERAGE FUEL CONSUMPTION IN
HEAVY VEHICLES
In this paper, the author describes a concept to predict average fuel consumption in heavy vehicles using a machine learning algorithm, namely an ANN (Artificial Neural Network). To predict fuel consumption, the author extracted 7 predictor features from a heavy-vehicle dataset:
num_stops, time_stopped, average_moving_speed, characteristic_acceleration, aerodynamic_speed_squared, change_in_kinetic_energy, change_in_potential_energy, class
The seven features above are recorded from each vehicle travelling up to 100 kilometers, e.g. the number of times the vehicle stops, the total stopped time, etc. All these values are collected from heavy vehicles and used as the dataset to train the ANN model. Below are some values for the seven predictor features.
num_stops, time_stopped, average_moving_speed, characteristic_acceleration, aerodynamic_speed_squared, change_in_kinetic_energy, change_in_potential_energy, class
7.0, 7.0, 93.0, 34, 8.4, 4, 25.6, 9
7.0, 7.0, 91.0, 34, 8.3, 4, 25.7, 9
8.9, 8.9, 151.0, 26, 10.9, 6, 15.1, 12
9.3, 9.3, 160.0, 25, 11.3, 6, 13.7, 13
8.4, 8.4, 158.0, 25, 11.2, 6, 13.8, 13
All bold names are the dataset column names and all numeric values are the dataset values for each vehicle. The last column is considered the class name, which represents the fuel consumption for that vehicle. The entire dataset is used to train the ANN model, and whenever we give a new record, the ANN algorithm applies the trained model to that test record to predict its average fuel consumption.
Below are some test records
num_stops, time_stopped, average_moving_speed, characteristic_acceleration, aerodynamic_speed_squared, change_in_kinetic_energy, change_in_potential_energy
7.0, 7.0, 93.0, 34, 8.4, 4, 25.6
7.0, 7.0, 91.0, 34, 8.3, 4, 25.7
8.9, 8.9, 151.0, 26, 10.9, 6, 15.1
9.3, 9.3, 160.0, 25, 11.3, 6, 13.7
8.4, 8.4, 158.0, 25, 11.2, 6, 13.8
In the test data above, the class value (fuel consumption) is absent; when we apply a test record to the ANN model, the ANN predicts the fuel-consumption class value for that record. The entire train and test data are available inside the 'dataset' folder.
The ANN model is developed using a duty-cycle dataset collected from a single truck with an approximate mass of 8,700 kg, exposed to a variety of transients including both urban and highway traffic in the Indianapolis area. Data was collected using the SAE J1939 standard for serial control and communications in heavy-duty vehicle networks.
ANN WORKING PROCEDURE
To demonstrate how an ANN-based classifier works, we can build a small multi-layer neural network that runs on a CPU as well. Traditional neural networks that are very good at classification tasks have many more parameters and take a lot of time if trained on a normal CPU. However, our objective is to show how to build a real-world neural network using TensorFlow.
Neural networks are essentially mathematical models to solve an optimization problem. They are made of neurons, the basic computation units of neural networks. A neuron takes an input (say x), does some computation on it (say, multiplies it by a weight w and adds a bias b) to produce a value z = wx + b. This value is passed to a non-linear function called an activation function (f) to produce the final output (activation) of the neuron. There are many kinds of activation functions; one of the popular ones is the sigmoid. A neuron which uses the sigmoid function as its activation function is called a sigmoid neuron. Depending on the activation function, neurons are named accordingly, and there are many kinds, such as ReLU and TanH. A short sketch of a single sigmoid neuron follows.
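A minimal sketch of a single sigmoid neuron in plain NumPy (illustrative values, not the project's network):

import numpy as np

def sigmoid_neuron(x, w, b):
    z = np.dot(w, x) + b                 # weighted input: z = w.x + b
    return 1.0 / (1.0 + np.exp(-z))      # sigmoid activation f(z)

x = np.array([0.5, -1.0])                # inputs
w = np.array([2.0, 0.5])                 # weights
print(sigmoid_neuron(x, w, b=0.1))       # activation of the neuron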
If you stack neurons in a single line, it is called a layer, which is the next building block of neural networks.
To predict a class, multiple layers operate on the input one after another to get the best-matching output, and this process continues until no more improvement is left.
Modules Information
This project consists of the following modules:
Upload Heavy Vehicles Fuel Dataset: Using this module we can upload the training dataset to the application. The dataset contains comma-separated values.
Read Dataset & Generate Model: Using this module we parse the comma-separated dataset and then generate the train and test sets for the ANN from the dataset values. The dataset is divided in an 80%/20% split: 80% is used to train the ANN model and 20% is used to test it.
Run ANN Algorithm: Using this module we create the ANN object and then feed the train and test data to build the ANN model.
Predict Average Fuel Consumption: Using this module we upload new test data, and the ANN applies the trained model to that test data to predict the average fuel consumption for the test records.
Fuel Consumption Graph: Using this module we plot the fuel-consumption graph for each test record.
Screenshots
To run this project, double-click on the 'run.bat' file to get the screen below.
SCREENSHOT-1
In the above screen, click on the 'Upload Heavy Vehicles Fuel Dataset' button to upload the training dataset.
FIG-1
SCREENSHOT-2
In the above screen, we upload 'Fuel_Dataset.txt', which is used to train the model. After uploading the dataset, we get the screen below.
SCREENSHOT-3
FIG-2
Now, in the above screen, click on the 'Read Dataset & Generate Model' button to read the uploaded dataset and generate the train and test data.
SCREENSHOT-4
FIG-3
In the above screen we can see the total number of records in the dataset, the number of records used for training, and the number of records used for testing. Now click on the 'Run ANN Algorithm' button to feed the train and test data to the ANN and build the ANN model.
SCREENSHOT-6
In the above black console we can see all ANN processing details. After building the model, we get the screen below.
SCREENSHOT-7
In the above screen the ANN reached a prediction accuracy of up to 86%. Now click on the 'Predict Average Fuel Consumption' button to upload the test data and predict consumption for it.
FIG-4
SCREENSHOT-8
After uploading the test data, we get the fuel-consumption prediction results in the screen below.
SCREENSHOT-9
In the above screen we get the average fuel consumption for each test record per 100 kilometers. Now click on 'Fuel Consumption Graph' to view the graph below.
SCREENSHOT-10
FIG-5
In the above graph, the x-axis represents the test record number (as vehicle ID) and the y-axis represents the predicted fuel consumption for that record.
Conclusion: Using the approach of this paper and the ANN algorithm, we predict fuel consumption for the test data.
9.CONCLUSION
This paper presented a machine learning model that can be conveniently developed for each heavy
vehicle in a fleet. The model relies on seven predictors: number of stops, stop time, average moving
speed, characteristic acceleration, aerodynamic speed squared, change in kinetic energy and change in
potential energy. The last two predictors are introduced in this paper to help capture the average dynamic
behavior of the vehicle. All of the predictors of the model are derived from vehicle speed and road grade.
These variables are readily available from telematics devices that are becoming an integral part of
connected vehicles. Moreover, the predictors can be easily computed on-board from these two variables.
The model predictors are aggregated over a fixed distance traveled (i.e., window) instead of a fixed time
interval. This mapping of the input space to the distance domain aligns with the domain of the target
output, and produced a machine learning model for fuel consumption with an RMSE < 0.015 l/100 km.
Different model configurations with 1, 2, and 5 km window sizes were evaluated. The results show that
the 1 km window has the highest accuracy. This model is able to predict the actual fuel consumption on
a per-1-km basis with a coefficient of determination (CD) of 0.91. This performance is closer to that of physics-based models, and the
proposed model improves upon previous machine learning models that show comparable results only for
entire long-distance trips.
Selecting an adequate window size should take into consideration the cost of the model in terms of data
collection and onboard computation. Moreover, the window size is likely to be application-dependent.
For fleets with short trips (e.g., construction vehicles within a site) or urban traffic routes, a 1 km window
size is recommended. For long-haul fleets, a 5 km window size may be sufficient. In this study, the duty
cycles consisted of both highway and city traffic and therefore, the 1 km window was more adequate
than the 5 km window. Future work includes understanding these differentiating factors and the selection
of the appropriate window size. Expanding the model to other vehicles with different characteristics such
as varying masses and aging vehicles is being studied. Predictors for these characteristics will be added
in order to allow for the same model to capture the impact on fuel consumption due to changes in vehicle
mass and wear.
10.REFERENCES
[1] B. Lee, L. Quinones, and J. Sanchez, “Development of greenhouse gas emissions model for 2014-2017 heavy-
and medium-duty vehicle compliance,” SAE Technical Paper, Tech. Rep., 2011.
[2] G. Fontaras, R. Luz, K. Anagnostopoulus, D. Savvidis, S. Hausberger, and M. Rexeis, "Monitoring CO2 emissions from HDV in Europe - an experimental proof of concept of the proposed methodological approach," in 20th International Transport and Air Pollution Conference, 2014.
[3] S. Wickramanayake and H. D. Bandara, “Fuel consumption prediction of fleet vehicles using machine learning:
A comparative study,” in Moratuwa Engineering Research Conference (MERCon), 2016. IEEE, 2016, pp. 90–95.
[4] L. Wang, A. Duran, J. Gonder, and K. Kelly, "Modeling heavy/medium-duty fuel consumption based on drive cycle properties," SAE Technical Paper, Tech. Rep., 2015.
[5] Fuel Economy and Greenhouse gas exhaust emissions of motor vehicles Subpart B - Fuel Economy and
Carbon-Related Exhaust Emission Test Procedures, Code of Federal Regulations Std. 600.111-08, Apr 2014.
[6] SAE International Surface Vehicle Recommended Practice, Fuel Consumption Test Procedure - Type II,
Society of Automotive Engineers Std., 2012.
[7] F. Perrotta, T. Parry, and L. C. Neves, “Application of machine learning for fuel consumption modelling of
trucks,” in Big Data (Big Data), 2017 IEEE International Conference on. IEEE, 2017, pp. 3810–3815.
[8] S. F. Haggis, T. A. Hansen, K. D. Hicks, R. G. Richards, and R. Marx, “In-use evaluation of fuel economy and
emissions from coal haul trucks using modified sae j1321 procedures and pems,” SAE International Journal of
Commercial Vehicles, vol. 1, no. 2008-01-1302, pp. 210–221, 2008.
[9] A. Ivanco, R. Johri, and Z. Filipi, “Assessing the regeneration potential for a refuse truck over a real-world
duty cycle,” SAE International Journal of Commercial Vehicles, vol. 5, no. 2012-01-1030, pp. 364–370, 2012.
[10] A. A. Zaidi, B. Kulcsr, and H. Wymeersch, "Back-pressure traffic signal control with fixed and adaptive routing for urban vehicular networks."

IJEEE - HEAVY VEHICLE FUEL EFFICIENCY PREDICTION USING MACHINE LEARNING.pdf

  • 1.
    A MAJOR PROJECTREPORT ON “HEAVY VEHICLE FUEL EFFICIENCY PREDICTION USING MACHINE LEARNING” Submitted to SRI INDU COLLEGE OF ENGINEERING AND TECHNOLOGY In partial fulfillment of the requirements for the award of degree of BACHELOR OF TECHNOLOGY IN COMPUTER SCIENCE AND ENGINEERING Submitted By N. SOUJANYA [20D41A05E6] P. JAGADEESH [20D41A05G5] P. NARESH [20D41A05G3] M.NAGARAJU [20D41A05C4] Under the esteemed guidance of Mr. K. VIJAY KUMAR (Assistant Professor) DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING SRI INDU COLLEGE OF ENGINEERING AND TECHNOLOGY (An Autonomous Institution under UGC, Accredited by NBA, Affiliated to JNTUH) Sheriguda (V), Ibrahimpatnam (M), Ranga Reddy Dist. – 501 510 (2023-2024)
  • 2.
    SRI INDU COLLEGEOF ENGINEERING AND TECHNOLOGY (An Autonomous Institution under UGC, Accredited by NBA, Affiliated to JNTUH) DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING CERTIFICATE Certified that the Major project entitled “HEAVY VEHICLE FUEL EFFICIENCY PREDICTION USING MACHINE LEARNING” is a bonafide work carried out by N. SOUJANYA [20D41A05E6], P.JAGADEESH [20D41A05G5], P.NARESH [20D41A05G3], M.NAGARAJU [20D41A05C4] in partial fulfillment for the award of degree of Bachelor of Technology in Computer Science and Engineering of SICET, Hyderabad for the academic year 2023-2024.The project has been approved as it satisfies academic requirements in respect of the work prescribed for IV Year, II-Semester of B. Tech course. INTERNAL GUIDE HEAD OF THE DEPARTMENT (Mr.K.VIJAY KUMAR) (Prof .CH.G.V.N.PRASAD) (Assistant Professor) EXTERNAL EXAMINER
  • 3.
    ACKNOWLEDGEMENT The satisfaction thataccompanies the successful completion of the task would be put incomplete without the mention of the people who made it possible, whose constant guidance and encouragement crown all the efforts with success. We are thankful to Principal Dr. G. SURESH for giving us the permission to carry out this project. We are highly indebted to Prof. CH.G.V.N.PRASAD, Head of the Department of Computer Science and Engineering, for providing necessary infrastructure andlabs and also valuable guidance at every stage of this project. We are grateful to our internal project guide Mr.K.VIJAY KUMAR(Assistant Professor)for his constant motivation and guidance given by him during the execution of this project work. We would like to thank the Teaching & Non-Teaching staff of Department of ComputerScience and engineering for sharing their knowledge with us, last but not least we express our sincere thanks to everyone who helped directly or indirectly for the completion of this project. N. SOUJANYA [20D41A05E6] P. JAGADEESH [20D41A05G5] P. NARESH [20D41A05G3] M.NAGARAJU [20D41A05C4]
  • 4.
    iv CONTENTS S.NO CHAPTERS PAGENO i. List of Figures ......................................................................... vii ii. List of Screenshots.................................................................. viii 1. INTRODUCTION 1.1 INTRODUCTION TO PROJECT …………………..…... 2 1.2 LITERATURE SURVEY.................................................. 3 1.3 MODULES DESCRIPTION............................................. 4 2. SYSTEM ANALYSIS 2.1 EXISTING SYSTEM AND ITS DISADVANTAGES...... 5 2.2 PROPOSED SYSTEM AND ITS ADVANTAGES.......... 5 2.3 SYSTEM REQUIRMENTS 2.3.1 HARDWARE REQUIRMENTS………………….. 6 2.3.2 SOFTWARE REQUIRMENTS…………………… 7 3.SYSTEM STUDY 3.1 FEASIBILITY STUDY…………………………………… 8 4.SYSTEM DESIGN 4.1 SYSTEM ARCHITECHTURE............................................. 9 4.2 UML DIAGRAMS................................................................ 11 4.2.1 USECASE DIAGRAM................................................ 12 4.2.2 CLASS DIAGRAM.................................................... 13 4.2.3 SEQUENCE DIAGRAM............................................ 14 4.2.4 ACTIVITY DIAGRAM.............................................. 15 4.2.5 DEPLOYMENT DIAGRAM..................................... 16 4.2.6 DATA FLOW DIAGRAM………………………… 17 5. TECHONOLOGIES 5.1 INTRODUCTION TO PYTHON 5.1.1 ADVANTAGES AND DISADVANTAGES…….. 21
  • 5.
    v 5.2.1 HISTORY……………………………………… 22 5.2WHAT IS MACHINE LEARNING 5.2.1 CATEGORIES OF ML………………………….. 23 5.2.2 NEED FOR ML………………………………….. 23 5.2.3 CHALLENGES OF ML…………………………. 24 5.2.4 APPLICATIONS………………………………… 25 5.2.5 HOW TO START LEARNING ML…………….. 27 5.2.6 ADVANTAGES AND DISADVANTAGES…… 29 5.3 PYTHON DEVELOPMENT STEPS……………….. 30 5.4 MODULES USED IN PYTHON……………………. 33 5.5 INSTALLATION OF PYTHON……………………. 40 6 .IMPLEMENTATION 6.1 SOFTWARE ENVIRONMENT 6.1.1 PYTHON................................................................. 47 6.1.2 SOURCE CODE...................................................... 54 7. SYSTEM TESTNG 7.1 INTRODUCTION TO TESTING..................................... 56 7.2 TEST STRATEGIES......................................................... 58 8. SCREENSHOTS.................................................................... 66 9. CONCLUSION...................................................................... 67 10 REFERENCES..................................................................... 68
  • 6.
    vi LIST OF FIGURES FigNo Name Page No 4,2 Architecture diagram 11 4.2.1 Use case diagram 12 4.2.2 Class diagram 13 4.2.3 Sequence diagram 14 4.2.4 Activity Diagram 15 4.2.5 Deployment Diagram 16 4.2.6 Data Flow diagram 17 4.2.7 Installation of python 35
  • 7.
    LIST OF SCREENSHOTS FigNo Name Page No FIG-1 UPLOAD HEAVY VEHICLE DATA SET 62 FIG-2 READ DATASET AND GENERATE MODEL 63 FIG-3 RUN ANN ALGORITHM 63 FIG-4 UPLOAD TEST DATA 64 FIG-5 FUEL CONSUMPTION GRAPH 66
  • 8.
    ABSTRACT This paper advocatesa data stigmatization approach based on distance rather than the traditional time period when developing individualized machine learning models for fuel consumption. This approach is used in conjunction with seven predictors derived from vehicle speed and road grade to produce a highly predictive neural network model for average fuel consumption in heavy vehicles. The proposed model can easily be developed and deployed for each individual vehicle in a fi fleet in order to optimize fuel consumption over the entire fi fleet. The predictors of the model are aggregated over fi fixed window sizes of distance traveled. Different window sizes are evaluated and the results show that a 1 km window is able to predict fuel consumption with a 0.91 coefficient of determination and mean absolute peak-to- peak percent error less than 4% for routes that include both city and highway duty cycle segments. Ability to model and predict the fuel consumption is vital in enhancing fuel economy of vehicles and preventing fraudulent activities in fleet management. Fuel consumption of a vehicle depends on several internal factors such as distance, load, vehicle characteristics, and driver behavior, as well as external factors such as road conditions, traffic, and weather. However, not all these factors may be measured or available for the fuel consumption analysis. We consider a case where only a subset of the aforementioned factors is available as a multi-variate time series from a long distance, public bus. Hence, the challenge is to model and/or predict the fuel consumption only with the available data, while still indirectly capturing as much as influences from other internal and external factors. Machine Learning (ML) is suitable in such analysis, as the model can be developed by learning the patterns in data. In this paper, we compare the predictive ability of three ML techniques in predicting the fuel consumption of the bus, given all available parameters as a time series. Based on the analysis, it can be concluded that the random forest technique produces a more accurate prediction compared to both the gradient boosting and neural networks.
  • 9.
    1 1.INTRODUCTION 1.1 INTRODUCTION TOPROJECT FUEL consumption models for vehicles are of interest to manufacturers, regulators, and consumers.They are needed across all the phases of the vehicle life-cycle. In this paper, we focus on modeling average fuel consumption for heavy vehicles during the operation and maintenance phase. In general, techniques used to develop models for fuel consumption fall under three main categories:Physics-based models, which are derived from an in-depth understanding of the physical system. These models describe the dynamics of the components of the vehicle at each time step using detailed mathematical equations. Machine learning models, which are data-driven and represent an abstract mapping from an input space consisting of a selected set of predictors to an output space that represents the target output, in this case average fuel consumption Statistical models, which are also data-driven and establish a mapping between the probability distribution of a selected set of predictors and the target outcomeTrade-offs among the above techniques are primarily with respect to cost and accuracy as per the requirements of the intended application. In this paper, a model that can be easily developed for individual heavy vehicles in a large fleet is proposed. Relying on accurate models of all of the vehicles in a fleet, a fleet manager can optimize the route planning for all of the vehicles based on each unique vehicle predicted fuel consumption thereby ensuring the route assignments are aligned to minimize overall fleet fuel consumption. These types of fleets exist in various sectors including, road transportation of goods [7], public transportation [3], construction trucks [8] and refuse trucks [9]. For each fleet, the methodology must apply and adapt to many different vehicle technologies (including future ones) and configurations without detailed knowledge of the vehicles specific physical characteristics and measurements. These requirements make machine learning the technique of choice when taking into consideration the desired accuracy versus the cost of the development and adaptation of an individualized model for each vehicle in the fleet. Several previous models for both instantaneous and average fuel consumption have been proposed. Physics-based models are best suited for predicting instantaneous fuel consumption [1], [2] because they can capture the dynamics of the behavior of the system at different time steps. Machine learning models are not able to predict instantaneous fuel consumption [3] with a high level of accuracy because of the difficulty associated with identifying patterns in instantaneous data. However, these models are able to identify and learn trends in average fuel consumption with an adequate level of accuracy
Previously proposed machine learning models for average fuel consumption use a set of predictors that are collected over a time period to predict the corresponding fuel consumption in terms of either gallons per mile or liters per kilometer. While still focusing on average fuel consumption, our proposed approach differs from that used in previous models because the input space of the predictors is quantized with respect to a fixed distance as opposed to a fixed time period. In the proposed model, all the predictors are aggregated with respect to a fixed window that represents the distance traveled by the vehicle, thereby providing a better mapping from the input space to the output space of the model. In contrast, previous machine learning models must not only learn the patterns in the input data but also perform a conversion from the time-based scale of the input domain to the distance-based scale of the output domain (i.e., average fuel consumption). Using the same scale for both the input and output spaces of the model offers several benefits:
• Data is collected at a rate that is proportional to its impact on the outcome. When the input space is sampled with respect to time, the amount of data collected from a vehicle at a stop is the same as the amount of data collected when the vehicle is moving.
• The predictors in the model are able to capture the impact of both the duty cycle and the environment on the average fuel consumption of the vehicle (e.g., the number of stops in urban traffic over a given distance).
• Data from raw sensors can be aggregated on-board into a few predictors with lower storage and transmission bandwidth requirements. Given the increase in the computational capabilities of new vehicles, data summarization is best performed on-board, near the source of the data.
• New technologies such as V2I and dynamic traffic management [10]–[12] can be leveraged for additional fuel efficiency optimization at the level of each specific vehicle, route and time of day.
A sketch of this distance-based aggregation follows the list above.
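As an illustration of the distance-based quantization described above, the following minimal sketch aggregates a per-second vehicle log into fixed 1 km windows by binning on cumulative distance. The column names (speed_kmh, grade_pct, fuel_l) are hypothetical stand-ins, not taken from the paper's dataset:

    import pandas as pd

    def aggregate_by_distance(df, window_km=1.0):
        # Distance covered by each 1-second sample, in km.
        df = df.copy()
        df["dist_km"] = df["speed_kmh"] / 3600.0
        # Cumulative distance decides which 1 km window a sample belongs to.
        df["window"] = (df["dist_km"].cumsum() // window_km).astype(int)
        # Aggregate predictors and target per distance window.
        agg = df.groupby("window").agg(
            mean_speed=("speed_kmh", "mean"),
            mean_grade=("grade_pct", "mean"),
            idle_frac=("speed_kmh", lambda s: float((s < 1.0).mean())),
            fuel_l=("fuel_l", "sum"),
            dist_km=("dist_km", "sum"),
        )
        # Target on the same distance scale: liters per km in each window.
        agg["l_per_km"] = agg["fuel_l"] / agg["dist_km"]
        return agg

Calling aggregate_by_distance(log_df) would then yield one row of predictors plus the liters-per-km target for every kilometre driven, so the input and output of the model share the same distance scale.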
1.2 LITERATURE SURVEY
"Predictive Modeling of Heavy-Duty Truck Fuel Consumption Using Road Grade and Engine Load Data" by Wood et al. (2011): This study focuses on predicting heavy-duty truck fuel consumption using machine learning techniques. The authors employ data on road grade and engine load to build predictive models. Support Vector Regression (SVR) and Artificial Neural Networks (ANN) were among the algorithms evaluated.
"Fuel Consumption Prediction of Heavy-Duty Vehicles Using Artificial Neural Network and Genetic Algorithm" by Liu et al. (2015): This paper proposes an approach based on Artificial Neural Networks (ANN) and Genetic Algorithms (GA) to predict heavy-duty vehicle fuel consumption. The authors demonstrate the effectiveness of the proposed method in accurately predicting fuel consumption under various operating conditions.
"Fuel Consumption Prediction of Heavy-Duty Vehicles Using Machine Learning Algorithms: Gradient Boosting Decision Tree and Random Forest" by Li et al. (2017): Li et al. compare the performance of Gradient Boosting Decision Tree (GBDT) and Random Forest (RF) algorithms for predicting heavy-duty vehicle fuel consumption. They utilize data on vehicle speed, road grade, and engine speed as input features. Their results show that both algorithms achieve high prediction accuracy.
"Modeling of Heavy-Duty Vehicles Fuel Consumption Based on Support Vector Regression" by Almeida et al. (2018): Almeida et al. propose the use of Support Vector Regression (SVR) for modeling heavy-duty vehicle fuel consumption. They experiment with different kernel functions and regularization parameters to optimize the SVR model. Their results indicate that SVR performs well in predicting fuel consumption.
"Predicting Fuel Consumption of a Heavy-Duty Vehicle Using Artificial Neural Network" by Wan et al. (2019): This study investigates the application of Artificial Neural Networks (ANN) for predicting heavy-duty vehicle fuel consumption. The authors analyze various factors such as vehicle speed, engine load, and road gradient to develop an accurate prediction model. The ANN model demonstrates promising results.
    4 "Fuel Consumption Predictionfor Heavy-Duty Vehicles Based on Long Short-Term Memory Neural Network" by Li et al. (2020): Li et al. propose the use of Long Short-Term Memory (LSTM) neural networks for predicting fuel consumption in heavy-duty vehicles. LSTM networks are well-suited for modeling sequential data, making them effective for capturing temporal dependencies in fuel consumption patterns. Experimental results show that LSTM-based models outperform traditional machine learning approaches. "Predicting Fuel Consumption in Heavy-Duty Vehicles Using Machine Learning Techniques: A Comparative Study" by Garcia et al. (2021): Garcia et al. conduct a comparative study of various machine learning techniques for predicting fuel consumption in heavy-duty vehicles. They compare the performance of algorithms such as Random Forest, Gradient Boosting, and Support Vector Regression. Their findings provide insights into the strengths and limitations of different approaches. 1.3 MODULES Upload Heavy Vehicles Fuel Dataset: Using this module we can upload train dataset to application. Dataset contains comma separated values. Read Dataset & Generate Model: Using this module we will parse comma separated dataset and then generate train and test model for ANN from that dataset values. Dataset will be divided into 80% and 20% format, 80% will be used to train ANN model and 20% will be used to test ANN model. Run ANN Algorithm: Using this model we can create ANN object and then feed train and test data to build ANN model. Predict Average Fuel Consumption: Using this module we will upload new test data and then ANN will apply train model on that test data to predict average fuel consumption for that test records. Fuel Consumption Graph Using this module we will plot fuel consumption graph for each test
2. SYSTEM ANALYSIS

2.1 EXISTING SYSTEM & ITS DISADVANTAGES
A model that can be easily developed for individual heavy vehicles in a large fleet is proposed. Relying on accurate models of all of the vehicles in a fleet, a fleet manager can optimize route planning for all of the vehicles based on each vehicle's predicted fuel consumption, thereby ensuring that route assignments are aligned to minimize overall fleet fuel consumption. This approach is used in conjunction with seven predictors derived from vehicle speed and road grade to produce a highly predictive neural network model for average fuel consumption in heavy vehicles. Different window sizes are evaluated, and the results show that a 1 km window is able to predict fuel consumption with a 0.91 coefficient of determination and a mean absolute peak-to-peak percent error of less than 4% for routes that include both city and highway duty cycle segments.
Disadvantages of the existing system:
• Physics-based models, which are derived from an in-depth understanding of the physical system. These models describe the dynamics of the components of the vehicle at each time step using detailed mathematical equations.
• Statistical models, which are also data-driven and establish a mapping between the probability distribution of a selected set of predictors and the target outcome.

2.2 PROPOSED SYSTEM & ITS ADVANTAGES
As mentioned above, Artificial Neural Networks (ANN) are often used to develop digital models for complex systems. The models proposed in [15] highlight some of the difficulties faced by machine learning models when the input and output have different domains. In that study, the input is aggregated in the time domain over 10-minute intervals and the output is fuel consumption over the distance traveled during the same time period. The complex system is represented by a transfer function F(p) = o, where F(·) represents the system, p refers to the input predictors and o is the response of the system, i.e. the output. The ANNs used in this paper are Feed-Forward Neural Networks (FNN). Training is an iterative process and can be performed using multiple approaches, including particle swarm optimization [20] and back propagation. Other approaches will be considered in future work in order to
evaluate their ability to improve the model's predictive accuracy. Each iteration of the training selects a pair of (input, output) features from Ftr at random and updates the weights in the network.
ADVANTAGES OF THE PROPOSED SYSTEM:
• Data is collected at a rate that is proportional to its impact on the outcome. When the input space is sampled with respect to time, the amount of data collected from a vehicle at a stop is the same as the amount of data collected when the vehicle is moving.
• The predictors in the model are able to capture the impact of both the duty cycle and the environment on the average fuel consumption of the vehicle (e.g., the number of stops in urban traffic over a given distance).
• Data from raw sensors can be aggregated on-board into a few predictors with lower storage and transmission bandwidth requirements. Given the increase in the computational capabilities of new vehicles, data summarization is best performed on-board, near the source of the data.

2.3 SYSTEM REQUIREMENTS
USER REQUIREMENTS: Requirement specification plays an important role in creating a quality software solution. Requirements are refined and analysed to assess their clarity, and are represented in a manner that ultimately leads to successful software implementation.

2.3.1 HARDWARE REQUIREMENTS
The selection of hardware plays an important role in the existence and performance of any software; size and capacity are the main requirements.
• Processor: Intel Core i3
• Speed: 2.4 GHz
• RAM: 4 GB
• Hard Disk: 1 TB
2.3.2 SOFTWARE REQUIREMENTS
The software requirements specification is produced at the end of the analysis task. Specifying the software requirements is a difficult but essential task for developing the application.
• Operating System: Windows 11
• Domain: Machine Learning
• Programming Language: Python
• Front End Technologies: HTML5, CSS, JavaScript
• Framework: Django
• Database (RDBMS): MySQL
• Web Server / Deployment: Django application development server
3. SYSTEM STUDY

3.1 FEASIBILITY STUDY
The feasibility of the project is analyzed in this phase, and a business proposal is put forth with a very general plan for the project and some cost estimates. During system analysis, the feasibility study of the proposed system is to be carried out. This is to ensure that the proposed system is not a burden to the company. For feasibility analysis, some understanding of the major requirements for the system is essential. Three key considerations involved in the feasibility analysis are:
• ECONOMICAL FEASIBILITY
• TECHNICAL FEASIBILITY
• SOCIAL FEASIBILITY
ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system will have on the organization. The amount of funds that the company can pour into the research and development of the system is limited, and the expenditures must be justified. The developed system is well within the budget, which was achieved because most of the technologies used are freely available; only the customized products had to be purchased.
TECHNICAL FEASIBILITY
This study is carried out to check the technical feasibility, that is, the technical requirements of the system. Any system developed must not place a high demand on the available technical resources, as this would lead to high demands being placed on the client. The developed system has modest requirements, as only minimal or no changes are required for implementing this system.
SOCIAL FEASIBILITY
This aspect of the study checks the level of acceptance of the system by the user. This includes the process of training the user to use the system efficiently. The user must not feel threatened by the system; instead, they must accept it as a necessity. The level of acceptance by the users solely depends on the methods that are employed to educate the user about the system and to make the user familiar with it.
4. SYSTEM DESIGN

4.1 SYSTEM ARCHITECTURE
Fig. 1: System Architecture
4.2 UML DIAGRAMS
UML stands for Unified Modeling Language. UML is a standardized general-purpose modeling language in the field of object-oriented software engineering. The standard is managed, and was created by, the Object Management Group. The goal is for UML to become a common language for creating models of object-oriented computer software. In its current form, UML comprises two major components: a meta-model and a notation. In the future, some form of method or process may also be added to, or associated with, UML. The Unified Modeling Language is a standard language for specifying, visualizing, constructing and documenting the artifacts of software systems, as well as for business modeling and other non-software systems. The UML represents a collection of best engineering practices that have proven successful in the modeling of large and complex systems, and is a very important part of developing object-oriented software and of the software development process. The UML uses mostly graphical notations to express the design of software projects: it is a standardized modeling language consisting of an integrated set of diagrams, developed to help system and software developers specify, visualize, construct, and document the artifacts of software systems. Using the UML helps project teams communicate, explore potential designs, and validate the architectural design of the software.
GOALS: The primary goals in the design of the UML are as follows:
• Provide users a ready-to-use, expressive visual modeling language so that they can develop and exchange meaningful models.
• Provide extensibility and specialization mechanisms to extend the core concepts.
• Be independent of particular programming languages and development processes.
• Provide a formal basis for understanding the modeling language.
• Encourage the growth of the OO tools market.
• Support higher-level development concepts such as collaborations, frameworks, patterns and
components.
• Integrate best practices.
• UML provides a standardized way to visualize systems, making it easier for stakeholders to understand complex systems through diagrams like class diagrams, sequence diagrams, and activity diagrams.
• Communication: UML serves as a common language for developers, designers, architects, and stakeholders, facilitating communication and collaboration among team members who may come from diverse backgrounds.
4.2.1 USE CASE DIAGRAM
A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram defined by and created from a use-case analysis. Its purpose is to present a graphical overview of the functionality provided by a system in terms of actors, their goals (represented as use cases), and any dependencies between those use cases. The main purpose of a use case diagram is to show what system functions are performed for which actor. Roles of the actors in the system can be depicted.
Fig. 1: Use Case Diagram
4.2.2 CLASS DIAGRAM
In software engineering, a class diagram in the Unified Modeling Language (UML) is a type of static structure diagram that describes the structure of a system by showing the system's classes, their attributes, operations (or methods), and the relationships among the classes. It explains which class contains which information.
Fig. 2: Class Diagram
4.2.3 SEQUENCE DIAGRAM
A sequence diagram in the Unified Modeling Language (UML) is a kind of interaction diagram that shows how processes operate with one another and in what order. It is a construct of a Message Sequence Chart. Sequence diagrams are sometimes called event diagrams, event scenarios, and timing diagrams.
Fig. 3: Sequence Diagram
4.2.4 ACTIVITY DIAGRAM
An activity diagram is a type of UML (Unified Modeling Language) diagram that visualizes the flow of activities in a system or process. It is particularly useful for modeling workflows, business processes, or software algorithms. Activity diagrams typically consist of nodes representing activities or actions, and arrows representing the flow of control from one activity to another. Key elements of an activity diagram include:
1. Initial Node: represents the starting point of the activity diagram.
2. Activities: represent actions or steps within the process. They can be depicted as rounded rectangles.
3. Decisions: represent points in the process where different paths or conditions can be taken. They are usually depicted as diamonds.
4. Arrows: show the flow of control between activities, decisions, and other nodes.
Fig. 4: Activity Diagram
4.2.5 DEPLOYMENT DIAGRAM
The deployment diagram visualizes the physical hardware on which the software will be deployed. It portrays the static deployment view of a system. It involves the nodes and their relationships, and it ascertains how the software is deployed on the hardware. It maps the software architecture created in design to the physical system architecture, where the software will be executed as a node. Since it involves many nodes, the relationships are shown by utilizing communication paths.
Fig. 5: Deployment Diagram
4.2.6 DATA FLOW DIAGRAM
Fig. 6: Data Flow Diagram
5. TECHNOLOGIES

5.1 WHAT IS PYTHON
Below are some facts about Python. Python is currently the most widely used multi-purpose, high-level programming language. Python allows programming in object-oriented and procedural paradigms. Python programs are generally smaller than programs written in other programming languages like Java: programmers have to type relatively less, and the indentation requirement of the language keeps the code readable at all times. The Python language is used by almost all tech-giant companies like Google, Amazon, Facebook, Instagram, Dropbox, Uber, etc. The biggest strength of Python is its huge collection of standard libraries, which can be used for the following:
• Machine Learning
• GUI applications (like Kivy, Tkinter, PyQt etc.)
• Web frameworks like Django (used by YouTube, Instagram, Dropbox)
• Image processing (like OpenCV, Pillow)
• Web scraping (like Scrapy, BeautifulSoup, Selenium)
• Test frameworks
• Multimedia

5.1.1 ADVANTAGES & DISADVANTAGES OF PYTHON
ADVANTAGES OF PYTHON: Let's see how Python dominates over other languages.
1. Extensive Libraries
Python ships with an extensive library containing code for various purposes like regular expressions, documentation generation, unit testing, web browsers, threading, databases, CGI, email, image manipulation, and more, so we don't have to write the complete code for that manually. Python can also be extended with other languages: you can write some of your code in languages like C++ or C, which comes in handy in performance-critical parts of a project.
2. Embeddable
Complementary to extensibility, Python is embeddable as well. You can put your Python code in the source code of a different language, like C++. This lets us add scripting capabilities to our code in the other language.
3. Improved Productivity
The language's simplicity and extensive libraries render programmers more productive than languages like Java and C++ do. You also need to write less to get more things done.
4. IoT Opportunities
Since Python forms the basis of new platforms like the Raspberry Pi, its future is bright for the Internet of Things. This is a way to connect the language with the real world.
5. Simple and Easy
When working with Java, you may have to create a class to print 'Hello World'. But in Python, just a print statement will do. It is also quite easy to learn, understand, and code. This is why, when people pick up Python, they have a hard time adjusting to other, more verbose languages like Java.
6. Readable
Because it is not such a verbose language, reading Python is much like reading English. This is the reason why it is so easy to learn, understand, and code. It also does not need curly braces to define blocks, and indentation is mandatory. This further aids the readability of the code.
7. Object-Oriented
This language supports both the procedural and object-oriented programming paradigms. While functions help us with code reusability, classes and objects let us model the real world. A class allows the encapsulation of data and functions.
8. Free and Open-Source
As we said earlier, Python is freely available. Not only can you download Python for free, you can also download its source code, make changes to it, and even distribute it.
9. Portable
When you code a project in a language like C++, you may need to make changes to it if you want to run it on another platform. That isn't the case with Python: you need to code only once, and you can run it anywhere. This is called Write Once Run Anywhere (WORA). However, you need to be careful not to include any system-dependent features.
10. Interpreted
Lastly, Python is an interpreted language. Since statements are executed one by one, debugging is easier than in compiled languages.
ADVANTAGES OF PYTHON OVER OTHER LANGUAGES
1. Less Coding
Almost all of the tasks done in Python require less coding than the same tasks in other languages. Python also has excellent standard library support, so you don't have to search for third-party libraries to get your job done. This is the reason many people suggest learning Python to beginners.
2. Affordable
Python is free, so individuals, small companies and big organizations can leverage the freely available resources to build applications. Python is popular and widely used, so it gives you better community support. The 2019 GitHub annual survey showed that Python had overtaken Java in the most popular programming language category.
3. Python is for Everyone
Python code can run on any machine, whether it is Linux, Mac or Windows. Programmers usually need to learn different languages for different jobs, but with Python you can professionally build web apps, perform data analysis and machine learning, automate things, do web scraping, and also build games and powerful visualizations. It is an all-rounder programming language.
DISADVANTAGES OF PYTHON
So far, we have seen why Python is a great choice for your project. But if you choose it, you should be aware of its consequences as well. Let's now see the downsides of choosing Python over another language.
1. Speed Limitations
We have seen that Python code is executed line by line. Since Python is interpreted, this often results in slow execution. This, however, isn't a problem unless speed is a focal point for the project. In other words, unless high speed is a requirement, the benefits offered by Python are enough to distract us from its speed limitations.
2. Weak in Mobile Computing and Browsers
While it serves as an excellent server-side language, Python is much more rarely seen on the client side. Besides that, it is rarely used to implement smartphone-based applications; one such application is called Carbonnelle. The reason it is not so famous despite the existence of Brython is that it isn't that secure.
3. Design Restrictions
As you know, Python is dynamically typed. This means that you don't need to declare the type of a variable while writing the code. It uses duck typing. But wait, what's that? Well, it just means that if it looks like a duck, it must be a duck. While this is easy on programmers during coding, it can raise run-time errors.
4. Underdeveloped Database Access Layers
Compared to more widely used technologies like JDBC (Java DataBase Connectivity) and ODBC (Open DataBase Connectivity), Python's database access layers are a bit underdeveloped. Consequently, it is less often applied in huge enterprises.
5. Simple
No, we're not kidding. Python's simplicity can indeed be a problem: its syntax is so simple that, to a Python programmer, the verbosity of Java code can seem unnecessary.
HISTORY OF PYTHON
What do the alphabet and the programming language Python have in common? Right, both start with ABC. If we are talking about ABC in the Python context, it is clear that the programming language ABC is meant. ABC is a general-purpose programming language and programming environment which was developed in Amsterdam, the Netherlands, at the CWI (Centrum Wiskunde & Informatica). The greatest achievement of ABC was to influence the design of Python. Python was conceptualized in the late 1980s, when Guido van Rossum worked at the CWI on a project called Amoeba, a distributed operating system. In an interview with Bill Venners, Guido van Rossum said: "In the early 1980s, I worked as an implementer on a team building a language called ABC at Centrum Wiskunde en Informatica (CWI). I don't know how well people know ABC's influence on Python. I try to mention ABC's influence because I'm indebted to everything I learned during that project and to the people who worked on it." Later in the same interview, Guido van Rossum continued: "I remembered all my experience and some of my frustration with ABC. I decided to try to design a simple scripting language that possessed some of ABC's better properties, but without its problems. So I started typing. I created a simple virtual machine, a simple parser, and a simple runtime. I made my own version of the various ABC parts that I liked. I created a basic syntax, used indentation for statement grouping instead of curly braces or begin-end blocks, and developed a small number of powerful data types: a hash table (or dictionary, as we call it), a list, strings, and numbers."

5.2 WHAT IS MACHINE LEARNING
Before we take a look at the details of various machine learning methods, let's start by looking at what machine learning is, and what it isn't. Machine learning is often categorized as a subfield of artificial intelligence, but that categorization can often be misleading at first brush. The study of machine learning certainly arose from research in this context, but in the data-science application of machine learning methods, it is more helpful to think of machine learning as a means of building models of data. Fundamentally, machine learning involves building mathematical models to help understand data. "Learning" enters the fray when we give these models tunable parameters that can be adapted to observed data; in this way the program can be considered to be "learning" from the data. Once these models have been fit to previously seen data, they can be used to predict and understand aspects of newly observed data. We leave to the reader the more philosophical digression regarding the extent to which this type of mathematical, model-based "learning" is similar to the "learning" exhibited by the human brain. Understanding the problem setting in machine learning is essential to using these tools effectively, and so we will start with some broad categorizations of the types of approaches discussed here.
5.2.1 CATEGORIES OF MACHINE LEARNING
At the most fundamental level, machine learning can be categorized into two main types: supervised learning and unsupervised learning.
Supervised learning involves somehow modeling the relationship between measured features of data and some label associated with the data; once this model is determined, it can be used to apply labels to new, unknown data. This is further subdivided into classification tasks and regression tasks: in classification, the labels are discrete categories, while in regression, the labels are continuous quantities.
Unsupervised learning involves modeling the features of a dataset without reference to any label, and is often described as "letting the dataset speak for itself." These models include tasks such as clustering and dimensionality reduction. Clustering algorithms identify distinct groups of data, while dimensionality reduction algorithms search for more succinct representations of the data. A minimal illustration of both types is sketched below.

5.2.2 Need for Machine Learning
Human beings are, at this moment, the most intelligent and advanced species on earth because they can think, evaluate and solve complex problems. On the other side, AI is still in its initial stage and has not surpassed human intelligence in many aspects. The question, then, is why we need to make machines learn. The most suitable reason for doing this is "to make decisions, based on data, with efficiency and scale".
Lately, organizations have been investing heavily in newer technologies like Artificial Intelligence, Machine Learning and Deep Learning to extract key information from data in order to perform several real-world tasks and solve problems. We can call these data-driven decisions taken by machines, particularly to automate processes. These data-driven decisions can be used, instead of programming logic, in problems that cannot be programmed inherently. The fact is that we can't do without human intelligence, but the other aspect is that we all need to solve real-world problems with efficiency at a huge scale. That is why the need for machine learning arises.
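Returning to the two categories of Section 5.2.1, the following minimal sketch (with made-up toy data) contrasts a supervised regression task, where labels y are given, with an unsupervised clustering task, where only the features are given:

    from sklearn.cluster import KMeans
    from sklearn.linear_model import LinearRegression

    # Supervised (regression): features X paired with continuous labels y.
    X = [[1], [2], [3], [4]]
    y = [2.1, 3.9, 6.2, 7.8]
    reg = LinearRegression().fit(X, y)
    print(reg.predict([[5]]))          # predict a label for new, unseen data

    # Unsupervised (clustering): no labels, find structure in the data alone.
    pts = [[0, 0], [0, 1], [10, 10], [10, 11]]
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pts)
    print(km.labels_)                  # two distinct groups are identified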
5.2.3 Challenges in Machine Learning
While machine learning is rapidly evolving and making significant strides in cybersecurity and autonomous cars, this segment of AI as a whole still has a long way to go. The reason is that ML has not yet been able to overcome a number of challenges. The challenges that ML currently faces are:
• Quality and quantity of data − Having good-quality data for ML algorithms is one of the biggest challenges. Use of low-quality data leads to problems with data preprocessing and feature extraction and can produce biased or inaccurate models. Additionally, acquiring labeled data for supervised learning tasks can be expensive and time-consuming.
• Time-consuming tasks − Another challenge faced by ML models is the consumption of time, especially for data acquisition, feature extraction and retrieval.
• Lack of specialists − As ML technology is still in its infancy, the availability of expert resources is limited.
• No clear objective for formulating business problems − Having no clear objective and well-defined goal for business problems is another key challenge for ML, because this technology is not that mature yet.
• Overfitting and underfitting − Overfitting occurs when a model learns the training data too well, capturing noise along with the underlying patterns, which results in poor generalization to new data. Underfitting happens when a model is too simple to capture the underlying structure of the data. In either case, the model does not represent the problem well.
• Curse of dimensionality − Another challenge ML models face is too many features in the data points. This can be a real hindrance.
• Difficulty in deployment − The complexity of ML models makes them quite difficult to deploy in real life.
5.2.4 APPLICATIONS OF MACHINE LEARNING
Machine learning is the most rapidly growing technology, and according to researchers we are in the golden years of AI and ML. It is used to solve many real-world complex problems which cannot be solved with the traditional approach. Following are some real-world applications of ML:
• Emotion analysis
• Sentiment analysis
• Error detection and prevention
• Weather forecasting and prediction
• Stock market analysis and forecasting
• Speech synthesis
• Speech recognition
• Customer segmentation
• Object recognition
• Fraud detection
• Fraud prevention

5.2.5 How to Start Learning Machine Learning?
Arthur Samuel coined the term "Machine Learning" in 1959 and defined it as a "field of study that gives computers the capability to learn without being explicitly programmed". And that was the beginning of machine learning! In modern times, machine learning is one of the most popular (if not the most popular!) career choices. According to Indeed, Machine Learning Engineer was the best job of 2019, with 344% growth and an average base salary of $146,085 per year. But there is still a lot of doubt about what exactly machine learning is and how to start learning it. So this chapter deals with the basics of machine learning and also the path you can follow to eventually become a full-fledged machine learning engineer.
How to start learning ML? This is a rough roadmap you can follow on your way to becoming a talented machine learning engineer. Of course, you can always modify the steps according to your needs to reach your desired end goal.
Step 1 – Understand the Prerequisites
In case you are a genius, you could start ML directly, but normally there are some prerequisites that you need to know, which include Linear Algebra, Multivariate Calculus, Statistics, and Python. And if you don't know these, never fear! You don't need a Ph.D. in these topics to get started, but you do need a basic understanding.
(a) Learn Linear Algebra and Multivariate Calculus
Both linear algebra and multivariate calculus are important in machine learning. However, the extent to which you need them depends on your role as a data scientist. If you are more focused on application-heavy machine learning, then you will not be that heavily focused on maths, as there are many common libraries available. But if you want to focus on R&D in machine learning, then mastery of linear algebra and multivariate calculus is very important, as you will have to implement many ML algorithms from scratch.
(b) Learn Statistics
Data plays a huge role in machine learning. In fact, around 80% of your time as an ML expert will be spent collecting and cleaning data. And statistics is the field that handles the collection, analysis, and presentation of data, so it is no surprise that you need to learn it! Some of the key concepts in statistics that are important are statistical significance, probability distributions, hypothesis testing, and regression. Bayesian thinking is also a very important part of ML, dealing with concepts like conditional probability, priors and posteriors, and maximum likelihood.
(c) Learn Python
Some people prefer to skip linear algebra, multivariate calculus and statistics and learn them as they go along with trial and error. But the one thing that you absolutely cannot skip is Python! While there are other languages you can use for machine learning, like R, Scala, etc., Python is currently the most popular language for ML. In fact, there are many Python libraries that are specifically useful for artificial intelligence and machine learning, such as Keras, TensorFlow, and scikit-learn. So if you want to learn ML, it is best to learn Python, which you can do using various online resources and courses.
Step 2 – Learn Various ML Concepts
Now that you are done with the prerequisites, you can move on to actually learning ML (which is the fun part!). It is best to start with the basics and then move on to the more complicated stuff. Some of the basic concepts in ML are:
Terminologies of Machine Learning
• Model – A model is a specific representation learned from data by applying some machine learning algorithm. A model is also called a hypothesis.
• Feature – A feature is an individual measurable property of the data. A set of numeric features can be conveniently described by a feature vector. Feature vectors are fed as input to the model. For example, in order to predict a fruit, there may be features like color, smell, taste, etc.
• Target (Label) – A target variable or label is the value to be predicted by our model. For the fruit example discussed above, the label for each set of inputs would be the name of the fruit, like apple, orange, banana, etc.
• Training – The idea is to give the model a set of inputs (features) and their expected outputs (labels), so that after training we have a model (hypothesis) that will map new data to one of the categories it was trained on.
• Prediction – Once our model is ready, it can be fed a set of inputs, for which it will provide a predicted output (label). A toy example using these terms is sketched after the next list.
Types of Machine Learning
• Supervised Learning – This involves learning from a training dataset with labeled data using classification and regression models. This learning process continues until the required level of performance is achieved.
• Unsupervised Learning – This involves using unlabelled data and then finding the underlying structure in the data in order to learn more and more about the data itself, using factor and cluster analysis models.
• Semi-supervised Learning – This involves using unlabelled data, as in unsupervised learning, together with a small amount of labeled data. Using labeled data vastly increases the learning accuracy and is also more cost-effective than supervised learning.
• Reinforcement Learning – This involves learning optimal actions through trial and error, so the next action is decided by learning behaviors that are based on the current state and that will maximize the reward.
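As promised above, here is a toy sketch of the terminology, mirroring the fruit example: the feature encodings and samples are invented for illustration, and a decision tree stands in for "some machine learning algorithm":

    from sklearn.tree import DecisionTreeClassifier

    # Features: numerically encoded (color, smell, taste) per fruit sample;
    # the 0/1/2 encodings are illustrative, not a real dataset.
    features = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [0, 0, 1]]
    labels = ["apple", "orange", "banana", "apple"]   # target to be predicted

    # Training: fit a model (hypothesis) on (features, labels).
    model = DecisionTreeClassifier().fit(features, labels)

    # Prediction: the trained model maps a new feature vector to a label.
    print(model.predict([[0, 0, 0]]))   # -> ['apple']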
5.2.6 ADVANTAGES & DISADVANTAGES OF MACHINE LEARNING
ADVANTAGES OF MACHINE LEARNING:
1. Easily identifies trends and patterns
Machine learning can review large volumes of data and discover specific trends and patterns that would not be apparent to humans. For instance, for an e-commerce website like Amazon, it serves to understand the browsing behaviors and purchase histories of users in order to cater the right products, deals, and reminders to them, and it uses the results to reveal relevant advertisements.
2. No human intervention needed (automation)
With ML, you don't need to babysit your project every step of the way. Since ML means giving machines the ability to learn, it lets them make predictions and also improve the algorithms on their own. A common example of this is anti-virus software: it learns to filter new threats as they are recognized. ML is also good at recognizing spam.
3. Continuous improvement
As ML algorithms gain experience, they keep improving in accuracy and efficiency. This lets them make better decisions. Say you need to make a weather forecast model: as the amount of data you have keeps growing, your algorithms learn to make more accurate predictions faster.
4. Handling multi-dimensional and multi-variety data
Machine learning algorithms are good at handling data that is multi-dimensional and multi-variety, and they can do this in dynamic or uncertain environments.
5. Wide applications
You could be an e-tailer or a healthcare provider and make ML work for you. Where it does apply, it holds the capability to help deliver a much more personal experience to customers while also targeting the right customers.
DISADVANTAGES OF MACHINE LEARNING:
1. Data acquisition
Machine learning requires massive data sets to train on, and these should be inclusive/unbiased and of good quality. There can also be times when one must wait for new data to be generated.
2. Time and resources
ML needs enough time to let the algorithms learn and develop enough to fulfill their purpose with a considerable amount of accuracy and relevancy. It also needs massive resources to function, which can mean additional requirements of computing power for you.
3. Interpretation of results
Another major challenge is the ability to accurately interpret the results generated by the algorithms. You must also carefully choose the algorithms for your purpose.
4. High error-susceptibility
Machine learning is autonomous but highly susceptible to errors. Suppose you train an algorithm with data sets small enough not to be inclusive: you end up with biased predictions coming from a biased training set. This leads to irrelevant advertisements being displayed to customers. In the case of ML, such blunders can set off a chain of errors that can go undetected for long periods of time, and when they do get noticed, it takes quite some time to recognize the source of the issue, and even longer to correct it.
5.3 PYTHON DEVELOPMENT STEPS
Guido van Rossum published the first version of the Python code (version 0.9.0) at alt.sources in February 1991. This release already included exception handling, functions, and the core data types list, dict, str and others. It was also object-oriented and had a module system. Python version 1.0 was released in January 1994. The major new features included in this release were the functional programming tools lambda, map, filter and reduce, which Guido van Rossum never liked. Six and a half years later, in October 2000, Python 2.0 was introduced. This release included list comprehensions, a full garbage collector and Unicode support. Python flourished for another eight years in the 2.x versions before the next major release, Python 3.0 (also known as "Python 3000" and "Py3K"), appeared. Python 3 is not backwards compatible with Python 2.x. The emphasis in Python 3 was on the removal of duplicate programming constructs and modules, thus fulfilling or coming close to fulfilling the 13th law of the Zen of Python: "There should be one -- and preferably only one -- obvious way to do it."
Some changes in Python 3.0 (illustrated in the snippet below):
• print is now a function.
• Views and iterators replace lists in many APIs.
• The rules for ordering comparisons have been simplified; e.g., a heterogeneous list cannot be sorted, because all the elements of a list must be comparable to each other.
• There is only one integer type left, i.e. int; long is int as well.
• The division of two integers returns a float instead of an integer; "//" can be used to get the "old" behaviour.
• Text vs. data instead of Unicode vs. 8-bit.
PYTHON
Python is an interpreted high-level programming language for general-purpose programming. Created by Guido van Rossum and first released in 1991, Python has a design philosophy that emphasizes code readability, notably using significant whitespace. Python features a dynamic type system and automatic memory management. It supports multiple programming paradigms, including object-oriented, imperative, functional and procedural, and has a large and comprehensive standard library.
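Returning to the Python 3 changes listed above, this short snippet shows the ones that trip up Python 2 programmers most often:

    # print is now a function (in Python 2 it was a statement: print "Hello").
    print("Hello")
    # Division of two integers returns a float; // restores floor division.
    print(7 / 2)     # 3.5
    print(7 // 2)    # 3
    # Only one integer type is left: int handles arbitrarily large values.
    print(2 ** 100)
    # Ordering comparisons are stricter: sorting [1, "a"] raises a TypeError.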
• Python is Interpreted − Python is processed at runtime by the interpreter. You do not need to compile your program before executing it. This is similar to PERL and PHP.
• Python is Interactive − you can actually sit at a Python prompt and interact with the interpreter directly to write your programs.
• Python also acknowledges that speed of development is important. Readable and terse code is part of this, and so is access to powerful constructs that avoid tedious repetition of code. Maintainability also ties into this: it says something about how much code you have to scan, read and/or understand to troubleshoot problems or tweak behaviors. This speed of development, the ease with which a programmer of other languages can pick up basic Python skills, and the huge standard library are key to another area where Python excels: all its tools have been quick to implement, have saved a lot of time, and several of them have later been patched and updated by people with no Python background, without breaking.

5.4 MODULES USED IN PYTHON
TENSORFLOW
TensorFlow is a free and open-source software library for dataflow and differentiable programming across a range of tasks. It is a symbolic math library, and is also used for machine learning applications such as neural networks. It is used for both research and production at Google. TensorFlow was developed by the Google Brain team for internal Google use. It was released under the Apache 2.0 open-source license on November 9, 2015.
NUMPY
NumPy is a general-purpose array-processing package. It provides a high-performance multidimensional array object and tools for working with these arrays. It is the fundamental package for scientific computing with Python. It contains various features, including these important ones:
• A powerful N-dimensional array object
• Sophisticated (broadcasting) functions
• Tools for integrating C/C++ and Fortran code
• Useful linear algebra, Fourier transform, and random number capabilities
Besides its obvious scientific uses, NumPy can also be used as an efficient multidimensional
container of generic data. Arbitrary data types can be defined using NumPy, which allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
PANDAS
Pandas is an open-source Python library providing high-performance data manipulation and analysis tools built on its powerful data structures. Python was previously used mainly for data munging and preparation and had very little to offer for data analysis; Pandas solved this problem. Using Pandas, we can accomplish five typical steps in the processing and analysis of data, regardless of the origin of the data: load, prepare, manipulate, model, and analyze. Python with Pandas is used in a wide range of fields, including academic and commercial domains such as finance, economics, statistics, and analytics.
MATPLOTLIB
Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter Notebook, web application servers, and four graphical user interface toolkits. Matplotlib tries to make easy things easy and hard things possible. You can generate plots, histograms, power spectra, bar charts, error charts, scatter plots, etc., with just a few lines of code. For simple plotting, the pyplot module provides a MATLAB-like interface, particularly when combined with IPython. For the power user, you have full control of line styles, font properties, axes properties, etc., via an object-oriented interface or via a set of functions familiar to MATLAB users. A combined example of NumPy, Pandas and Matplotlib is shown below.
SCIKIT-LEARN
Scikit-learn provides a range of supervised and unsupervised learning algorithms via a consistent interface in Python. It is licensed under a permissive simplified BSD license and is distributed under many Linux distributions, encouraging academic and commercial use.
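Before looking at scikit-learn in more detail, here is the combined sketch promised above, showing NumPy, Pandas and Matplotlib working together (the speed values are purely illustrative):

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    # NumPy: an N-dimensional array with vectorized operations.
    speed_kmh = np.array([42.0, 55.5, 61.0, 38.5])

    # Pandas: a DataFrame for loading, preparing and manipulating data.
    df = pd.DataFrame({"speed_kmh": speed_kmh})
    df["speed_ms"] = df["speed_kmh"] / 3.6
    print(df.describe())

    # Matplotlib: a publication-quality plot in a few lines of code.
    df["speed_kmh"].plot(kind="bar", title="Sample vehicle speeds")
    plt.xlabel("sample")
    plt.ylabel("km/h")
    plt.show()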
Scikit-learn is a popular open-source machine learning library for Python. It provides a wide range of tools for building and deploying machine learning models efficiently. Here are some key aspects of scikit-learn:
• Simple and consistent API: scikit-learn offers a simple and consistent API, making it easy to use and understand for both beginners and experienced users. The API design follows a common pattern across different algorithms, which flattens the learning curve.
• Comprehensive algorithms: scikit-learn includes a vast collection of machine learning algorithms for various tasks such as classification, regression, clustering, dimensionality reduction, and more. These algorithms are implemented efficiently and are well documented, allowing users to experiment with different techniques easily.
• Model evaluation and selection: the library provides tools for evaluating and selecting models, including metrics for assessing model performance, techniques for cross-validation, and hyperparameter tuning methods like grid search and randomized search.
• Preprocessing and feature engineering: scikit-learn offers utilities for preprocessing data and engineering features, such as scaling, normalization, encoding categorical variables, handling missing values, and feature selection.
• Integration with NumPy and SciPy: scikit-learn seamlessly integrates with other popular Python libraries like NumPy and SciPy, leveraging their computational capabilities for efficient data processing and numerical operations.
Several of these aspects are visible in the short sketch below.
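A minimal sketch combining the aspects above: a preprocessing-plus-model pipeline behind one fit/predict interface, tuned by cross-validated grid search. Synthetic data stands in for a real dataset, and Ridge regression is just one of many interchangeable estimators:

    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import GridSearchCV, train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Synthetic regression data (200 samples, 5 features).
    X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Preprocessing + model behind a single consistent interface.
    pipe = make_pipeline(StandardScaler(), Ridge())

    # Hyperparameter tuning via grid search with 5-fold cross-validation.
    search = GridSearchCV(pipe, {"ridge__alpha": [0.1, 1.0, 10.0]}, cv=5)
    search.fit(X_train, y_train)
    print("best params:", search.best_params_)
    print("test R^2:", search.score(X_test, y_test))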
5.5 INSTALL PYTHON STEP-BY-STEP IN WINDOWS AND MAC
Python, a versatile programming language, does not come pre-installed on your computer. Python was first released in the year 1991, and to this day it remains a very popular high-level programming language. Its style philosophy emphasizes code readability with its notable use of significant whitespace. The object-oriented approach and language constructs provided by Python enable programmers to write both clear and logical code for projects. This software does not come pre-packaged with Windows.
HOW TO INSTALL PYTHON ON WINDOWS AND MAC
There have been several updates to Python over the years. The question is: how do you install Python? It might be confusing for a beginner who is willing to start learning Python, but this tutorial will solve your query. The version used here is Python 3.7.4, in other words, Python 3.
Note: Python version 3.7.4 cannot be used on Windows XP or earlier devices.
Before you start with the installation process, you first need to know your system requirements. Based on your system type, i.e. operating system and processor, you must download the matching Python version. The system used here is a Windows 64-bit operating system, so the steps below install Python version 3.7.4 (i.e. Python 3) on a Windows device.
The steps on how to install Python on Windows 10, 8 and 7 are divided into four parts to help you understand better.
Download the correct version for the system
Step 1: Go to the official site to download and install Python, using Google Chrome or any other web browser, or click on the following link: https://www.python.org
Now, check for the latest and the correct version for your operating system.
Step 2: Click on the Download tab.
Step 3: You can either select the 'Download Python 3.7.4' button for Windows, shown in yellow, or scroll further down and click on the download for your required version. Here, we are downloading the most recent Python version for Windows, 3.7.4.
Step 4: Scroll down the page until you find the Files option.
Step 5: Here you see the different builds of Python for each operating system.
• To download 32-bit Python for Windows, you can select any one of the three options: Windows x86 embeddable zip file, Windows x86 executable installer, or Windows x86 web-based installer.
• To download 64-bit Python for Windows, you can select any one of the three options: Windows x86-64 embeddable zip file, Windows x86-64 executable installer, or Windows x86-64 web-based installer.
Here we will use the Windows x86-64 web-based installer. With this, the first part, regarding which version of Python is to be downloaded, is completed. Now we move ahead with the second part of installing Python, i.e. the installation itself.
Note: To know the changes or updates that are made in the version, you can click on the Release Note option.
Installation of Python
Step 1: Go to Downloads and open the downloaded Python version to carry out the installation process.
Step 2: Before you click on Install Now, make sure to put a tick on 'Add Python 3.7 to PATH'.
Step 3: Click on Install Now. After the installation is successful, click on Close. With these three steps, you have successfully and correctly installed Python. Now it is time to verify the installation.
Note: The installation process might take a couple of minutes.
Verify the Python Installation
Step 1: Click on Start.
Step 2: In the Windows Run command, type cmd.
Step 3: Open the Command Prompt option.
Step 4: Let us test whether Python is correctly installed. Type python -V and press Enter.
Step 5: You will get the answer as 3.7.4.
Note: If you have any of the earlier versions of Python already installed, you must first uninstall the earlier version and then install the new one.
Check how the Python IDLE works
Step 1: Click on Start.
Step 2: In the Windows Run command, type "python idle".
Step 3: Click on IDLE (Python 3.7 64-bit) and launch the program.
Step 4: To go ahead with working in IDLE, you must first save the file. Click on File > Click on Save.
Step 5: Name the file, with the 'save as type' set to Python files, and click on SAVE. Here we have named the file Hey World.
Step 6: Now, for example, enter a print statement and run the module, as sketched below.
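The original text is cut off at Step 6, so the exact statement is a guess; the string printed here is an assumption based on the file name from Step 5:

    # Contents of the file saved as "Hey World" in Step 5 (assumed).
    print("Hey World")
    # Running the module (Run > Run Module, or F5) prints: Hey World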
6. IMPLEMENTATIONS

6.1 SOFTWARE ENVIRONMENT

6.1.1 PYTHON
Python is a general-purpose interpreted, interactive, object-oriented, high-level programming language. An interpreted language, Python has a design philosophy that emphasizes code readability (notably using whitespace indentation to delimit code blocks rather than curly brackets or keywords), and a syntax that allows programmers to express concepts in fewer lines of code than might be used in languages such as C++ or Java. It provides constructs that enable clear programming on both small and large scales. Python interpreters are available for many operating systems. CPython, the reference implementation of Python, is open-source software and has a community-based development model, as do nearly all of its variant implementations. CPython is managed by the non-profit Python Software Foundation. Python features a dynamic type system and automatic memory management. It supports multiple programming paradigms, including object-oriented, imperative, functional and procedural, and has a large and comprehensive standard library.
WHAT CAN PYTHON DO?
• Python can be used on a server to create web applications.
• Python can be used alongside software to create workflows.
• Python can connect to database systems. It can also read and modify files.
• Python can be used to handle big data and perform complex mathematics.
• Python can be used for rapid prototyping, or for production-ready software development.
Introduction to Python
Python is a widely used general-purpose, high-level programming language. It was created by Guido van Rossum in 1991 and further developed by the Python Software Foundation. It was designed with an emphasis on code readability, and its syntax allows programmers to express their concepts in fewer lines of code. Python is a programming language that lets you work quickly and integrate systems more efficiently. There are two major Python versions, Python 2 and Python 3, and they are quite different.
REASON FOR INCREASING POPULARITY
Emphasis on code readability, shorter code, and ease of writing: programmers can express logical concepts in fewer lines of code in comparison to languages such as C++ or Java.
Python supports multiple programming paradigms, like object-oriented, imperative, functional and procedural programming. There exist built-in functions for almost all of the frequently used operations. Its philosophy is "Simplicity is the best".
LANGUAGE FEATURES
Interpreted: There are no separate compilation and execution steps as in C and C++; the program is run directly from the source code. Internally, Python converts the source code into an intermediate form called bytecode, which is then translated into the native language of the specific machine to run it. There is no need to worry about linking and loading with libraries, etc.
Platform Independent: Python programs can be developed and executed on multiple operating system platforms. Python can be used on Linux, Windows, Macintosh, Solaris and many more.
Free and Open Source; Redistributable.
High-level Language: In Python, there is no need to take care of low-level details such as managing the memory used by the program.
Simple: Closer to the English language and easy to learn, with more emphasis on the solution to the problem than on the syntax.
Embeddable: Python can be used within a C/C++ program to give scripting capabilities to the program's users.
Robust: Exception-handling features and built-in memory management techniques.
Rich Library Support: The Python Standard Library is very vast; this is known as the "batteries included" philosophy of Python. It can help do various things involving regular expressions, documentation generation, unit testing, threading, databases, web browsers, CGI, email, XML, HTML, WAV files, cryptography, GUIs and many more. Besides the standard library, there are various other high-quality libraries, such as the Python Imaging Library, which is an amazingly simple image-manipulation library.
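As a small generic illustration of these points (an example written for this report, not project code), the following complete program uses indentation alone to delimit its block and leans on built-in functions:

# Indentation delimits the loop body -- no braces or 'end' keywords.
speeds = [42.0, 55.5, 61.2, 38.9]

total = 0.0
for s in speeds:
    total += s                                 # indented line belongs to the loop

print("Average speed:", total / len(speeds))   # built-ins: print, len
print("Maximum speed:", max(speeds))           # built-in: max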
DJANGO
Django is a high-level Python web framework that encourages rapid development and clean, pragmatic design. Built by experienced developers, it takes care of much of the hassle of web development, so you can focus on writing your app without needing to reinvent the wheel. It is free and open source. Django's primary goal is to ease the creation of complex, database-driven websites. Django emphasizes reusability and "pluggability" of components, rapid development, and the principle of "don't repeat yourself" (DRY). Python is used throughout, even for settings files and data models. Django is also called a "batteries included" framework because it provides built-in features for everything, including the Django admin interface and a default SQLite3 database. Developed by Adrian Holovaty and Simon Willison, Django was first released in 2005 and has since become one of the most popular web frameworks for building dynamic web applications. Here are some key aspects of Django:
1. MVT Architecture: Django follows the Model-View-Controller (MVC) architectural pattern, although it refers to its variant as "Model-View-Template" (MVT). Models represent the data structure, views handle the presentation logic, and templates define the user interface. This separation of concerns facilitates modular and maintainable code.
2. ORM (Object-Relational Mapping): Django includes a built-in ORM that abstracts away the complexities of database interaction. Developers define database models as Python classes, and Django handles the translation of these models into database tables and SQL queries. This simplifies database access and reduces boilerplate code.
3. Admin Interface: Django provides a powerful administrative interface automatically generated from the database models. This admin interface allows administrators to manage site content, users, and other aspects of the application without writing custom code. It is highly customizable and extensible.
4. URL Routing: Django's URL dispatcher allows developers to map URLs to view functions or class-based views. This enables clean and flexible URL routing, making it easy to design RESTful APIs and user-friendly URLs.
5. Template Engine: Django comes with a template engine that lets developers define the presentation layer using HTML templates with embedded template logic. Templates support inheritance and template tags, making it easy to create dynamic and reusable HTML content.
6. Security Features: Django includes built-in security features to help developers build secure web applications, with protection against common security threats such as SQL injection, cross-site scripting (XSS), cross-site request forgery (CSRF), and clickjacking.
7. Authentication and Authorization: Django provides robust authentication and authorization mechanisms out of the box. Developers can easily integrate user authentication, permission management, and user sessions into their applications.
8. Internationalization and Localization: Django supports internationalization (i18n) and localization (l10n), allowing developers to build applications that support multiple languages and locales.
9. Community and Ecosystem: Django has a large and active community of developers who contribute to its development, provide support, and create reusable packages and extensions (e.g., reusable apps, middleware, and utilities) that further extend Django's functionality.
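A brief hedged sketch of the model, view, and URL-routing ideas above follows. It is illustrative only; the names Vehicle, vehicle_list, and the URL pattern are invented for this example and are not part of this project, which uses a Tkinter GUI rather than Django:

# models.py -- the ORM maps this class to a database table
from django.db import models

class Vehicle(models.Model):
    name = models.CharField(max_length=100)
    fuel_consumption = models.FloatField()

# views.py -- a view queries the model and renders a template
from django.shortcuts import render
from .models import Vehicle

def vehicle_list(request):
    vehicles = Vehicle.objects.all()   # ORM query: no hand-written SQL
    return render(request, "vehicles.html", {"vehicles": vehicles})

# urls.py -- the URL dispatcher maps a URL pattern to the view
from django.urls import path
from .views import vehicle_list

urlpatterns = [path("vehicles/", vehicle_list)]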
ADVANTAGES OF DJANGO
Rapid Development: Django follows the "Don't Repeat Yourself" (DRY) principle and provides many built-in functionalities and conventions, allowing developers to build web applications quickly and efficiently. Its batteries-included approach means that developers do not have to reinvent the wheel for common web development tasks.
Built-in Admin Interface: Django comes with a powerful admin interface that is automatically generated from the application's models. This admin interface allows developers to manage site content, users, and permissions without writing any additional code. It is highly customizable and provides a convenient way for administrators to interact with the application.
ORM (Object-Relational Mapping): Django's ORM abstracts away the complexities of database interaction by allowing developers to define database models as Python classes. Django takes care of translating these models into database tables and managing database queries, making database operations easy and intuitive.
Security Features: Django takes security seriously and provides built-in protection against common security threats such as SQL injection, cross-site scripting (XSS), cross-site request forgery (CSRF), and clickjacking. It also provides tools for user authentication, authorization, and session management, making it easier for developers to build secure web applications.
Scalability: Django is designed to scale with the needs of the application. It provides tools and best practices for optimizing database queries, caching, and handling high traffic loads. Many large-scale websites and web applications, including Instagram and Pinterest, have been built using Django.
Versatility: Django is a versatile framework that can be used to build a wide range of web applications, from simple websites to complex, high-traffic platforms, including content management systems (CMS), e-commerce platforms, and social networking sites.
Community and Ecosystem: Django has a large and active community of developers who contribute to its development, provide support, and create reusable packages and extensions that extend Django's functionality. The Django ecosystem includes a vast collection of third-party libraries, reusable apps, and resources that make it easier for developers to build web applications.
DISADVANTAGES OF DJANGO
Here are some of the common drawbacks associated with Django:
Learning Curve: Django has a relatively steep learning curve, especially for beginners with limited experience in web development or Python programming. Its extensive feature set and conventions may take some time to master, leading to a longer ramp-up period for new developers.
Opinionated Framework: Django is an opinionated framework, meaning it imposes certain design patterns and conventions on developers. While these conventions can promote consistency and best practices, they may also limit flexibility and creative freedom, especially for developers who prefer more customizable or lightweight solutions.
Monolithic Architecture: Django follows a monolithic architecture, which means that all components of the application (e.g., models, views, templates) are tightly coupled within a single framework. While this can simplify development in many cases, it may not be suitable for projects that require microservices or a more modular architecture.
Performance Overhead: Django's extensive feature set and abstractions may introduce some performance overhead compared to more lightweight frameworks. In some cases, developers may need to optimize Django applications to achieve better performance, especially for high-traffic or resource-intensive applications.
Limited Real-Time Capabilities: Django is not well suited for real-time applications or applications requiring bidirectional communication between the client and server (e.g., chat applications, real-time collaborative editing). While it is possible to integrate real-time features using third-party libraries or services, it is not a native strength of the framework.
Dependency on Django ORM: While Django's ORM simplifies database interactions, it may not be suitable for all projects or database requirements. In some cases, developers may prefer to use a different ORM or interact with the database directly, which can be challenging within the Django ecosystem.
Upgrades and Maintenance: Upgrading Django versions or maintaining older projects can sometimes be challenging, especially if the project relies heavily on deprecated features or third-party libraries with
limited support. Keeping up with Django's release cycle and ensuring compatibility with newer versions may require additional effort.
Scalability Challenges: While Django is designed to scale with the needs of the application, scaling Django applications to handle extremely high traffic or complex architectures may require careful planning and optimization. Developers may need to implement caching, database sharding, or other scalability techniques to address performance bottlenecks.
6.1.2 SAMPLE CODE
Main.py

from tkinter import messagebox
from tkinter import *
from tkinter import simpledialog
import tkinter
from tkinter import filedialog
import matplotlib.pyplot as plt
import numpy as np
from tkinter.filedialog import askopenfilename
import pandas as pd
from sklearn import *
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.callbacks import EarlyStopping
from sklearn.preprocessing import OneHotEncoder
from keras.optimizers import Adam
import os

# designing main screen
main = tkinter.Tk()
main.title("Average Fuel Consumption")
main.geometry("1300x1200")

global filename
global train_x, test_x, train_y, test_y
global balance_data
global model
global ann_acc
global testdata
global predictdata

def importdata():
    # read the uploaded CSV dataset and take absolute values
    global balance_data
    balance_data = pd.read_csv(filename)
    balance_data = balance_data.abs()
    return balance_data

def splitdataset(balance_data):
    global train_x, test_x, train_y, test_y
    X = balance_data.values[:, 0:7]    # first 7 columns are the predictor features
    y_ = balance_data.values[:, 7]     # last column is the fuel-consumption class
    print(y_)
    y_ = y_.reshape(-1, 1)
    encoder = OneHotEncoder(sparse=False)
    Y = encoder.fit_transform(y_)      # one-hot encode the class labels
    print(Y)
    train_x, test_x, train_y, test_y = train_test_split(X, Y, test_size=0.2)  # 80/20 split
    text.insert(END, "Dataset Length : " + str(len(X)) + "\n")
    return train_x, test_x, train_y, test_y

def upload():
    # function to upload the fuel dataset file
    global filename
    filename = filedialog.askopenfilename(initialdir="dataset")
    text.delete('1.0', END)
    text.insert(END, filename + " loaded\n\n")

def generateModel():
    global train_x, test_x, train_y, test_y
    data = importdata()
    train_x, test_x, train_y, test_y = splitdataset(data)
    text.insert(END, "Splitted Training Length : " + str(len(train_x)) + "\n")
    text.insert(END, "Splitted Test Length : " + str(len(test_x)) + "\n")

def ann():
    # build and train the ANN: two hidden layers of 200 ReLU units,
    # softmax output over the 19 fuel-consumption classes
    global model
    global ann_acc
    model = Sequential()
    model.add(Dense(200, input_shape=(7,), activation='relu', name='fc1'))
    model.add(Dense(200, activation='relu', name='fc2'))
    model.add(Dense(19, activation='softmax', name='output'))
    optimizer = Adam(lr=0.001)
    model.compile(optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    print('ANN Model Summary:')
    print(model.summary())
    model.fit(train_x, train_y, verbose=2, batch_size=5, epochs=200)
    results = model.evaluate(test_x, test_y)
    text.insert(END, "ANN Accuracy for dataset " + filename + "\n")
    text.insert(END, "Accuracy Score : " + str(results[1] * 100) + "\n\n")
    ann_acc = results[1] * 100

def predictFuel():
    # upload test records (7 predictors, no class) and predict their class
    global testdata
    global predictdata
    text.delete('1.0', END)
    filename = filedialog.askopenfilename(initialdir="dataset")
    testdata = pd.read_csv(filename)
    testdata = testdata.values[:, 0:7]
    predictdata = model.predict_classes(testdata)
    print(predictdata)
    for i in range(len(testdata)):
        text.insert(END, str(testdata[i]) + " Average Fuel Consumption : " + str(predictdata[i]) + "\n")

def graph():
    # plot predicted fuel consumption for each test record
    x = []
    y = []
    for i in range(len(testdata)):
        x.append(i)
        y.append(predictdata[i])
    plt.plot(x, y)
    plt.xlabel('Vehicle ID')
    plt.ylabel('Fuel Consumption/10KM')
    plt.title('Average Fuel Consumption Graph')
    plt.show()

# GUI layout: title, output text area, and action buttons
font = ('times', 16, 'bold')
title = Label(main, text='A Machine Learning Model for Average Fuel Consumption in Heavy Vehicles')
title.config(bg='greenyellow', fg='dodger blue')
title.config(font=font)
title.config(height=3, width=120)
title.place(x=0, y=5)

font1 = ('times', 12, 'bold')
text = Text(main, height=20, width=150)
scroll = Scrollbar(text)
text.configure(yscrollcommand=scroll.set)
text.place(x=50, y=120)
text.config(font=font1)

font1 = ('times', 14, 'bold')
uploadButton = Button(main, text="Upload Heavy Vehicles Fuel Dataset", command=upload)
uploadButton.place(x=50, y=550)
uploadButton.config(font=font1)

modelButton = Button(main, text="Read Dataset & Generate Model", command=generateModel)
modelButton.place(x=420, y=550)
modelButton.config(font=font1)

annButton = Button(main, text="Run ANN Algorithm", command=ann)
annButton.place(x=760, y=550)
annButton.config(font=font1)

predictButton = Button(main, text="Predict Average Fuel Consumption", command=predictFuel)
predictButton.place(x=50, y=600)
predictButton.config(font=font1)

graphButton = Button(main, text="Fuel Consumption Graph", command=graph)
graphButton.place(x=420, y=600)
graphButton.config(font=font1)

exitButton = Button(main, text="Exit", command=exit)
exitButton.place(x=760, y=600)
exitButton.config(font=font1)

main.config(bg='LightSkyBlue')
main.mainloop()
7. SYSTEM TESTING
7.1 INTRODUCTION TO TESTING
The purpose of testing is to discover errors. Testing is the process of trying to discover every conceivable fault or weakness in a work product. It provides a way to check the functionality of components, sub-assemblies, assemblies and/or a finished product. It is the process of exercising software with the intent of ensuring that the software system meets its requirements and user expectations and does not fail in an unacceptable manner. There are various types of tests, and each test type addresses a specific testing requirement.
TYPES OF TESTS
Unit testing
Unit testing involves the design of test cases that validate that the internal program logic is functioning properly and that program inputs produce valid outputs. All decision branches and internal code flow should be validated. It is the testing of individual software units of the application and is done after the completion of an individual unit, before integration. This is structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests perform basic tests at the component level and test a specific business process, application, and/or system configuration. Unit tests ensure that each unique path of a business process performs accurately to the documented specifications and contains clearly defined inputs and expected results.
Integration testing
Integration tests are designed to test integrated software components to determine whether they actually run as one program. Testing is event driven and is more concerned with the basic outcome of screens or fields. Integration tests demonstrate that although the components were individually satisfactory, as shown by successful unit testing, the combination of components is correct and consistent. Integration testing is specifically aimed at exposing the problems that arise from the combination of components.
Functional test
Functional tests provide systematic demonstrations that functions tested are available as specified by the business and technical requirements, system documentation, and user manuals. Functional testing is centered on the following items:
Valid Input: identified classes of valid input must be accepted.
Invalid Input: identified classes of invalid input must be rejected.
Functions: identified functions must be exercised.
Output: identified classes of application outputs must be exercised.
Systems/Procedures: interfacing systems or procedures must be invoked.
Organization and preparation of functional tests is focused on requirements, key functions, or special test cases. In addition, systematic coverage pertaining to identifying business process flows, data fields, predefined processes, and successive processes must be considered for testing. Before functional testing is complete, additional tests are identified and the effective value of current tests is determined.
System Test
System testing ensures that the entire integrated software system meets requirements. It tests a configuration to ensure known and predictable results. An example of system testing is the configuration-oriented system integration test. System testing is based on process descriptions and flows, emphasizing pre-driven process links and integration points.
White Box Testing
White box testing is testing in which the software tester has knowledge of the inner workings, structure and language of the software, or at least its purpose. It is used to test areas that cannot be reached from a black box level.
Black Box Testing
Black box testing is testing the software without any knowledge of the inner workings, structure or language of the module being tested. Black box tests, like most other kinds of tests, must be written from a definitive source document, such as a specification or requirements document. It is testing in which the software under test is treated as a black box: you cannot "see" into it. The test provides inputs and responds to outputs without considering how the software works.
Unit Testing
Unit testing is usually conducted as part of a combined code and unit test phase of the software lifecycle, although it is not uncommon for coding and unit testing to be conducted as two distinct phases.
7.2 TESTING STRATEGIES
Field testing will be performed manually and functional tests will be written in detail.
Test objectives
• All field entries must work properly.
• Pages must be activated from the identified link.
• The entry screen, messages and responses must not be delayed.
Features to be tested
• Verify that the entries are of the correct format.
• No duplicate entries should be allowed.
• All links should take the user to the correct page.
Integration Testing
Software integration testing is the incremental integration testing of two or more integrated software components on a single platform to produce failures caused by interface defects. The task of the integration test is to check that components or software applications (e.g., components in a software system or, one step up, software applications at the company level) interact without error.
Test Results: All the test cases mentioned above passed successfully. No defects were encountered.
Acceptance Testing
User acceptance testing is a critical phase of any project and requires significant participation by the end user. It also ensures that the system meets the functional requirements.
Test Results: All the test cases mentioned above passed successfully. No defects were encountered.
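As an illustrative sketch of unit testing applied to this project (hypothetical test written for this report; it assumes a dataset of seven predictor columns plus one class column, mirroring the splitdataset() logic in the sample code):

import unittest
import numpy as np
from sklearn.model_selection import train_test_split

class TestDatasetSplit(unittest.TestCase):
    def test_80_20_split(self):
        # Fabricated stand-in for the fuel dataset: 100 rows,
        # 7 predictors + 1 class column, as used by splitdataset().
        data = np.random.rand(100, 8)
        X, y = data[:, 0:7], data[:, 7]
        train_x, test_x, train_y, test_y = train_test_split(X, y, test_size=0.2)
        self.assertEqual(len(train_x), 80)      # 80% for training
        self.assertEqual(len(test_x), 20)       # 20% for testing
        self.assertEqual(train_x.shape[1], 7)   # all 7 predictors kept

if __name__ == '__main__':
    unittest.main()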
8. SCREENSHOTS
A MACHINE LEARNING MODEL FOR AVERAGE FUEL CONSUMPTION IN HEAVY VEHICLES
In this paper, the author describes a concept to predict average fuel consumption in heavy vehicles using a machine learning algorithm, namely an ANN (Artificial Neural Network). To predict fuel consumption, the author extracted 7 predictor features from the heavy vehicle dataset:
num_stops, time_stopped, average_moving_speed, characteristic_acceleration, aerodynamic_speed_squared, change_in_kinetic_energy, change_in_potential_energy, class
These seven features are recorded over each vehicle travel of up to 100 kilometres, for example the number of times the vehicle stopped and the total stopped time. All these values are collected from heavy vehicles and used as the dataset to train the ANN model. Below are some values for the seven predictor features:
num_stops, time_stopped, average_moving_speed, characteristic_acceleration, aerodynamic_speed_squared, change_in_kinetic_energy, change_in_potential_energy, class
7.0, 7.0, 93.0, 34, 8.4, 4, 25.6, 9
7.0, 7.0, 91.0, 34, 8.3, 4, 25.7, 9
8.9, 8.9, 151.0, 26, 10.9, 6, 15.1, 12
9.3, 9.3, 160.0, 25, 11.3, 6, 13.7, 13
8.4, 8.4, 158.0, 25, 11.2, 6, 13.8, 13
The names in the header row are the dataset column names and the numeric values are the dataset values for each vehicle. The last column is treated as the class label, which represents the fuel consumption for that vehicle. The entire dataset is used to train the ANN model, and whenever a new record is given, the trained model is applied to that test record to predict its average fuel consumption. Below are some test records:
num_stops, time_stopped, average_moving_speed, characteristic_acceleration, aerodynamic_speed_squared, change_in_kinetic_energy, change_in_potential_energy
7.0, 7.0, 93.0, 34, 8.4, 4, 25.6
7.0, 7.0, 91.0, 34, 8.3, 4, 25.7
8.9, 8.9, 151.0, 26, 10.9, 6, 15.1
9.3, 9.3, 160.0, 25, 11.3, 6, 13.7
8.4, 8.4, 158.0, 25, 11.2, 6, 13.8
In the test data above, the class value (fuel consumption) is absent; when a test record is applied to the ANN model, the ANN predicts the fuel-consumption class value for that record. The entire train and test data are available inside the 'dataset' folder. The ANN model is developed using a duty-cycle dataset collected from a single truck, with an approximate mass of 8,700 kg, exposed to a variety of transients including both urban and highway traffic in the Indianapolis area. Data was collected using the SAE J1939 standard for serial control and communications in heavy-duty vehicle networks.
ANN WORKING PROCEDURE
To demonstrate how to build an ANN-based image classifier, we shall build a six-layer neural network that identifies and separates one image from another. The network we build is a very small one that can run on a CPU as well. Traditional neural networks that are very good at image classification have many more parameters and take a lot of time if trained on a normal CPU. However, our objective is to show how to build a real-world neural network using TensorFlow.
Neural networks are essentially mathematical models for solving an optimization problem. They are made of neurons, the basic computation units of neural networks. A neuron takes an input (say x) and does some computation on it (say, multiplies it by a weight w and adds a bias b) to produce a value z = wx + b. This value is passed to a non-linear function called an activation function (f) to produce the final output (activation) of the neuron. There are many kinds of activation functions; one popular activation function is the sigmoid. A neuron that uses the sigmoid function as its activation function is called a sigmoid neuron. Depending on the activation function, neurons are named, and there are many kinds, such as ReLU and tanh. If you stack neurons in a single line, it is called a layer, which is the next building block of neural networks. (The layer diagram referenced here is not reproduced.)
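As a worked illustration of the single neuron described above (a generic sketch written for this report, not code from the project):

import numpy as np

def sigmoid(z):
    # Sigmoid activation: squashes any real z into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# One sigmoid neuron: z = w*x + b, activation = f(z)
x = 0.5          # input
w = 2.0          # weight
b = -0.3         # bias
z = w * x + b    # linear combination: z = 0.7
a = sigmoid(z)   # activation is approximately 0.668

print("z =", z, "activation =", a)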
To predict the image class, multiple layers operate on one another to get the best-matching layer, and this process continues until no further improvement is left.
Modules Information
This project consists of the following modules:
Upload Heavy Vehicles Fuel Dataset: Using this module we can upload the training dataset to the application. The dataset contains comma-separated values.
Read Dataset & Generate Model: Using this module we parse the comma-separated dataset and then generate the train and test sets for the ANN from the dataset values. The dataset is divided in an 80%/20% split: 80% is used to train the ANN model and 20% is used to test it.
Run ANN Algorithm: Using this module we create the ANN object and then feed the train and test data to build the ANN model.
Predict Average Fuel Consumption: Using this module we upload new test data, and the ANN applies the trained model to that test data to predict the average fuel consumption for each test record.
Fuel Consumption Graph: Using this module we plot the fuel consumption graph for each test record.
Screenshots
To run this project, double-click on the 'run.bat' file to get the screen below.
SCREENSHOT-1
In the above screen, click on the 'Upload Heavy Vehicles Fuel Dataset' button to upload the training dataset.
FIG-1
SCREENSHOT-2
In the above screen, 'Fuel_Dataset.txt' is uploaded, which is used to train the model. After uploading the dataset, the screen below appears.
SCREENSHOT-3
FIG-2
Now, in the above screen, click on the 'Read Dataset & Generate Model' button to read the uploaded dataset and to generate the train and test data.
SCREENSHOT-4
FIG-3
In the above screen we can see the total number of records in the dataset, the number of records used for training and the number of records used for testing. Now click on the 'Run ANN Algorithm' button to feed the train and test data to the ANN and build the ANN model.
SCREENSHOT-6
In the above black console we can see all the ANN processing details. After the model is built, the screen below appears.
SCREENSHOT-7
In the above screen the ANN prediction accuracy is up to 86%. Now click on the 'Predict Average Fuel Consumption' button to upload the test data and predict the consumption for it.
FIG-4
SCREENSHOT-8
After uploading the test data, the fuel consumption prediction results appear in the screen below.
SCREENSHOT-9
In the above screen we get the average fuel consumption for each test record per 100 kilometres. Now click on 'Fuel Consumption Graph' to view the graph below.
SCREENSHOT-10
FIG-5
In the above graph, the x-axis represents the test record number (vehicle ID) and the y-axis represents the fuel consumption for that record.
Conclusion: Using this paper and the ANN algorithm, we predict fuel consumption for the test data.
9. CONCLUSION
This paper presented a machine learning model that can be conveniently developed for each heavy vehicle in a fleet. The model relies on seven predictors: number of stops, stop time, average moving speed, characteristic acceleration, aerodynamic speed squared, change in kinetic energy and change in potential energy. The last two predictors are introduced in this paper to help capture the average dynamic behavior of the vehicle. All of the predictors of the model are derived from vehicle speed and road grade. These variables are readily available from telematics devices that are becoming an integral part of connected vehicles, and the predictors can be easily computed on board from these two variables. The model predictors are aggregated over a fixed distance traveled (i.e., a window) instead of a fixed time interval. This mapping of the input space to the distance domain aligns with the domain of the target output, and produced a machine learning model for fuel consumption with an RMSE < 0.015 l/100 km. Model configurations with 1, 2, and 5 km window sizes were evaluated, and the results show that the 1 km window has the highest accuracy: the model is able to predict the actual fuel consumption on a per-1-km basis with a coefficient of determination (CD) of 0.91. This performance is closer to that of physics-based models, and the proposed model improves upon previous machine learning models that show comparable results only for entire long-distance trips. Selecting an adequate window size should take into consideration the cost of the model in terms of data collection and onboard computation. Moreover, the window size is likely to be application-dependent. For fleets with short trips (e.g., construction vehicles within a site) or urban traffic routes, a 1 km window size is recommended; for long-haul fleets, a 5 km window size may be sufficient. In this study, the duty cycles consisted of both highway and city traffic, and therefore the 1 km window was more adequate than the 5 km window. Future work includes understanding these differentiating factors and the selection of the appropriate window size. Expanding the model to other vehicles with different characteristics, such as varying masses and aging vehicles, is being studied; predictors for these characteristics will be added in order to allow the same model to capture the impact on fuel consumption of changes in vehicle mass and wear.
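As an illustrative sketch of how some of these predictors could be aggregated over a distance window from speed and road-grade samples (the exact definitions used in the paper are not reproduced here; the simplified formulas below, such as change in kinetic energy per unit mass = 0.5*(v_end^2 - v_start^2) and change in potential energy per unit mass = g*sum(grade*distance_step), and the 0.1 m/s stop threshold, are assumptions made for illustration only):

import numpy as np

G = 9.81  # gravitational acceleration, m/s^2

def window_predictors(speed, grade, dt=1.0):
    """Aggregate simple per-window predictors from speed (m/s) and
    road-grade (fraction) samples taken every dt seconds. Illustrative only."""
    speed = np.asarray(speed, dtype=float)
    grade = np.asarray(grade, dtype=float)
    stopped = speed < 0.1                        # treat near-zero speed as stopped
    time_stopped = stopped.sum() * dt            # seconds spent stopped
    num_stops = int(np.sum(~stopped[:-1] & stopped[1:]))  # moving -> stopped transitions
    moving = speed[~stopped]
    avg_moving_speed = moving.mean() if moving.size else 0.0
    dist_steps = speed * dt                      # distance covered per sample, m
    dke = 0.5 * (speed[-1]**2 - speed[0]**2)     # per-unit-mass kinetic energy change
    dpe = G * np.sum(grade * dist_steps)         # per-unit-mass potential energy change
    return num_stops, time_stopped, avg_moving_speed, dke, dpe

# tiny synthetic trace: accelerate, cruise, stop, pull away again
speed = [0, 3, 8, 14, 14, 14, 7, 0, 0, 5]
grade = [0.00, 0.01, 0.01, 0.00, 0.00, -0.01, 0.00, 0.00, 0.00, 0.02]
print(window_predictors(speed, grade))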
10. REFERENCES
[1] B. Lee, L. Quinones, and J. Sanchez, "Development of greenhouse gas emissions model for 2014-2017 heavy- and medium-duty vehicle compliance," SAE Technical Paper, Tech. Rep., 2011.
[2] G. Fontaras, R. Luz, K. Anagnostopoulus, D. Savvidis, S. Hausberger, and M. Rexeis, "Monitoring CO2 emissions from HDV in Europe - an experimental proof of concept of the proposed methodological approach," in 20th International Transport and Air Pollution Conference, 2014.
[3] S. Wickramanayake and H. D. Bandara, "Fuel consumption prediction of fleet vehicles using machine learning: A comparative study," in Moratuwa Engineering Research Conference (MERCon), 2016. IEEE, 2016, pp. 90-95.
[4] L. Wang, A. Duran, J. Gonder, and K. Kelly, "Modeling heavy/medium-duty fuel consumption based on drive cycle properties," SAE Technical Paper, Tech. Rep., 2015.
[5] Fuel Economy and Greenhouse Gas Exhaust Emissions of Motor Vehicles, Subpart B - Fuel Economy and Carbon-Related Exhaust Emission Test Procedures, Code of Federal Regulations Std. 600.111-08, Apr. 2014.
[6] SAE International Surface Vehicle Recommended Practice, Fuel Consumption Test Procedure - Type II, Society of Automotive Engineers Std., 2012.
[7] F. Perrotta, T. Parry, and L. C. Neves, "Application of machine learning for fuel consumption modelling of trucks," in Big Data (Big Data), 2017 IEEE International Conference on. IEEE, 2017, pp. 3810-3815.
[8] S. F. Haggis, T. A. Hansen, K. D. Hicks, R. G. Richards, and R. Marx, "In-use evaluation of fuel economy and emissions from coal haul trucks using modified SAE J1321 procedures and PEMS," SAE International Journal of Commercial Vehicles, vol. 1, no. 2008-01-1302, pp. 210-221, 2008.
[9] A. Ivanco, R. Johri, and Z. Filipi, "Assessing the regeneration potential for a refuse truck over a real-world duty cycle," SAE International Journal of Commercial Vehicles, vol. 5, no. 2012-01-1030, pp. 364-370, 2012.
[10] A. A. Zaidi, B. Kulcsár, and H. Wymeersch, "Back-pressure traffic signal control with fixed and adaptive routing for urban vehicular networks."