A Project Report
on
ANDROID BASED FACE MASK DETECTION SYSTEM
Pantech Solutions Pvt. Limited
Project Report Submitted By
APU KUMAR GIRI
4th Semester MCA
BPUT Reg. No. 2005280031
Biju Patnaik University of Technology, Rourkela, Odisha.
Under The Guidance of
Prof. PRAVAKAR MISHRA
(HOD, Department of Master of Computer Applications)
Project Report Submitted to NIIS INSTITUTE OF BUSINESS ADMINISTRATION in Partial
Fulfilment of the Requirements of the 4th Semester MCA Examinations of BPUT, Odisha - 2022
NIIS INSTITUTE OF BUSINESS ADMINISTRATION
Madanpur, Bhubaneswar, Odisha, Pin code - 752054
www.niisgroup.org/www.niisinst.com
Srinivasan.N
CIN: U80902TN2021PTC141464
www.pantechelearning.com
PANHYD0660/MAJOR/ANDROID/2021-2022
COMPLETION CERTIFICATE
TO WHOMSOEVER IT MAY CONCERN
This is to certify that Mr./Mrs. APU KUMAR GIRI, Roll Number -(2005280031), who
is pursuing the Master of Computer Applications programme at NIIS INSTITUTE OF
BUSINESS ADMINISTRATION, has successfully completed his/her Major Project at Pantech
Solutions Pvt Limited on "FACE MASK DETECTION" and has submitted the report.
During the Major Project period, the candidate has shown keen interest and
commitment towards learning and his/her performance was good.
Yours truly,
Pantech E Learning Pvt. Ltd,
(Branch Manager)
Pantech E Learning Pvt Ltd
4th Floor, Delta Chambers,
Behind Chennai Shopping Mall, Ameerpet, Hyderabad, Telangana – 500 016
Phone: 91 040-40077960 | hr@pantechmail.com
NIIS INSTITUTE OF BUSINESS ADMINISTRATION
CERTIFICATE
This is to certify that the project work report entitled
"ANDROID BASED FACE MASK DETECTION" submitted
by APU KUMAR GIRI (Roll No. 2005280031) has been
undertaken and successfully completed in partial
fulfilment of the requirements for the degree of
MASTER OF COMPUTER APPLICATIONS (MCA).
This work is original and is being submitted as
a part of the 4th semester project of the MCA curriculum.
Signature of Internal guide Signature of External guide
Guidance Certificate
This is to certify that the project titled ANDROID BASED FACE MASK
DETECTION SYSTEM is submitted by APU KUMAR GIRI, a student of
MCA, 4th semester, Regd. No. 2005280031, of NIIS INSTITUTE OF
BUSINESS ADMINISTRATION, BHUBANESWAR, in fulfilment of the
requirements for the degree of MASTER OF COMPUTER APPLICATIONS.
This project is an excellent piece of work done by him under my guidance.
Signature of guide
NIIS INSTITUTE OF BUSINESS ADMINISTRATION
DECLARATION
I, APU KUMAR GIRI, a student of MCA 4th Semester (2020-22) studying at NIIS
INSTITUTE OF BUSINESS ADMINISTRATION, Madanpur, Bhubaneswar, solemnly
declare that the project titled "ANDROID BASED FACE MASK DETECTION
SYSTEM" was carried out by me in partial fulfilment of the MCA programme.
This project was undertaken as a part of the academic curriculum
according to the University rules and norms and with no commercial interest
or motive.
APU KUMAR GIRI
MCA 4TH SEMESTER
ROLL NO-2005280031
NIIS INSTITUTE OF BUSINESS ADMINISTRATION
MADANPUR, BHUBANESWAR
ODISHA, 752054
ACKNOWLEDGEMENT
I feel great pleasure at the completion of this project. At the very
outset I would like to express my sincere thanks and deep sense of
gratitude to the personnel who helped me during the collection of data
and gave me rare and valuable guidance for the preparation of this
report. I thank my guide-cum-supervisor Prof. PRAVAKAR MISHRA
for his continuous patience and support.
I take this opportunity to express my deep sense of gratitude
and appreciation to my project guide Prof. PRAVAKAR MISHRA for his
assistance, motivation and for being a continual source of encouragement
for me.
I would like to thank Prof. PRAVAKAR MISHRA (HOD of
MCA) for always helping me right from the beginning of this project.
I take this opportunity to thank all my friends and all
the people who were directly or indirectly concerned with this project. I also
express my gratitude to my parents, who have given me constant support and
love throughout my life and career.
APU KUMAR GIRI
MCA 4TH SEMESTER
ROLL NO-2005280031
NIIS INSTITUTE OF BUSINESS ADMINISTRATION
Madanpur, Bhubaneswar
Odisha-752054
CONTENTS-
Abstract
Chapter -1:
Overview
Introduction
Chapter-2:
Literature Survey
Chapter-3:
System Analysis
Chapter-4:
Block diagram
Chapter-5:
Modules
Chapter-6:
Software requirements
Hardware requirements
Chapter-7:
UML diagram
Class diagram
Sequence diagram
Chapter-8:
Software description
Testing
Coding
Output
Reference
Android Based Face mask detection System
ABSTRACT:
In order to effectively prevent the spread of the COVID-19 virus, almost everyone
wears a mask during the coronavirus pandemic. This almost makes conventional
facial recognition technology ineffective in many cases, such as community
access control, face access control, facial attendance, facial security checks at
train stations, etc. Therefore, it is very urgent to improve the recognition
performance of the existing face recognition technology on the masked faces.
Most current advanced face recognition approaches are designed based on
deep learning, which depend on a large number of face samples. However, at
present, there are no publicly available masked face recognition datasets. To
this end, this work proposes three types of masked face datasets, including
Masked Face Detection Dataset (MFDD), Real-world Masked Face
Recognition Dataset (RMFRD) and Simulated Masked Face Recognition
Dataset (SMFRD). Among them, to the best of our knowledge, RMFRD is
currently the world’s largest real-world masked face dataset. These datasets
are freely available to industry and academia, based on which various
applications on masked faces can be developed.
CHAPTER 1:
OVERVIEW:
Face mask detection is a simple model for detecting face masks. Due to COVID-19
there is a need for face mask detection applications in many places, such as malls
and theatres, for safety. By developing face mask detection, we can
detect whether a person is wearing a face mask and allow entry accordingly,
which would be of great help to society.
The face mask detection model is built using the Deep Learning technique called
Convolutional Neural Networks (CNN). This CNN model is built using the
Keras and TensorFlow frameworks and the OpenCV library, which is widely
used for real-time applications.
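As an illustration of how such a model can be applied in real time, the sketch below loads a trained Keras model and classifies the faces that OpenCV finds in webcam frames. The model file name "mask_detector.h5", the 100x100 input size, and the output order are assumptions made for this example, not details taken from this report.

# Illustrative sketch: real-time mask / no-mask classification with OpenCV + Keras.
# Assumes a trained model "mask_detector.h5" taking 100x100 RGB input and
# outputting [p(mask), p(no_mask)]; these names and sizes are hypothetical.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

model = load_model("mask_detector.h5")                 # hypothetical trained CNN
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)                              # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.1, 4):
        face = cv2.resize(frame[y:y + h, x:x + w], (100, 100)) / 255.0
        probs = model.predict(np.expand_dims(face, axis=0), verbose=0)[0]
        label = "Mask" if probs[0] > probs[1] else "No Mask"
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, label, (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    cv2.imshow("Face Mask Detection", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()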
INTRODUCTION:
Face recognition is a promising area of applied computer vision. This
technique is used to recognize a face or identify a person automatically from
given images. In daily-life activities such as passport checking, smart
door control, access control, voter verification, criminal investigation, and many
other purposes, face recognition is widely used to authenticate a person correctly
and automatically. Face recognition has gained much attention as a unique,
reliable biometric recognition technology, which makes it more popular than
other authentication techniques such as passwords, PINs, and fingerprints. Many
governments across the world are also interested in face recognition systems
to secure public places such as parks, airports, bus stations, and railway
stations. Face recognition is one of the well-studied real-life problems, and
excellent progress has been made in face recognition technology.
CHAPTER 2
LITERATURE SURVEY:
S.No | Title | Author | Year | Drawback
1 | Study of mask face detection | Gayathri Deore | 2016 | The eye line cannot be detected
2 | Cascade mask face detection | Chengbin Peng | 2017 | It is difficult to detect the masked face
3 | Design of deep face detector using R-CNN | Caner Ozer | 2019 | A low accuracy rate
4 | Face mask detection using MobileNet | Shree Prakash | 2020 | It has high accuracy
CHAPTER 3:
SYSTEM ANALYSIS
EXISTING SYSTEM:
● Support vector machine
● Discrete wavelet transform
DRAWBACKS:
● Existing face recognition solutions are no longer reliable when a mask is worn.
● Time-consuming process
● Poor detection
PROPOSED SYSTEM:
● Convolutional neural network
● Caffe dataset
ADVANTAGES:
● High security
● Easy detection of masked faces
CHAPTER 4:
BLOCK DIAGRAM:
CHAPTER 5:
MODULES
● Pre-processing
● Discrete wavelet transform
● Cropping face image
● Caffe MobileNet model
● NN classifier
MODULES EXPLANATION:
PRE-PROCESSING
Image Pre-processing is a common name for operations with images at
the lowest level of abstraction. Its input and output are intensity images. The
aim of pre-processing is an improvement of the image data that suppresses
unwanted distortions or enhances some image features important for further
processing.
Image restoration is the operation of taking a corrupted/noisy image and
estimating the clean original image. Corruption may come in many forms such
as motion blur, noise, and camera misfocus. Image restoration is different
from image enhancement in that the latter is designed to emphasize features
of the image that make the image more pleasing to the observer, but not
necessarily to produce realistic data from a scientific point of view. Image
enhancement techniques (like contrast stretching or de-blurring by a nearest
neighbor procedure) provided by "Imaging packages" use no a priori model
of the process that created the image. With image enhancement, noise can be
effectively removed by sacrificing some resolution, but this is not
acceptable in many applications. In a fluorescence microscope, resolution in
the z-direction is bad as it is, and more advanced image processing techniques
must be applied to recover the object. De-convolution is an example of an image
restoration method. It is capable of increasing resolution (especially in the
axial direction), removing noise, and increasing contrast.
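A minimal sketch of the kind of pre-processing described above, using OpenCV; the 100x100 target size, the Gaussian blur used for denoising, and the file name are illustrative assumptions rather than the exact steps of this project.

# Illustrative pre-processing sketch with OpenCV (sizes and filters are assumptions).
import cv2

def preprocess(path, size=(100, 100)):
    img = cv2.imread(path)                        # read a BGR image from disk
    img = cv2.resize(img, size)                   # bring every sample to one size
    img = cv2.GaussianBlur(img, (3, 3), 0)        # suppress small-scale noise
    img = img.astype("float32") / 255.0           # scale intensities to [0, 1]
    return img

sample = preprocess("sample_face.jpg")            # hypothetical input file
print(sample.shape)                               # (100, 100, 3)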
Discrete wavelet transforms:
The CWT and the discrete wavelet transforms differ in how they discretize
the scale parameter. The CWT typically uses exponential scales with a base
smaller than 2, for example 2^(1/12). The discrete wavelet transform always
uses exponential scales with the base equal to 2. The scales in the discrete
wavelet transform are powers of 2. Keep in mind that the physical
interpretation of scales for both the CWT and discrete wavelet transforms
requires the inclusion of the signal's sampling interval if it is not equal to one.
For example, assume you are using the CWT and you set your base
to s0 = 2^(1/12). To attach physical significance to that scale, you must multiply
by the sampling interval Δt, so a scale vector covering approximately four
octaves with the sampling interval taken into account is s0^j Δt, j = 1, 2, ..., 48.
Note that the sampling interval multiplies the scales; it is not in the exponent.
For discrete wavelet transforms the base scale is always 2.
The decimated and nondecimated discrete wavelet transforms differ in how
they discretize the translation parameter. The decimated discrete wavelet
transform (DWT) always translates by an integer multiple of the scale, 2^j m.
The nondecimated discrete wavelet transform translates by integer shifts.
These differences in how scale and translation are discretized result in advantages
and disadvantages for the two classes of wavelet transforms. These differences
also determine use cases where one wavelet transform is likely to provide
superior results. Some important consequences of the discretization of the scale
and translation parameter are:
The DWT provides a sparse representation for many natural signals. In other
words, the important features of many natural signals are captured by a subset
of DWT coefficients that is typically much smaller than the original signal.
This “compresses” the signal. With the DWT, you always end up with the
same number of coefficients as the original signal, but many of the
coefficients may be close to zero in value. As a result, you can often throw
away those coefficients and still maintain a high-quality signal approximation.
With the CWT, you go from N samples for an N-length signal to an M-by-N
matrix of coefficients with M equal to the number of scales. The CWT is a
highly redundant transform. There is significant overlap between wavelets at
each scale and between scales. The computational resources required to
compute the CWT and store the coefficients is much larger than the DWT.
The nondecimated discrete wavelet transform is also redundant but the
redundancy factor is usually significantly less than the CWT, because the
scale parameter is not discretized so finely. For the nondecimated discrete
wavelet transform, you go from N samples to an L+1-by-N matrix of
coefficients where L is the level of the transform.
The strict discretization of scale and translation in the DWT ensures that the
DWT is an orthonormal transform (when using an orthogonal wavelet). There
are many benefits of orthonormal transforms in signal analysis. Many signal
models consist of some deterministic signal plus white Gaussian noise. An
orthonormal transform takes this kind of signal and outputs the transform
applied to the signal plus white noise. In other words, an orthonormal
transform takes in white Gaussian noise and outputs white Gaussian noise.
The noise is uncorrelated at the input and output. This is important in many
statistical signal processing settings. In the case of the DWT, the signal of
interest is typically captured by a few large-magnitude DWT coefficients,
while the noise results in many small DWT coefficients that you can throw
away. If you have studied linear algebra, you have no doubt learned many
advantages to using orthonormal bases in the analysis and representation of
vectors. The wavelets in the DWT are like orthonormal vectors. Neither the
CWT nor the nondecimated discrete wavelet transform are orthonormal
transforms. The wavelets in the CWT and nondecimated discrete wavelet
transform are technically called frames; they are linearly dependent sets.
The DWT is not shift-invariant. Because the DWT downsamples, a shift in
the input signal does not manifest itself as a simple equivalent shift in the
DWT coefficients at all levels. A simple shift in a signal can cause a
significant realignment of signal energy in the DWT coefficients by scale. The
CWT and non decimated discrete wavelet transform are shift-invariant. There
are some modifications of the DWT such as the dual-tree complex discrete
wavelet transform that mitigate the lack of shift invariance in the DWT,
see Critically Sampled and Oversampled Wavelet Filter Banks for some
conceptual material on this topic and Dual-Tree Wavelet Transforms for an
example.
The discrete wavelet transforms are equivalent to discrete filter banks.
Specifically, they are tree-structured discrete filter banks where the signal is
first filtered by a lowpass and a highpass filter to yield lowpass and highpass
subbands. Subsequently, the lowpass subband is iteratively filtered by the
same scheme to yield narrower octave-band lowpass and highpass subbands.
In the DWT, the filter outputs are downsampled at each successive stage. In
the non decimated discrete wavelet transform, the outputs are not
downsampled. The filters that define the discrete wavelet transforms typically
only have a small number of coefficients so the transform can be implemented
very efficiently. For both the DWT and non decimated discrete wavelet
transform, you do not actually require an expression for the wavelet. The
filters are sufficient. This is not the case with the CWT. The most common
implementation of the CWT requires you have the wavelet explicitly defined.
Even though the non decimated discrete wavelet transform does not down
sample the signal, the filter bank implementation still allows for good
computational performance.
DWT
The discrete wavelet transforms provide perfect reconstruction of the signal
upon inversion. This means that you can take the discrete wavelet transform
of a signal and then use the coefficients to synthesize an exact reproduction
of the signal to within numerical precision. You can implement an inverse
CWT, but it is often the case that the reconstruction is not perfect.
Reconstructing a signal from the CWT coefficients is a much less stable
numerical operation.
The finer sampling of scales in the CWT typically results in a higher-fidelity
signal analysis. You can localize transients in your signal, or characterize
oscillatory behaviour better with the CWT than with the discrete wavelet
transforms.
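For illustration, the sparsity and perfect-reconstruction properties of the DWT discussed above can be reproduced with the PyWavelets package (pywt). PyWavelets is not part of this project's stated toolchain; the signal, wavelet, and level below are assumptions chosen only to demonstrate the ideas.

# Illustrative DWT sketch with PyWavelets (not part of the stated toolchain).
import numpy as np
import pywt

t = np.linspace(0, 1, 1024, endpoint=False)
signal = np.sin(2 * np.pi * 8 * t) + 0.1 * np.random.randn(t.size)

# Decimated DWT: 3-level decomposition with a Daubechies-4 wavelet.
coeffs = pywt.wavedec(signal, "db4", level=3)

# Many detail coefficients are near zero, i.e. a sparse representation.
flat = np.concatenate(coeffs)
print("coefficients close to zero:", np.sum(np.abs(flat) < 0.05), "of", flat.size)

# Perfect reconstruction (to within numerical precision) from the coefficients.
reconstructed = pywt.waverec(coeffs, "db4")
print("max reconstruction error:", np.max(np.abs(signal - reconstructed)))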
CROPPING FACE IMAGE:
This step is mainly used to crop the image from the nose to the jaw, and it checks
whether the user is wearing a mask or not.
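A minimal sketch of cropping the detected face region with OpenCV's bundled Haar cascade; the margins, the 100x100 output size, and the input file name are assumptions for illustration.

# Illustrative face-cropping sketch (Haar cascade face detector from OpenCV).
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def crop_faces(image_path):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    # Return one crop per detected face, resized for the mask classifier.
    return [cv2.resize(img[y:y + h, x:x + w], (100, 100)) for (x, y, w, h) in faces]

crops = crop_faces("person.jpg")                  # hypothetical input file
print(len(crops), "face(s) cropped")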
Caffe MobileNet model:
It is one type of pre-trained dataset model, and it is mainly stored in the database.
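Below is a hedged sketch of how a Caffe model is commonly loaded through OpenCV's DNN module. The file names "deploy.prototxt" and "face_detector.caffemodel", the 300x300 input size, and the (104, 177, 123) mean are conventions for SSD-style Caffe face detectors and are assumptions here, not files documented in this report.

# Illustrative sketch: loading a Caffe face-detector model with OpenCV's DNN module.
# "deploy.prototxt" and "face_detector.caffemodel" are hypothetical file names.
import cv2
import numpy as np

net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "face_detector.caffemodel")

img = cv2.imread("person.jpg")
h, w = img.shape[:2]
blob = cv2.dnn.blobFromImage(cv2.resize(img, (300, 300)), 1.0, (300, 300),
                             (104.0, 177.0, 123.0))
net.setInput(blob)
detections = net.forward()

for i in range(detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > 0.5:                          # keep confident detections only
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        x1, y1, x2, y2 = box.astype(int)
        cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)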
Neural network:
Neural networks are predictive models loosely based on the action of
biological neurons.
The selection of the name “neural network” was one of the great PR
successes of the Twentieth Century. It certainly sounds more exciting than a
technical description such as “A network of weighted, additive values with
nonlinear transfer functions”. However, despite the name, neural networks are
far from “thinking machines” or “artificial brains”. A typical artificial neural
network might have a hundred neurons. In comparison, the human nervous
system is believed to have about 3×10^10 neurons. We are still light years from
"Data".
The original “Perceptron” model was developed by Frank Rosenblatt in
1958. Rosenblatt’s model consisted of three layers, (1) a “retina” that
distributed inputs to the second layer, (2) “association units” that combine the
inputs with weights and trigger a threshold step function which feeds to the
output layer, (3) the output layer which combines the values. Unfortunately,
the use of a step function in the neurons made the perceptrons difficult or
impossible to train. A critical analysis of perceptrons published in 1969 by
Marvin Minsky and Seymour Papert pointed out a number of critical
weaknesses of perceptrons, and, for a period of time, interest in perceptrons
waned.
Interest in neural networks was revived in 1986 when David Rumelhart,
Geoffrey Hinton and Ronald Williams published “Learning Internal
Representations by Error Propagation”. They proposed a multilayer neural
network with nonlinear but differentiable transfer functions that avoided the
pitfalls of the original perceptron’s step functions. They also provided a
reasonably effective training algorithm for neural networks. Types of Neural
Networks:
1) Artificial Neural Network
2) Probabilistic Neural Networks
3) General Regression Neural Networks
DTREG implements the most widely used types of neural networks:
a) Multilayer Perceptron Networks (also known as multilayer feed-forward
network),
b) Cascade Correlation Neural Networks,
c) Probabilistic Neural Networks (NN)
d) General Regression Neural Networks (GRNN).
Radial Basis Function Networks:
a) Functional Link Networks,
b) Kohonen networks,
c) Gram-Charlier networks,
d) Hebb networks,
e) Adaline networks,
f) Hybrid Networks.
The Multilayer Perceptron Neural Network Model
The following diagram illustrates a perceptron network with three layers:
This network has an input layer (on the left) with three neurons, one hidden
layer (in the middle) with three neurons and an output layer (on the right)
with three neurons.
There is one neuron in the input layer for each predictor variable. In the case
of categorical variables, N-1 neurons are used to represent the N categories of
the variable.
Input Layer — A vector of predictor variable values (x1...xp) is presented to
the input layer. The input layer (or processing before the input layer)
standardizes these values so that the range of each variable is -1 to 1. The input
layer distributes the values to each of the neurons in the hidden layer. In
addition to the predictor variables, there is a constant input of 1.0, called the
bias that is fed to each of the hidden layers; the bias is multiplied by a weight
and added to the sum going into the neuron.
Hidden Layer — Arriving at a neuron in the hidden layer, the value from
each input neuron is multiplied by a weight (wji), and the resulting weighted
values are added together producing a combined value uj. The weighted sum
(uj) is fed into a transfer function, σ, which outputs a value hj. The outputs
from the hidden layer are distributed to the output layer.
Output Layer — Arriving at a neuron in the output layer, the value from each
hidden layer neuron is multiplied by a weight (wkj), and the resulting weighted
values are added together producing a combined value vj. The weighted sum
(vj) is fed into a transfer function, σ, which outputs a value yk. The y values
are the outputs of the network.
If a regression analysis is being performed with a continuous target variable,
then there is a single neuron in the output layer, and it generates a single y
value. For classification problems with categorical target variables, there are
N neurons in the output layer producing N values, one for each of the N
categories of the target variable.
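As a concrete illustration of the input, hidden, and output layers just described, the following Keras sketch builds a small multilayer perceptron for a categorical target. The layer sizes and activation functions are arbitrary assumptions; the report does not specify them.

# Illustrative multilayer perceptron sketch in Keras (layer sizes are assumptions).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

p = 10   # number of predictor variables (input neurons)
N = 2    # number of target categories (output neurons)

mlp = Sequential([
    Dense(16, activation="sigmoid", input_shape=(p,)),  # hidden layer: weights + bias, transfer function
    Dense(N, activation="softmax"),                     # output layer: one value per category
])
mlp.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
mlp.summary()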
Neural Networks (NN):
Neural Network (NN) and General Regression Neural Networks (GRNN)
have similar architectures, but there is a fundamental difference: NN networks
perform classification where the target variable is categorical, whereas
general regression neural networks perform regression where the target
variable is continuous. If you select a NN/GRNN network, DTREG will
automatically select the correct type of network based on the type of target
variable.
Architecture of a NN:
All NN networks have four layers:
1. Input layer — There is one neuron in the input layer for each predictor
variable. In the case of categorical variables, N-1 neurons are used
where N is the number of categories. The input neurons (or processing
before the input layer) standardizes the range of the values by
subtracting the median and dividing by the interquartile range. The
input neurons then feed the values to each of the neurons in the hidden
layer.
2. Hidden layer — This layer has one neuron for each case in the training
data set. The neuron stores the values of the predictor variables for the
case along with the target value. When presented with the x vector of
input values from the input layer, a hidden neuron computes the
Euclidean distance of the test case from the neuron’s center point and
then applies the RBF kernel function using the sigma value(s). The
resulting value is passed to the neurons in the pattern layer.
3. Pattern layer / Summation layer — The next layer in the network is
different for NN networks and for GRNN networks. For NN networks
there is one pattern neuron for each category of the target variable. The
actual target category of each training case is stored with each hidden
neuron; the weighted value coming out of a hidden neuron is fed only
to the pattern neuron that corresponds to the hidden neuron’s category.
The pattern neurons add the values for the class they represent (hence,
it is a weighted vote for that category).
For GRNN networks, there are only two neurons in the pattern layer.
One neuron is the denominator summation unit; the other is the
numerator summation unit. The denominator summation unit adds up
the weight values coming from each of the hidden neurons. The
numerator summation unit adds up the weight values multiplied by the
actual target value for each hidden neuron.
4. Decision layer — The decision layer is different for NN and GRNN
networks. For NN networks, the decision layer compares the weighted
votes for each target category accumulated in the pattern layer and uses
the largest vote to predict the target category.
For GRNN networks, the decision layer divides the value accumulated
in the numerator summation unit by the value in the denominator
summation unit and uses the result as the predicted target value.
The following diagram is the actual diagram of the proposed network used in
our project.
1) Input Layer:
The input vector, denoted as p, is presented as the black vertical bar. Its
dimension is R × 1. In this paper, R = 3.
2) Radial Basis Layer:
In the Radial Basis Layer, the vector distances between the input vector p and the
weight vector made of each row of the weight matrix W are calculated. Here, the
vector distance is defined as the dot product between two vectors [8]. Assume
the dimension of W is Q×R. The dot product between p and the i-th row of
W produces the i-th element of the distance vector ||W - p||, whose dimension
is Q×1. The minus symbol, "-", indicates that it is the distance between
vectors. Then, the bias vector b is combined with ||W - p|| by an element-by-
element multiplication, represented as ".*". The result is denoted as n = ||W - p|| .* b.
The transfer function in the NN has a built-in distance criterion with respect to a
center. In this paper, it is defined as radbas(n) = exp(-n^2) (1). Each element of n is
substituted into Eq. 1 and produces the corresponding element of a, the output
vector of the Radial Basis Layer. The i-th element of a can be represented as
a_i = radbas(||W_i - p|| .* b_i) (2), where W_i is the vector made of the i-th row of W
and b_i is the i-th element of the bias vector b.
Some characteristics of Radial Basis Layer:
The i-th element of a equals 1 if the input p is identical to the i-th row of the
input weight matrix W. A radial basis neuron with a weight vector close to
the input vector p produces a value near 1, and then its output weights in the
competitive layer will pass their values to the competitive function. It is also
possible that several elements of a are close to 1, since the input pattern may be
close to several training patterns.
3) Competitive Layer:
There is no bias in Competitive Layer. In Competitive Layer, the vector
a is firstly multiplied with layer weight matrix M, producing an output vector
d. The competitive function, denoted as C in Fig. 2, produces a 1
corresponding to the largest element of d, and 0’s elsewhere. The output
vector of the competitive function is denoted as c. The index of the 1 in c is the
index of the class that the system assigns. The dimension of the output vector,
K, is 5 in this paper.
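A small numerical sketch of Eqs. 1 and 2 and of the competitive layer, written with NumPy. The dimensions (R = 3, Q = 4, K = 2), the random matrices, and the use of the Euclidean norm for the vector distance are assumptions made only to show the computation.

# Illustrative NumPy sketch of the radial basis layer (Eqs. 1-2) and competitive layer.
import numpy as np

R, Q, K = 3, 4, 2                      # input size, hidden neurons, classes (assumed)
W = np.random.rand(Q, R)               # each row is a stored training pattern
b = np.ones(Q)                         # bias vector of the radial basis layer
M = np.random.rand(K, Q)               # layer weight matrix of the competitive layer
p = np.random.rand(R)                  # input vector

def radbas(n):
    return np.exp(-n ** 2)             # Eq. 1: radbas(n) = exp(-n^2)

dist = np.linalg.norm(W - p, axis=1)   # ||W_i - p|| for every row of W
a = radbas(dist * b)                   # Eq. 2: a_i = radbas(||W_i - p|| .* b_i)

d = M @ a                              # input to the competitive layer
c = np.zeros(K)
c[np.argmax(d)] = 1                    # 1 for the largest element of d, 0 elsewhere
print("predicted class index:", int(np.argmax(c)))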
How NN network work:
Although the implementation is very different, neural networks are
conceptually similar to K-Nearest Neighbor (k-NN) models. The basic idea is
that a predicted target value of an item is likely to be about the same as other
items that have close values of the predictor variables. Consider this figure:
Assume that each case in the training set has two predictor variables, x and y.
The cases are plotted using their x,y coordinates as shown in the figure. Also
assume that the target variable has two categories, positive which is denoted
by a square and negative which is denoted by a dash. Now, suppose we are
trying to predict the value of a new case represented by the triangle with
predictor values x=6, y=5.1. Should we predict the target as positive or
negative?
Notice that the triangle is positioned almost exactly on top of a dash representing
a negative value. But that dash is in a fairly unusual position compared to the
other dashes which are clustered below the squares and left of center. So it
could be that the underlying negative value is an odd case.
The nearest neighbor classification performed for this example depends on
how many neighboring points are considered. If 1-NN is used and only the
closest point is considered, then clearly the new point should be classified as
negative since it is on top of a known negative point. On the other hand, if 9-
NN classification is used and the closest 9 points are considered, then the
effect of the surrounding 8 positive points may overbalance the close negative
point.
A neural network builds on this foundation and generalizes it to consider all
of the other points. The distance is computed from the point being evaluated
to each of the other points, and a radial basis function (RBF) (also called a
kernel function) is applied to the distance to compute the weight (influence)
for each point. The radial basis function is so named because the radius
distance is the argument to the function.
Weight = RBF (distance)
The further some other point is from the new point, the less influence it has.
Radial Basis Function
Different types of radial basis functions could be used, but the most common
is the Gaussian function, RBF(distance) = exp(-distance^2 / (2σ^2)), where σ
controls the width of the kernel.
Advantages and disadvantages of NN networks:
1. It is usually much faster to train a NN/GRNN network than a multilayer
perceptron network.
2. NN/GRNN networks are often more accurate than multilayer perceptron
networks.
3. NN/GRNN networks are relatively insensitive to outliers (wild points).
4. NN networks generate accurate predicted target probability scores.
5. NN networks approach Bayes optimal classification.
6. NN/GRNN networks are slower than multilayer perceptron networks at
classifying new cases.
7. NN/GRNN networks require more memory space to store the model.
Removing unnecessary neurons
One of the disadvantages of NN models compared to multilayer
perceptron networks is that NN models are large due to the fact that there is
one neuron for each training row. This causes the model to run slower than
multilayer perceptron networks when using scoring to predict values for new
rows.
DTREG provides an option to cause it to remove unnecessary neurons from the
model after the model has been constructed.
Removing unnecessary neurons has three benefits:
1. The size of the stored model is reduced.
2. The time required to apply the model during scoring is reduced.
3. Removing neurons often improves the accuracy of the model.
The process of removing unnecessary neurons is an iterative process. Leave-
one-out validation is used to measure the error of the model with each neuron
removed. The neuron that causes the least increase in error (or possibly the
largest reduction in error) is then removed from the model. The process is
repeated with the remaining neurons until the stopping criterion is reached.
When unnecessary neurons are removed, the “Model Size” section of the
analysis report shows how the error changes with different numbers of
neurons. You can see a graphical chart of this by clicking Chart/Model size.
There are three criteria that can be selected to guide the removal of neurons:
1. Minimize error – If this option is selected, then DTREG removes
neurons as long as the leave-one-out error remains constant or
decreases. It stops when it finds a neuron whose removal would cause
the error to increase above the minimum found.
2. Minimize neurons – If this option is selected, DTREG removes neurons
until the leave-one-out error would exceed the error for the model with
all neurons.
3. # of neurons – If this option is selected, DTREG reduces the least
significant neurons until only the specified number of neurons remain.
Document classification is the task of grouping documents into
categories based upon their content - never before has it been as
important as it is today. The exponential growth of unstructured data
combined with a marked increase in litigation, security and privacy
rules have left organizations utterly unable to cope with the conflicting
demands of the business, lawyers and regulators. The net is escalating
costs and risks, with no end in sight. Without tools to facilitate
automated, content based classification, organizations have little hope
of catching up, let alone getting ahead of the problem. Technology has
created the problem, and technology will be needed to address it.
Manual classification is out of the question due to the volume of data,
while naïve automatic approaches such as predefined search terms have
performed poorly due to the complexity of human language. Many
advanced approaches have been proposed to solve this problem,
however over the last several years Support Vector Machines (SVM)
classification has come to the forefront. SVM’s deep academic roots,
accuracy, computational scalability, language independence and ease
of implementation make it ideally suited to tackling the document
classification challenges faced by today’s large organizations.
SVM (support vector machine)
SVM is a group of learning algorithms primarily used for classification
tasks on complicated data such as image classification and protein
structure analysis. SVM is used in a countless fields in science and
industry, including Bio-technology, Medicine, Chemistry and
Computer Science. It has also turned out to be ideally suited for
categorization of large text repositories such as those housed in
virtually all large, modern organizations. Introduced in 1992, SVM
quickly became regarded as the state-of-the-art method for
classification of complex, high-dimensional data. In particular its
ability to capture trends observed in a small training set and to
generalize those trends against a broader corpus have made it useful
across a large number of applications. SVM uses a supervised learning
approach, which means it learns to classify unseen data based on a set
of labeled training data, such as corporate documents. The initial set of
training data is typically identified by domain experts and is used to
build a model that can be applied to any other data outside the training
set. The effort required to construct a high quality training set is quite
modest, particularly when compared to the volume of data that may be
ultimately classified against it. This means that learning algorithms
such as SVM offer an exceptionally cost effective method of text
classification for the massive volumes of documents produced by
modern organizations. The balance of this paper covers the inner
workings of SVM, its application in science and industry, the legal
defensibility of the method as well as classification accuracy compared
to manual classification.
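A minimal supervised-learning sketch with scikit-learn's SVM implementation: a classifier is fitted on a small labeled training set and then used to classify unseen data, exactly as described above. scikit-learn is not listed in this report's requirements, and the tiny two-dimensional dataset is invented for illustration.

# Illustrative sketch: training an SVM on labeled data and classifying unseen points.
from sklearn import svm

# Hypothetical 2-D training points with labels (0 = negative, 1 = positive).
X_train = [[1.0, 1.2], [1.5, 0.8], [0.9, 1.1], [6.0, 5.0], [5.5, 6.2], [6.3, 5.8]]
y_train = [0, 0, 0, 1, 1, 1]

clf = svm.SVC(kernel="linear")        # maximum-margin linear separator
clf.fit(X_train, y_train)

print(clf.predict([[6.1, 5.1]]))      # classify an unseen point
print(clf.support_vectors_)           # boundary points that define the margin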
Overview of SUPPORT VECTOR MACHINE:
SVM is built upon a solid foundation of statistical learning theory.
Early classifiers were proposed by Vladimir Vapnik and Alexey
Chervonenkis more than 40 years ago. In 1992 Boser, Guyon and
Vapnik proposed an improvement that considerably extended the
applicability of SVM. From this point on SVM began to establish its
reputation as the state-of-the-art method for data categorization. Starting
with handwriting recognition tasks SVM showed results that were
superior to all other methods of classification. It was quickly shown that
SVM was able to beat even Artificial Neural Networks that were
considered to be the strongest categorization algorithms at the time.
Thousands of researchers applied SVM to a large number of machine
learning problems and the results have contributed to the acceptance of
this technology as the state-of-the-art for machine classification.
Numerous studies (Joachmis 1998, Dumais et al. 1998, Drucker et al.
1999) have shown the superiority of SVM over other machine learning
methods for text categorization problems. For example, Joachmis
reported 86% accuracy of SVM on classification of the Reuters news
dataset, while the next best method, a significantly slower k-Nearest
Neighbor algorithm, was only able to achieve 82% accuracy.
Today SVM is widely accepted in industry as well as in the academia.
For example, Health Discovery Corporation uses SVM in a medical
image analysis tool currently licensed to Pfizer, Dow Chemical uses
SVM in their research for outlier detection and Reuters uses SVM for
text classification.
Under the Hood
The typical approach for text classification using SVM is as follows. The model is
trained using a set of documents labeled by domain experts. The
validation procedure computes the expected accuracy of the model on
unclassified data. The labeled data itself is used in the accuracy
evaluation and therefore the error estimates take into account the
specifics of particular data. Once a model is constructed, it can then be
used to efficiently and quickly classify new unseen documents in real
time.
9. Model construction
SVM is most commonly used to split a single input set of documents
into two distinct subsets. For example, we could classify documents
into privileged and non-privileged or record and non-record sets. The
SVM algorithm learns to distinguish between the two categories based
on a training set of documents that contains labeled examples from both
categories. Internally, SVM manipulates documents as vectors. The separating line
(also called "the model") is recorded and used for classification of new
documents. New documents are mapped and classified based on their
position with respect to the model. There are many ways to reduce a
document to a vector representation that can be used for classification.
For example, we could count the number of times particular terms,
characters or substrings appeared. We may also consider the lengths of
sentences or the amount of white space. Example: Say we are trying to
classify documents into "work" and "fun" categories. To represent the
document as a vector, we could count the number of times the words
"meeting" and "play" occurred. In this case the document will be
represented as a vector of size two or simply as a point on a
two-dimensional plane (like in Figure 2). Similarly, if we count
appearance of three or more different words, then the document would
be represented in three or more dimensions. The dimensions are also
called features. There are many kinds of features we can compute on a
document including the frequency of appearance of a particular
character, the ratio of upper case letters to lower case, the average size
of the sentence, etc. Some features are more useful than others, while
some are simply noise. Within SVM there exist good algorithms that
evaluate how well a particular feature helps in classification. Typically,
when documents are prepared for classification the features are
extracted, analyzed and the noisiest ones automatically removed. Once
data is preprocessed and a multi-dimensional representation of a
document is generated, SVM then finds the optimum hyper-plane to
separate the data. As shown in Figure 3 there may be several possible
separators; however, the SVM algorithm must pick the best one. (Figure 3:
Often there are many possible separators for the data. Support Vector
Machines choose the separator with the maximum margin, as it has the best
generalization properties.) It is said that one separator is better than
another if it generalizes better, i.e. shows better performance on
documents outside of the training set. It turns out that the generalization
quality of the plane is related to the distance between the plane and the
data points that lay on the boundary of the two data classes. These data
points are called "support vectors" and the SVM algorithm determines
the plane that is as far from all support vectors as possible. In other
words SVM finds the separator with a maximum margin and is often
called a "maximum margin classifier". Multi-class classification The
examples above demonstrate classification into two categories;
however it is often necessary to group documents into three or more
classes. There are established methods of using SVM for multi-class
classification. Most commonly an ensemble of binary (twoclass)
classifiers is used for this problem. In such an ensemble each classifier
is trained to recognize one particular category versus all o
mathematically projected into higher dimensions where it can be more
easily separated. For example, while the data shown in Figure 4 is not
linearly separable in two dimensions; it can be separated in three
dimensions. To do so, imagine that we loaded this picture into a slide
projector and projected it onto a white cone attached to the wall in front
of the projector with the tip pointing away from the wall. We center the
cone such that its tip is matched with the center of our plot. Then,
consider where the points appear on the surface of the cone. All blue
points will be projected closer to the tip of the cone and all the red points
will appear closer to the other end. (Figure 5: Non-linear classification. The
first figure shows a dataset that cannot be separated by a linear
classifier in two dimensions but can be projected into three dimensions,
where it can be separated by a plane.) Now, we can take a plane (a
linear separator in three dimensions) and cut the cone, such that all blue
points stay on one side of the plane and the green points on the other
side. SVM will pick a plane that has good generalization properties to
classify unseen data. The cone in this example is called a kernel. In
other words, the kernel is the function that describes how to project data
into higher dimensions. There are a number of different types of kernels
available to fit different data. Good guidelines exist to help in selecting
a proper kernel.
Accuracy Evaluation
Understanding the accuracy or
the expected rate of success of the algorithm is essential to the
successful application in a commercial enterprise. Fortunately solid
testing procedures have been developed over the years to evaluate the
performance of learning algorithms. Accuracy is a measure of how
close the results of the automatic classification match the true
categories of the documents. Accuracy is estimated by applying the
classifier to the testing dataset classified by domain experts. The
documents for which the classification algorithm and domain experts
assigned the same label is said to be classified correctly and the rest of
the documents are classified incorrectly. The accuracy is computed as
number of correct over number of correct plus number incorrect. In
other words, the accuracy is the percentage of documents that were
classified correctly. When evaluating the accuracy it is essential to
ensure that documents from the testing set were not used to train the
classifier. This ensures that the classifier will have no unfair
information about testing documents that will inflate the performance.
Typically, the set of all documents labeled by domain experts is split
into training and testing sets. The algorithm is trained using the training
set and then applied to the testing set for accuracy evaluation. Since,
none of the testing documents appeared in the training set, the
performance of the algorithm on the training set will be good estimator
of expected accuracy on unseen data. In order to build up the confidence
in the estimated value of accuracy it is beneficial to train the model on
multiple training sets and evaluate it against multiple testing sets and
then compute the average accuracy. This approach is known as k-fold
cross-validation and is accomplished by partitioning a single labeled
dataset into multiple testing and training sets. As shown in Figure 6, the
method prescribes the labeled set to be split into k parts, also called
folds. Commonly a 10 fold split is considered to be sufficient. For every
fold the training set is constructed from all but one part of the labeled
data. The single remaining part is used as a testing set. This approach
results in k training/testing sets from the same labeled dataset. The
accuracy is evaluated on each of the splits and the average computed.
When making sense of a quality of the classification algorithm it is
important to keep in mind that it is typically not possible to reach 100%
accuracy, however accuracies of 80%- 90% are commonly considered
achievable. To gain the perspective of what 90% accuracy means in the
real world it is necessary to compare it to the accuracy of human
reviewers. It turns out that on average human reviewers do not perform
better than some of the best machine learning algorithms and in many
cases humans perform significantly worse. Godbole and Roy, 2008
studied the quality of human classification of natural language texts in
the support industry. They found that when different groups of
reviewers were asked to review the same set of documents they
disagreed on categories for 47% of documents. Furthermore, when the
same reviewer was given the same document to review on different
occasions their labels only agreed in 64% of cases, this means that the
same reviewer did not even agree with themselves 1/3 of the time. It is
now possible to train a machine algorithm that will outperform or work
on par with manual classification. Wai Lam et al., 1999 observed this
when comparing the quality of manual and automatic classification of medical
literature with respect to text retrieval. (Figure 6: Example of 5-fold cross-
validation. The labeled data is split into five parts; for each fold the
classifier is trained on four parts and validated on the one remaining part,
and the average of the fold accuracies is computed.) Similar
observations were reported in the litigation support industry by Anne
Kershaw, a founder of nationally recognized litigation management
consulting firm. Her group compared the results of automatic and
manual privilege coding over population of 48,000 documents and
found that automated classification was much more accurate than
manual review, minimizing the chance of missing an important
privileged document.
Defensibility
To analyze the defensibility of
results obtained using SVM classification, consider the related standard
for admitting expert scientific testimony in a federal trial. In Daubert
vs. Merrell Dow Pharmaceuticals, Mr. Justice Blackmun suggested that the
following four factors be considered:
• Whether the theory or technique can be and has been tested
• Whether the theory or technique has been subjected to peer review and publication
• The known or potential rate of error or the existence of standards
• Whether the theory or technique used has been generally accepted
SVM satisfies all four of these
requirements. The years of research in statistical learning theory as well
as thousands of publications that study SVMs completely satisfy the
first and second requirements. The extensive testing methodologies
available for SVM quantify the expected accuracy of the algorithm and
as such completely satisfy the third requirement. The data-centric error
rate calculation described in the Accuracy Evaluation section above
measures the accuracy of the algorithm as it specifically relates to a
particular data. This approach to testing and quality evaluation meets
the strictest requirements of modern science. The second and fourth
requirements are satisfied by the wide acceptance of SVM as a state-
of-the-art method for classification that is broadly utilized in science
and industry.
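To tie the pieces above together (document vectors, an SVM separator, and k-fold accuracy estimation), here is a hedged scikit-learn sketch. The tiny example corpus, the labels, the choice of TF-IDF features, and the use of 3 folds instead of the 10 mentioned above (the corpus is too small for 10) are all assumptions.

# Illustrative sketch: SVM text classification with TF-IDF features and k-fold accuracy.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical labeled documents ("work" = 1, "fun" = 0).
docs = ["meeting agenda for monday", "project status meeting notes",
        "let's play football tonight", "weekend play and games",
        "quarterly meeting minutes", "fun play at the theatre"]
labels = [1, 1, 0, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(), LinearSVC())   # document vectors -> linear SVM
scores = cross_val_score(model, docs, labels, cv=3)     # k-fold cross-validation (k = 3 here)
print("fold accuracies:", scores, "average:", scores.mean())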
CONVOLUTIONAL NEURAL NETWORK
CNNs, like neural networks, are made up of neurons with learnable weights
and biases. Each neuron receives several inputs, takes a weighted sum over
them, passes it through an activation function and responds with an output. The
whole network has a loss function, and all the tips and tricks that we developed
for neural networks still apply to CNNs. Pretty straightforward, right? So, how
are Convolutional Neural Networks different from Neural Networks?
CNNs operate over Volumes !
What do we mean by this?
Example of an RGB image (let's call it the 'input image'): unlike neural
networks, where the input is a vector, here the input is a multi-channeled
image (3-channeled in this case).
Before we go any deeper, let us first understand what convolution means.
Convolution
(Figure: Convolving an image with a filter)
We take the 5*5*3 filter and slide it over the complete image and along the
way take the dot product between the filter and chunks of the input image.
The convolution layer is the main building block of a convolutional neural
network.
Convolution Layer
The convolution layer comprises a set of independent filters (6 in the
example shown). Each filter is independently convolved with the image and
we end up with 6 feature maps of shape 28*28*1.
Convolution Layers in sequence
All these filters are initialized randomly and become our parameters which
will be learned by the network subsequently.
I will show you an example of a trained network.
Filters in a trained network
Take a look at the filters in the very first layer (these are our 5*5*3 filters).
Through back propagation, they have tuned themselves to become blobs of
coloured pieces and edges. As we go deeper to other convolution layers, the
filters are doing dot products to the input of the previous convolution layers.
So, they are taking the smaller coloured pieces or edges and making larger
pieces out of them. Take a look at image 4 and imagine the 28*28*1 grid as a
grid of 28*28 neurons. For a particular feature map (the output received on
convolving the image with a particular filter is called a feature map), each
neuron is connected only to a small chunk of the input image and all the
neurons have the same connection weights. So again coming back to the
differences between CNN and a neural network: CNNs have a couple of
concepts called parameter sharing and local connectivity.
Parameter sharing is sharing of weights by all neurons in a particular feature
map.
Local connectivity is the concept of each neuron being connected only to a subset of
the input image (unlike a neural network where all the neurons are fully
connected). This helps to reduce the number of parameters in the whole system
and makes the computation more efficient.
Pooling Layers
A pooling layer is another building block of a CNN.
Pooling
Its function is to progressively reduce the spatial size of the representation in
order to reduce the number of parameters and the computation in the network. The
pooling layer operates on each feature map independently.
The most common approach used in pooling is max pooling.
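The effect of a convolution layer followed by max pooling on the feature-map shape can be checked directly in Keras. The 32x32x3 input and the six 5x5 filters mirror the numbers used in the convolution example above and are otherwise arbitrary assumptions.

# Illustrative sketch: convolution + max pooling shapes in Keras.
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D

inp = Input(shape=(32, 32, 3))                     # 32x32 RGB input (assumed size)
conv = Conv2D(6, (5, 5), activation="relu")(inp)   # six 5x5x3 filters -> 28x28x6 feature maps
pool = MaxPooling2D(pool_size=(2, 2))(conv)        # max pooling halves spatial size -> 14x14x6

print(Model(inp, conv).output_shape)               # (None, 28, 28, 6)
print(Model(inp, pool).output_shape)               # (None, 14, 14, 6)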
CONVOLUTIONAL NEURAL NETWORK ALGORITHM
Consider a network with a single real input x and network function F. The
derivative F’(x) is computed in two phases:
Feed-forward: the input x is fed into the network. The primitive functions at the nodes
and their derivatives are evaluated at each node. The derivatives are stored.
Back propagation: The constant 1 is fed into the output unit and the network
is run backwards. Incoming information to a node is added and the result is multiplied by
the value stored in the left part of the unit. The result is transmitted to the left of the unit.
The result collected at the input unit is the derivative of the network function with respect
to x.
STEPS OF THE ALGORITHM
The back-propagation algorithm is used to compute the necessary corrections
after choosing the weights of the network randomly. The algorithm can be decomposed into
the following four steps (a small numerical sketch follows the list):
i) Feed-forward computation
ii) Back propagation to the output layer
iii) Back propagation to the hidden layer
iv) Weight updates
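A tiny NumPy sketch of the four steps for a one-hidden-layer network with sigmoid units and a squared-error loss; the layer sizes, data, and learning rate are invented for illustration and do not come from this report.

# Illustrative NumPy sketch of the four training steps (sizes and data are assumptions).
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((8, 3))                                # 8 samples, 3 inputs
y = (X.sum(axis=1, keepdims=True) > 1.5) * 1.0        # toy target

# Weights of the network chosen randomly before training begins.
W1, W2 = rng.random((3, 4)) - 0.5, rng.random((4, 1)) - 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(1000):
    # i) Feed-forward computation
    h = sigmoid(X @ W1)
    out = sigmoid(h @ W2)
    # ii) Back propagation to the output layer
    d_out = (out - y) * out * (1 - out)
    # iii) Back propagation to the hidden layer
    d_h = (d_out @ W2.T) * h * (1 - h)
    # iv) Weight updates (gradient descent with learning rate 0.5)
    W2 -= 0.5 * (h.T @ d_out)
    W1 -= 0.5 * (X.T @ d_h)

print("training error:", float(np.mean((out - y) ** 2)))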
Typical architecture of CNN
We have already discussed about convolution layers (denoted by CONV) and
pooling layers (denoted by POOL).
ReLU is just a non-linearity which is applied similarly to neural networks.
The FC is the fully connected layer of neurons at the end of CNN. Neurons in
a fully connected layer have full connections to all activations in the previous
layer, as seen in regular Neural Networks and work in a similar way.
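Putting CONV, POOL, ReLU, and FC together, a mask / no-mask classifier of the kind described in this report could be sketched in Keras as follows. The exact layer sizes, the 100x100 input, and the dropout rate are assumptions, since the report does not list the configuration here.

# Illustrative CONV -> POOL -> ReLU -> FC architecture for mask / no-mask classification.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(100, 100, 3)),  # CONV + ReLU
    MaxPooling2D((2, 2)),                                              # POOL
    Conv2D(64, (3, 3), activation="relu"),                             # CONV + ReLU
    MaxPooling2D((2, 2)),                                              # POOL
    Flatten(),
    Dense(128, activation="relu"),                                     # FC layer
    Dropout(0.5),
    Dense(2, activation="softmax"),                                    # with mask / without mask
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()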
CNNs are especially tricky to train, as they add even more hyper-parameters
than a standard MLP. While the usual rules of thumb for learning rates and
regularization constants still apply, the following should be kept in mind when
optimizing CNNs.
Number of filters
When choosing the number of filters per layer, keep in mind that computing
the activations of a single convolutional filter is much more expensive than
with traditional MLPs !
Assume layer (l-1) contains K^(l-1) feature maps and M × N pixel positions
(i.e., number of positions times number of feature maps), and there
are K^l filters at layer l of shape m × n. Then computing a feature map
(applying an m × n filter at all (M - m) × (N - n) pixel positions where the
filter can be applied) costs (M - m) × (N - n) × m × n × K^(l-1). The total
cost is K^l times that. Things may be more complicated if not all features at
one level are connected to all features at the previous one.
For a standard MLP, the cost would only be K^l × K^(l-1), where there
are K^l different neurons at level l. As such, the number of filters used in
CNNs is typically much smaller than the number of hidden units in MLPs and
depends on the size of the feature maps (itself a function of input image size
and filter shapes).
Since feature map size decreases with depth, layers near the input layer will
tend to have fewer filters while layers higher up can have much more. In fact,
to equalize computation at each layer, the product of the number of features
and the number of pixel positions is typically picked to be roughly constant
across layers. To preserve the information about the input would require
keeping the total number of activations (number of feature maps times number
of pixel positions) to be non-decreasing from one layer to the next (of course
we could hope to get away with less when we are doing supervised learning).
The number of feature maps directly controls capacity and so that depends on
the number of available examples and the complexity of the task.
Filter Shape
Common filter shapes found in the literature vary greatly, usually based on
the dataset. Best results on MNIST-sized images (28x28) are usually in the
5x5 range on the first layer, while natural image datasets (often with hundreds
of pixels in each dimension) tend to use larger first-layer filters of shape 12x12
or 15x15.
The trick is thus to find the right level of “granularity” (i.e. filter shapes) in
order to create abstractions at the proper scale, given a particular dataset.
Max Pooling Shape
Typical values are 2x2 or no max-pooling. Very large input images may
warrant 4x4 pooling in the lower-layers. Keep in mind however, that this will
reduce the dimension of the signal by a factor of 16, and may result in
throwing away too much information.
CHAPTER 6:
SOFTWARE AND HARDWARE REQUIREMENTS
Software:
● Android Studio
● Python IDLE 3.9
● Anaconda Navigator 3.3
Hardware:
● RAM: 8 GB
● HDD: 1 TB
CHAPTER 7:
UML DIAGRAMS
[UML diagram: Input video → Face detect → Feature extract → CNN classify → With mask / Without mask]
CLASS DIAGRAM:
SEQUENCE DIAGRAM:
COLLABORATION DIAGRAM:
CHAPTER 8
SOFTWARE DESCRIPTION:
Java (programming language)
History
The JAVA language was created by James Gosling in June 1991 for use in a set top box project.
The language was initially called Oak, after an oak tree that stood outside Gosling's office - and also went
by the name Green - and ended up later being renamed to Java, from a list of random words. Gosling's goals
were to implement a virtual machine and a language that had a familiar C/C++ style of notation. The first
public implementation was Java 1.0 in 1995. It promised "Write Once, Run Anywhere” (WORA),
providing no-cost runtimes on popular platforms. It was fairly secure and its security was configurable,
allowing network and file access to be restricted. Major web browsers soon incorporated the ability to run
secure Java applets within web pages. Java quickly became popular. With the advent of Java 2, new versions
had multiple configurations built for different types of platforms. For example, J2EE was for enterprise
applications and the greatly stripped down version J2ME was for mobile applications. J2SE was the
designation for the Standard Edition. In 2006, for marketing purposes, new J2 versions were renamed Java
EE, Java ME, and Java SE, respectively.
In 1997, Sun Microsystems approached the ISO/IEC JTC1 standards body and later the Ecma
International to formalize Java, but it soon withdrew from the process. Java remains a standard that is
controlled through the Java Community Process. At one time, Sun made most of its Java implementations
available without charge although they were proprietary software. Sun's revenue from Java was generated
by the selling of licenses for specialized products such as the Java Enterprise System. Sun distinguishes
between its Software Development Kit (SDK) and Runtime Environment (JRE), which is a subset of the
SDK, the primary distinction being that in the JRE, the compiler, utility programs, and many necessary
header files are not present.
On 13 November 2006, Sun released much of Java as free software under the terms of the GNU
General Public License (GPL). On 8 May 2007, Sun finished the process, making all of Java's core code open
source, aside from a small portion of code to which Sun did not hold the copyright.
Primary goals
There were five primary goals in the creation of the Java language:
• It should use the object-oriented programming methodology.
• It should allow the same program to be executed on multiple operating systems.
• It should contain built-in support for using computer networks.
• It should be designed to execute code from remote sources securely.
• It should be easy to use by selecting what were considered the good parts of other object-oriented languages.
The Java Programming Language:
The Java programming language is a high-level language that can be characterized by all of the
following buzzwords:
• Simple
• Architecture neutral
• Object oriented
• Portable
• Distributed
• High performance
Each of the preceding buzzwords is explained in The Java Language Environment, a white paper
written by James Gosling and Henry McGilton.
In the Java programming language, all source code is first written in plain text files ending with the
.java extension. Those source files are then compiled into .class files by the javac compiler.
A .class file does not contain code that is native to your processor; it instead contains bytecodes, the machine language of the Java Virtual Machine (Java VM). The java launcher tool then runs your application with an instance of the Java Virtual Machine.
An overview of the software development process.
Because the Java VM is available on many different operating systems, the same .class files are capable of running on Microsoft Windows, the Solaris Operating System (Solaris OS), Linux, or Mac OS. Some virtual machines, such as the Java HotSpot virtual machine, perform additional steps at runtime to give your application a performance boost. This includes tasks such as finding performance bottlenecks and recompiling (to native code) frequently used sections of code.
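As a minimal sketch of this compile-and-run cycle (a generic example, not code from this project), a single class is compiled to bytecode with javac and then executed by any Java VM with the java launcher:

// HelloWorld.java - compiled to HelloWorld.class with: javac HelloWorld.java
// The resulting bytecode runs on any platform with a Java VM:  java HelloWorld
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello from the Java VM");
    }
}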
Through the Java VM, the same application is capable of running on multiple
platforms.
The Java Platform
A platform is the hardware or software environment in which a program runs. We've already mentioned
some of the most popular platforms like Microsoft Windows, Linux, Solaris OS, and Mac OS. Most platforms can
be described as a combination of the operating system and underlying hardware. The Java platform differs from most
other platforms in that it's a software-only platform that runs on top of other hardware-based platforms.
The Java platform has two components:
• The Java Virtual Machine
• The Java Application Programming Interface (API)
You've already been introduced to the Java Virtual Machine; it's the base for the Java platform and is ported
onto various hardware-based platforms.
The API is a large collection of ready-made software components that provide many useful capabilities. It is grouped into libraries of related classes and interfaces; these libraries are known as packages. The next section, What Can Java Technology Do?, highlights some of the functionality provided by the API.
The API and Java Virtual Machine insulate the program from the underlying
hardware.
As a platform-independent environment, the Java platform can be a bit slower than native code. However,
advances in compiler and virtual machine technologies are bringing performance close to that of native code without
threatening portability.
Java Runtime Environment
The Java Runtime Environment, or JRE, is the software required to run any application deployed on the Java Platform. End-users commonly use a JRE in software packages and as a Web browser plug-in. Sun also distributes a superset of the JRE called the Java 2 SDK (more commonly known as the JDK), which includes development tools such as the Java compiler, Javadoc, Jar and debugger.
One of the unique advantages of the concept of a runtime engine is that errors (exceptions) should
not 'crash' the system. Moreover, in runtime engine environments such as Java there exist tools that attach
to the runtime engine and every time that an exception of interest occurs they record debugging information
that existed in memory at the time the exception was thrown (stack and heap values). These Automated
Exception Handling tools provide 'root-cause' information for exceptions in Java programs that run in
production, testing or development environments.
Uses OF JAVA
Blue is a smart card enabled with the secure, cross-platform, object-oriented Java Card API and technology. Blue contains an actual on-card processing chip, allowing for enhanceable and multiple functionality within a single card. Applets that comply with the Java Card API specification can run on any third-party vendor card that provides the necessary Java Card Application Environment (JCAE). Not only can multiple applet programs run on a single card, but new applets and functionality can be added after the card is issued to the customer.
• Java can be used in chemistry applications.
• Java is also used at NASA.
• Java is used in 2D and 3D applications.
• Java is used in graphics programming.
• Java is used in animations.
• Java is used in online and Web applications.
JSP :
JavaServer Pages (JSP) is a Java technology that allows software developers to dynamically
generate HTML, XML or other types of documents in response to a Web client request. The technology
allows Java code and certain pre-defined actions to be embedded into static content.
The JSP syntax adds additional XML-like tags, called JSP actions, to be used to invoke built-in
functionality. Additionally, the technology allows for the creation of JSP tag libraries that act as extensions
to the standard HTML or XML tags. Tag libraries provide a platform independent way of extending the
capabilities of a Web server.
JSPs are compiled into Java Servlets by a JSP compiler. A JSP compiler may generate a servlet in Java code that is then compiled by the Java compiler, or it may generate byte code for the servlet directly. JSPs can also be interpreted on the fly, reducing the time taken to reload changes.
JavaServer Pages (JSP) technology provides a simplified, fast way to create dynamic web content.
JSP technology enables rapid development of web-based applications that are server and platform-
independent.
Architecture OF JSP
The Advantages of JSP
Active Server Pages (ASP). ASP is a similar technology from Microsoft. The advantages of JSP are
twofold. First, the dynamic part is written in Java, not Visual Basic or other MS-specific language, so it is
more powerful and easier to use. Second, it is portable to other operating systems and non-Microsoft Web
servers. Pure Servlet. JSP doesn't give you anything that you couldn't in principle do with a Servlet. But it
is more convenient to write (and to modify!) regular HTML than to have a zillion println statements that
generate the HTML. Plus, by separating the look from the content you can put different people on different
tasks: your Web page design experts can build the HTML, leaving places for your Servlet programmers to
insert the dynamic content.
Server-Side Includes (SSI). SSI is a widely-supported technology for including externally-defined
pieces into a static Web page.
JSP is better because it lets you use Servlet instead of a separate program to generate that dynamic
part. Besides, SSI is really only intended for simple inclusions, not for "real" programs that use form data,
make database connections, and the like. JavaScript. JavaScript can generate HTML dynamically on the
client. This is a useful capability, but only handles situations where the dynamic information is based on
the client's environment.
With the exception of cookies, HTTP and form submission data is not available to JavaScript. And,
since it runs on the client, JavaScript can't access server-side resources like databases, catalogs, pricing
information, and the like. Static HTML. Regular HTML, of course, cannot contain dynamic information.
JSP is so easy and convenient that it is quite feasible to augment HTML pages that only benefit marginally
by the insertion of small amounts of dynamic data. Previously, the cost of using dynamic data would
preclude its use in all but the most valuable instances.
ARCHITECTURE OF JSP
• The browser sends a request to a JSP page.
• The JSP page communicates with a Java bean.
• The Java bean is connected to a database.
• The JSP page responds to the browser.
SERVLETS – FRONT END
The Java Servlet API allows a software developer to add dynamic content to a Web server using
the Java platform. The generated content is commonly HTML, but may be other data such as XML.
Servlets are the Java counterpart to non-Java dynamic Web content technologies such as PHP, CGI and ASP.NET. Servlets can maintain state across many server transactions by using HTTP cookies, session variables or URL rewriting.
The Servlet API, contained in the Java package hierarchy javax.servlet, defines the expected interactions of a Web container and a Servlet. A Web container is essentially the component of a Web server that interacts with the Servlet. The Web container is responsible for managing the lifecycle of Servlets, mapping a URL to a particular Servlet and ensuring that the URL requester has the correct access rights.
A Servlet is an object that receives a request and generates a response based on that request. The
basic Servlet package defines Java objects to represent Servlet requests and responses, as well as objects to
reflect the Servlet configuration parameters and execution environment. The package javax.servlet.http defines HTTP-specific subclasses of the generic Servlet elements, including session management objects that track multiple requests and responses between the Web server and a client. Servlets may be packaged in a WAR file as a Web application.
Servlets can be generated automatically by JavaServer Pages (JSP), or alternately by template engines such as WebMacro. Often Servlets are used in conjunction with JSPs in a pattern called "Model 2", which is a flavour of the model-view-controller pattern.
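A rough sketch of the Model 2 idea is shown below; the servlet name MaskResultServlet, the attribute "status" and the view result.jsp are hypothetical names used only for illustration. The servlet acts as the controller, prepares the dynamic data, and forwards the request to a JSP that renders the HTML:

import java.io.IOException;
import javax.servlet.RequestDispatcher;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Controller: collects the dynamic data, then hands off to a JSP view.
public class MaskResultServlet extends HttpServlet {
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        request.setAttribute("status", "With mask");          // data for the view
        RequestDispatcher view = request.getRequestDispatcher("/result.jsp");
        view.forward(request, response);                      // the JSP renders the HTML
    }
}

Keeping the Java code in the servlet and the HTML in the JSP gives exactly the separation of tasks described above: page designers work on the JSP while programmers work on the servlet.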
Servlets are Java technology's answer to CGI programming. They are programs that run on a Web server and build Web pages. Building Web pages on the fly is useful (and commonly done) for a number of reasons:
The Web page is based on data submitted by the user. For example the results pages from search
engines are generated this way, and programs that process orders for e-commerce sites do this as well. The
data changes frequently. For example, a weather-report or news headlines page might build the page
dynamically, perhaps returning a previously built page if it is still up to date. The Web page uses
information from corporate databases or other such sources. For example, you would use this for making a
Web page at an on-line store that lists current prices and number of items in stock.
The Servlet Run-time Environment
A Servlet is a Java class and therefore needs to be executed in a Java VM by a service we call a
Servlet engine. The Servlet engine loads the servlet class the first time the Servlet is requested, or optionally
already when the Servlet engine is started. The Servlet then stays loaded to handle multiple requests until
it is explicitly unloaded or the Servlet engine is shut down.
Some Web servers, such as Sun's Java Web Server (JWS), W3C's Jigsaw and Gefion Software's
Lite Web Server (LWS) are implemented in Java and have a built-in Servlet engine. Other Web servers,
such as Netscape's Enterprise Server, Microsoft's Internet Information Server (IIS) and the Apache Group's
Apache, require a Servlet engine add-on module. The add-on intercepts all requests for Servlets, executes them and returns the response through the Web server to the client. Examples of Servlet engine add-ons are Gefion Software's WAI Cool Runner, IBM's WebSphere, Live Software's JRun and New Atlanta's ServletExec.
All Servlet API classes and a simple Servlet-enabled Web server are combined into the Java Servlet Development Kit (JSDK), available for download at Sun's official Servlet site. To get started with Servlets, I recommend that you download the JSDK and play around with the sample Servlets.
Life Cycle OF Servlet
The Servlet lifecycle consists of the following steps:
• The Servlet class is loaded by the container during start-up.
• The container calls the init() method. This method initializes the Servlet and must be called before the Servlet can service any requests. In the entire life of a Servlet, the init() method is called only once. After initialization, the Servlet can service client requests.
• Each request is serviced in its own separate thread. The container calls the service() method of the Servlet for every request.
• The service() method determines the kind of request being made and dispatches it to an appropriate method to handle the request. The developer of the Servlet must provide an implementation for these methods. If a request for a method that is not implemented by the Servlet is made, the method of the parent class is called, typically resulting in an error being returned to the requester.
• Finally, the container calls the destroy() method, which takes the Servlet out of service. The destroy() method, like init(), is called only once in the lifecycle of a Servlet.
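To make the lifecycle concrete, here is a minimal generic sketch (not project code) of a servlet overriding init(), doGet() and destroy(); the container calls init() once at load time, service()/doGet() for every request, and destroy() once when the servlet is unloaded:

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class LifecycleServlet extends HttpServlet {
    @Override
    public void init() throws ServletException {
        // Called once, when the container loads the servlet.
        getServletContext().log("LifecycleServlet loaded");
    }

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // Called (via service()) for every GET request, each in its own thread.
        resp.setContentType("text/plain");
        resp.getWriter().println("Servlet is up and serving requests");
    }

    @Override
    public void destroy() {
        // Called once, when the servlet is taken out of service.
        getServletContext().log("LifecycleServlet unloaded");
    }
}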
Request and Response Objects
The doGet method has two interesting parameters: HttpServletRequest and HttpServletResponse.
These two objects give you full access to all information about the request and let you control the output
sent to the client as the response to the request. With CGI you read environment variables and stdin to get
information about the request, but the names of the environment variables may vary between
implementations and some are not provided by all Web servers.
The HttpServletRequest object provides the same information as the CGI environment variables,
plus more, in a standardized way.
It also provides methods for extracting HTTP parameters from the query string or the request body
depending on the type of request (GET or POST). As a Servlet developer you access parameters the same
way for both types of requests. Other methods give you access to all request headers and help you parse
date and cookie headers.
Instead of writing the response to stdout as you do with CGI, you get an OutputStream or a
PrintWriter from the HttpServletResponse. The OutputStream is intended for binary data, such as a GIF or
JPEG image, and the PrintWriter for text output. You can also set all response headers and the status code,
without having to rely on special Web server CGI configurations such as Non Parsed Headers (NPH). This
makes your Servlet easier to install.
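The sketch below (a generic illustration, not project code; the parameter name "name" is arbitrary) shows doGet() reading an HTTP parameter from the HttpServletRequest and writing a text response through the HttpServletResponse PrintWriter:

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class GreetingServlet extends HttpServlet {
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        // Parameters are read the same way for GET (query string) and POST (request body).
        String name = request.getParameter("name");

        response.setContentType("text/html");     // set a response header
        PrintWriter out = response.getWriter();   // text output; use getOutputStream() for binary data
        out.println("<html><body>");
        out.println("<p>Hello, " + (name != null ? name : "visitor") + "</p>");
        out.println("</body></html>");
    }
}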
ServletConfig and ServletContext
There is only one ServletContext in every application. This object can be used by all the Servlets to obtain application-level information or container details. Every Servlet, on the other hand, gets its own ServletConfig object. This object provides initialization parameters for a servlet. A developer can obtain
the reference to ServletContext using either the ServletConfig object or ServletRequest object.
All servlets belong to one servlet context. In implementations of the 1.0 and 2.0 versions of the Servlet API all servlets on one host belong to the same context, but with the 2.1 version of the API the context becomes more powerful and can be seen as the humble beginnings of an Application concept.
Future versions of the API will make this even more pronounced.
Many servlet engines implementing the Servlet 2.1 API let you group a set of servlets into one
context and support more than one context on the same host. The ServletContext in the 2.1 API is
responsible for the state of its servlets and knows about resources and attributes available to the servlets in
the context. Here we will only look at how ServletContext attributes can be used to share information
among a group of servlets.
There are three ServletContext methods dealing with context attributes: getAttribute, setAttribute
and removeAttribute. In addition the servlet engine may provide ways to configure a servlet context with
initial attribute values. This serves as a welcome addition to the servlet initialization arguments for
configuration information used by a group of servlets, for instance the database identifier we talked about
above, a style sheet URL for an application, the name of a mail server, etc.
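A minimal sketch of sharing data through context attributes follows; the attribute name "appDataSourceId" and its value are purely illustrative:

import javax.servlet.ServletContext;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;

public class ContextSetupServlet extends HttpServlet {
    @Override
    public void init() throws ServletException {
        // One ServletContext per web application, visible to every servlet in it.
        ServletContext context = getServletContext();
        context.setAttribute("appDataSourceId", "maskdb");   // share a value with the other servlets
    }
}

// Any other servlet in the same context can later read or remove the attribute:
//   String id = (String) getServletContext().getAttribute("appDataSourceId");
//   getServletContext().removeAttribute("appDataSourceId");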
JDBC
Java Database Connectivity (JDBC) is a programming framework for Java developers writing programs
that access information stored in databases, spreadsheets, and flat files.
JDBC is commonly used to connect a user program to a "behind the scenes" database, regardless of what database management software is used to control the database. In this way, JDBC is cross-platform.
This section provides an introduction and sample code that demonstrates database access from Java programs that use the classes of the JDBC API, which is available for free download from Sun's site.
A database that another program links to is called a data source. Many data sources, including
products produced by Microsoft and Oracle, already use a standard called Open Database Connectivity
(ODBC). Many legacy C and Perl programs use ODBC to connect to data sources. ODBC consolidated
much of the commonality between database management systems. JDBC builds on this feature, and
increases the level of abstraction. JDBC-ODBC bridges have been created to allow Java programs to connect to ODBC-enabled database software.
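As a small sample of JDBC access (a generic sketch; the JDBC URL, table and column names are placeholders rather than the project's actual schema), a program obtains a Connection from DriverManager, runs a query through a PreparedStatement and iterates over the ResultSet:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class JdbcQueryDemo {
    public static void main(String[] args) {
        String url = "jdbc:mysql://localhost:3306/maskdb";   // placeholder data source
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = con.prepareStatement(
                     "SELECT name, status FROM detections WHERE status = ?")) {
            ps.setString(1, "WITHOUT_MASK");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("name") + " -> " + rs.getString("status"));
                }
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}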
JDBC Architecture
Two-tier and Three-tier Processing Models
The JDBC API supports both two-tier and three-tier processing models for database access.
In the two-tier model, a Java applet or application talks directly to the data source. This requires a
JDBC driver that can communicate with the particular data source being accessed. A user's commands are
delivered to the database or other data source, and the results of those statements are sent back to the user.
The data source may be located on another machine to which the user is connected via a network. This is
referred to as a client/server configuration, with the user's machine as the client, and the machine housing
the data source as the server. The network can be an intranet, which, for example, connects employees
within a corporation, or it can be the Internet.
In the three-tier model, commands are sent to a "middle tier" of services, which then sends the
commands to the data source. The data source processes the commands and sends the results back to the
middle tier, which then sends them to the user.
MIS directors find the three-tier model very attractive because the middle tier makes it possible to
maintain control over access and the kinds of updates that can be made to corporate data. Another advantage
is that it simplifies the deployment of applications. Finally, in many cases, the three-tier architecture can
provide performance advantages.
Until recently, the middle tier has often been written in languages such as C or C++, which offer
fast performance. However, with the introduction of optimizing compilers that translate Java byte code into
efficient machine-specific code and technologies such as Enterprise JavaBeans™, the Java platform is fast
becoming the standard platform for middle-tier development. This is a big plus, making it possible to take
advantage of Java's robustness, multithreading, and security features.
With enterprises increasingly using the Java programming language for writing server code, the
JDBC API is being used more and more in the middle tier of a three-tier architecture. Some of the features
that make JDBC a server technology are its support for connection pooling, distributed transactions, and
disconnected rowsets. The JDBC API is also what allows access to a data source from a Java middle tier.
Testing
The various levels of testing are:
1. White Box Testing
2. Black Box Testing
3. Unit Testing
4. Functional Testing
5. Performance Testing
6. Integration Testing
7. Validation Testing
8. System Testing
9. Structure Testing
10. Output Testing
11. User Acceptance Testing
White Box Testing
White-box testing (also known as clear box testing, glass box testing, transparent box testing,
and structural testing) is a method of testing software that tests internal structures or workings of an
application, as opposed to its functionality (i.e. black-box testing). In white-box testing an internal
perspective of the system, as well as programming skills, are used to design test cases. The tester chooses
inputs to exercise paths through the code and determine the appropriate outputs. This is analogous to testing
nodes in a circuit, e.g. in-circuit testing (ICT).
While white-box testing can be applied at the unit, integration and system levels of the software
testing process, it is usually done at the unit level. It can test paths within a unit, paths between units during
integration, and between subsystems during a system–level test. Though this method of test design can
uncover many errors or problems, it might not detect unimplemented parts of the specification or missing
requirements.
White-box test design techniques include:
• Control flow testing
• Data flow testing
• Branch testing
• Path testing
• Statement coverage
• Decision coverage
White-box testing is a method of testing the application at the level of the source code. The test cases
are derived through the use of the design techniques mentioned above: control flow testing, data flow
testing, branch testing, path testing, statement coverage and decision coverage as well as modified
condition/decision coverage. White-box testing is the use of these techniques as guidelines to create an error-free environment by examining any fragile code.
These White-box testing techniques are the building blocks of white-box testing, whose essence is the
careful testing of the application at the source code level to prevent any hidden errors later on. These
different techniques exercise every visible path of the source code to minimize errors and create an error-
free environment. The whole point of white-box testing is the ability to know which line of the code is
being executed and being able to identify what the correct output should be.
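As a small illustration of statement and decision coverage (a hypothetical example, not taken from the project's test suite), the two JUnit 4 tests below exercise both outcomes of the single decision in a simple method, so every statement and both branches are covered:

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class MaskLabelTest {
    // Code under test: one decision with two possible outcomes.
    static String label(double maskProbability) {
        if (maskProbability >= 0.5) {
            return "With mask";
        } else {
            return "Without mask";
        }
    }

    @Test
    public void trueBranchCovered() {
        assertEquals("With mask", label(0.9));    // takes the if-branch
    }

    @Test
    public void falseBranchCovered() {
        assertEquals("Without mask", label(0.1)); // takes the else-branch
    }
}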
Levels
1. Unit testing. White-box testing is done during unit testing to ensure that the code is working as intended, before any integration happens with previously tested code. White-box testing during unit testing catches defects early, before the code is integrated with the rest of the application, and therefore prevents many errors later on.
2. Integration testing. White-box tests at this level are written to test the interactions of the interfaces with each other. Unit-level testing made sure that each unit was tested and working in an isolated environment; integration examines the correctness of the behaviour in an open environment through the use of white-box testing for any interactions of interfaces that are known to the programmer.
3. Regression testing. White-box testing during regression testing is the use of recycled white-box test
cases at the unit and integration testing levels.
White-box testing's basic procedures involve the understanding of the source code that you are testing
at a deep level to be able to test them. The programmer must have a deep understanding of the application
to know what kinds of test cases to create so that every visible path is exercised for testing. Once the source
code is understood then the source code can be analysed for test cases to be created. These are the three
basic steps that white-box testing takes in order to create test cases:
1. Input, involves different types of requirements, functional specifications, detailed designing of
documents, proper source code, security specifications. This is the preparation stage of white-box
testing to layout all of the basic information.
2. Processing Unit, involves performing risk analysis to guide the whole testing process, preparing a proper test plan, executing the test cases and communicating results. This is the phase of building test cases to make sure they thoroughly test the application, and the results are recorded accordingly.
3. Output, prepare final report that encompasses all of the above preparations and results.
Black Box Testing
Black-box testing is a method of software testing that examines the functionality of an application
(e.g. what the software does) without peering into its internal structures or workings (see white-box testing).
This method of test can be applied to virtually every level of software testing: unit, integration, system and acceptance. It typically comprises most if not all higher-level testing, but can also dominate unit testing as well.
Test procedures
Specific knowledge of the application's code/internal structure and programming knowledge in
general is not required. The tester is aware of what the software is supposed to do but is not aware of how it
does it. For instance, the tester is aware that a particular input returns a certain, invariable output but is not
aware of how the software produces the output in the first place.
Test cases
Test cases are built around specifications and requirements, i.e., what the application is supposed to
do. Test cases are generally derived from external descriptions of the software, including specifications,
requirements and design parameters. Although the tests used are primarily functional in nature, non-
functional tests may also be used. The test designer selects both valid and invalid inputs and determines the
correct output without any knowledge of the test object's internal structure.
Test design techniques
Typical black-box test design techniques include:
• Decision table testing
• All-pairs testing
• State transition tables
• Equivalence partitioning
• Boundary value analysis
Unit testing
In computer programming, unit testing is a method by which individual units of source code, sets
of one or more computer program modules together with associated control data, usage procedures, and
operating procedures are tested to determine if they are fit for use. Intuitively, one can view a unit as the
smallest testable part of an application. In procedural programming, a unit could be an entire module, but
is more commonly an individual function or procedure. In object-oriented programming, a unit is often an
entire interface, such as a class, but could be an individual method. Unit tests are created by programmers
or occasionally by white box testers during the development process.
Ideally, each test case is independent from the others. Substitutes such as method stubs, mock
objects, fakes, and test harnesses can be used to assist testing a module in isolation. Unit tests are typically
written and run by software developers to ensure that code meets its design and behaves as intended. Its
implementation can vary from being very manual (pencil and paper) to being formalized as part of build
automation.
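To illustrate testing a unit in isolation with a hand-written stub, consider the hypothetical sketch below (the MaskClassifier interface and AlertService class are invented for this example and are not the project's real classes):

import static org.junit.Assert.assertTrue;
import org.junit.Test;

public class AlertServiceTest {
    // Collaborator that the unit under test depends on.
    interface MaskClassifier {
        boolean wearsMask(byte[] faceImage);
    }

    // Unit under test: raises an alert when no mask is detected.
    static class AlertService {
        private final MaskClassifier classifier;
        AlertService(MaskClassifier classifier) { this.classifier = classifier; }
        boolean shouldAlert(byte[] faceImage) { return !classifier.wearsMask(faceImage); }
    }

    @Test
    public void alertsWhenClassifierReportsNoMask() {
        // Stub: replaces the real classifier so the unit is exercised in isolation.
        MaskClassifier alwaysNoMask = faceImage -> false;
        AlertService service = new AlertService(alwaysNoMask);
        assertTrue(service.shouldAlert(new byte[0]));
    }
}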
Testing will not catch every error in the program, since it cannot evaluate every execution path in
any but the most trivial programs. The same is true for unit testing.
Additionally, unit testing by definition only tests the functionality of the units themselves.
Therefore, it will not catch integration errors or broader system-level errors (such as functions performed
across multiple units, or non-functional test areas such as performance).
Unit testing should be done in conjunction with other software testing activities, as they can only
show the presence or absence of particular errors; they cannot prove a complete absence of errors. In order
to guarantee correct behaviour for every execution path and every possible input, and ensure the absence
of errors, other techniques are required, namely the application of formal methods to proving that a software
component has no unexpected behaviour.
Software testing is a combinatorial problem. For example, every Boolean decision statement requires at
least two tests: one with an outcome of "true" and one with an outcome of "false". As a result, for every
line of code written, programmers often need 3 to 5 lines of test code.
This obviously takes time and its investment may not be worth the effort. There are also many
problems that cannot easily be tested at all – for example those that are nondeterministic or involve
multiple threads. In addition, code for a unit test is likely to be at least as buggy as the code it is testing.
Fred Brooks in The Mythical Man-Month quotes: never take two chronometers to sea. Always take one or
three. Meaning, if two chronometers contradict, how do you know which one is correct?
Another challenge related to writing the unit tests is the difficulty of setting up realistic and useful
tests. It is necessary to create relevant initial conditions so the part of the application being tested behaves
like part of the complete system. If these initial conditions are not set correctly, the test will not be exercising
the code in a realistic context, which diminishes the value and accuracy of unit test results.
To obtain the intended benefits from unit testing, rigorous discipline is needed throughout the
software development process. It is essential to keep careful records not only of the tests that have been
performed, but also of all changes that have been made to the source code of this or any other unit in the
software. Use of a version control system is essential. If a later version of the unit fails a particular test that
it had previously passed, the version-control software can provide a list of the source code changes (if any)
that have been applied to the unit since that time.
It is also essential to implement a sustainable process for ensuring that test case failures are reviewed daily and addressed immediately. If such a process is not implemented and ingrained into the team's workflow, the application will evolve out of sync with the unit test suite, increasing false positives and reducing the effectiveness of the test suite.
Unit testing embedded system software presents a unique challenge: Since the software is being
developed on a different platform than the one it will eventually run on, you cannot readily run a test
program in the actual deployment environment, as is possible with desktop programs.
Functional testing
Functional testing is a quality assurance (QA) process and a type of black box testing that bases
its test cases on the specifications of the software component under test. Functions are tested by feeding
them input and examining the output, and internal program structure is rarely considered (unlike in white-box testing). Functional testing usually describes what the system does.
Functional testing differs from system testing in that functional testing "verifies a program by checking it
against ... design document(s) or specification(s)", while system testing "validate a program by checking it
against the published user or system requirements" (Kane, Falk, Nguyen 1999, p. 52).
Functional testing typically involves five steps:
1. The identification of functions that the software is expected to perform
2. The creation of input data based on the function's specifications
3. The determination of output based on the function's specifications
4. The execution of the test case
5. The comparison of actual and expected outputs
Performance testing
In software engineering, performance testing is in general testing performed to determine how
a system performs in terms of responsiveness and stability under a particular workload. It can also serve to
investigate, measure, validate or verify other quality attributes of the system, such
as scalability, reliability and resource usage.
Performance testing is a subset of performance engineering, an emerging computer science practice
which strives to build performance into the implementation, design and architecture of a system.
Testing types
Load testing
Load testing is the simplest form of performance testing. A load test is usually conducted to
understand the behaviour of the system under a specific expected load. This load can be the expected
concurrent number of users on the application performing a specific number of transactions within the set
duration. This test will give out the response times of all the important business critical transactions. If
the database, application server, etc. are also monitored, then this simple test can itself point
towards bottlenecks in the application software.
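As a very rough sketch of generating a specific concurrent load against an HTTP endpoint (the URL and user count are placeholders, and real load tests normally rely on dedicated tools), plain Java threads can issue requests and record response times:

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SimpleLoadTest {
    public static void main(String[] args) throws Exception {
        int concurrentUsers = 20;                        // expected concurrent load (placeholder)
        ExecutorService pool = Executors.newFixedThreadPool(concurrentUsers);
        for (int i = 0; i < concurrentUsers; i++) {
            pool.submit(() -> {
                try {
                    long start = System.currentTimeMillis();
                    HttpURLConnection con = (HttpURLConnection)
                            new URL("http://localhost:8080/detect").openConnection();
                    int code = con.getResponseCode();    // one business transaction
                    System.out.println("HTTP " + code + " in "
                            + (System.currentTimeMillis() - start) + " ms");
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}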
Stress testing
Stress testing is normally used to understand the upper limits of capacity within the system. This
kind of test is done to determine the system's robustness in terms of extreme load and helps application
administrators to determine if the system will perform sufficiently if the current load goes well above the
expected maximum.
Soak testing
Soak testing, also known as endurance testing, is usually done to determine if the system can sustain
the continuous expected load. During soak tests, memory utilization is monitored to detect potential leaks.
Also important, but often overlooked is performance degradation.
That is, to ensure that the throughput and/or response times after some long period of sustained
activity are as good as or better than at the beginning of the test. It essentially involves applying a significant
load to a system for an extended, significant period of time. The goal is to discover how the system behaves
under sustained use.
Spike testing
Spike testing is done by suddenly increasing the number of or load generated by, users by a very
large amount and observing the behaviour of the system. The goal is to determine whether performance
will suffer, the system will fail, or it will be able to handle dramatic changes in load.
Configuration testing
Rather than testing for performance from the perspective of load, tests are created to determine the
effects of configuration changes to the system's components on the system's performance and behaviour. A
common example would be experimenting with different methods of load-balancing.
Isolation testing
Isolation testing is not unique to performance testing but involves repeating a test execution that
resulted in a system problem. Often used to isolate and confirm the fault domain.
Integration testing
Integration testing (sometimes called integration and testing, abbreviated I&T) is the phase
in software testing in which individual software modules are combined and tested as a group. It occurs
after unit testing and before validation testing. Integration testing takes as its input modules that have
been unit tested, groups them in larger aggregates, applies tests defined in an integration test plan to those
aggregates, and delivers as its output the integrated system ready for system testing.
Purpose
The purpose of integration testing is to verify functional, performance, and
reliability requirements placed on major design items. These "design items", i.e. assemblages (or groups of
units), are exercised through their interfaces using black box testing, success and error cases being
simulated via appropriate parameter and data inputs. Simulated usage of shared data areas and inter-process
communication is tested and individual subsystems are exercised through their input interface.
Test cases are constructed to test whether all the components within assemblages interact correctly,
for example across procedure calls or process activations, and this is done after testing individual modules,
i.e. unit testing. The overall idea is a "building block" approach, in which verified assemblages are added
to a verified base which is then used to support the integration testing of further assemblages.
Some different types of integration testing are big bang, top-down, and bottom-up. Other
Integration Patterns are: Collaboration Integration, Backbone Integration, Layer Integration, Client/Server
Integration, Distributed Services Integration and High-frequency Integration.
Big Bang
In this approach, all or most of the developed modules are coupled together to form a complete
software system or major part of the system and then used for integration testing. The Big Bang method is
very effective for saving time in the integration testing process. However, if the test cases and their results
are not recorded properly, the entire integration process will be more complicated and may prevent the
testing team from achieving the goal of integration testing.
A type of Big Bang Integration testing is called Usage Model testing. Usage Model Testing can be
used in both software and hardware integration testing. The basis behind this type of integration testing is
to run user-like workloads in integrated user-like environments. In doing the testing in this manner, the
environment is proofed, while the individual components are proofed indirectly through their use.
Usage Model testing takes an optimistic approach to testing, because it expects to have few
problems with the individual components. The strategy relies heavily on the component developers to do
the isolated unit testing for their product. The goal of the strategy is to avoid redoing the testing done by
the developers, and instead flesh-out problems caused by the interaction of the components in the
environment.
For integration testing, Usage Model testing can be more efficient and provides better test coverage
than traditional focused functional integration testing. To be more efficient and accurate, care must be used
in defining the user-like workloads for creating realistic scenarios in exercising the environment. This gives
confidence that the integrated environment will work as expected for the target customers.
Top-down and Bottom-up
Bottom Up Testing is an approach to integrated testing where the lowest level components are
tested first, then used to facilitate the testing of higher level components. The process is repeated until the
component at the top of the hierarchy is tested.
All the bottom or low-level modules, procedures or functions are integrated and then tested. After
the integration testing of lower level integrated modules, the next level of modules will be formed and can
be used for integration testing. This approach is helpful only when all or most of the modules of the same
development level are ready. This method also helps to determine the levels of software developed and
makes it easier to report testing progress in the form of a percentage.
Top Down Testing is an approach to integrated testing where the top integrated modules are tested
and the branch of the module is tested step by step until the end of the related module.
Sandwich Testing is an approach to combine top down testing with bottom up testing.
The main advantage of the Bottom-Up approach is that bugs are more easily found. With Top-Down, it is easier to find a missing branch link.
Verification and validation
Verification and Validation are independent procedures that are used together for checking that a product, service, or system meets requirements and specifications and that it fulfils its intended purpose. These are critical components of a quality management system such as ISO 9000. The words
"verification" and "validation" are sometimes preceded with "Independent" (or IV&V), indicating that the
verification and validation is to be performed by a disinterested third party.
It is sometimes said that validation can be expressed by the query "Are you building the right thing?" and verification by "Are you building it right?" In practice, the usage of these terms varies. Sometimes they are even used interchangeably.
The PMBOK guide, an IEEE standard, defines them as follows in its 4th edition:
• "Validation. The assurance that a product, service, or system meets the needs of the customer and other
identified stakeholders. It often involves acceptance and suitability with external customers. Contrast
with verification."
• "Verification. The evaluation of whether or not a product, service, or system complies with a
regulation, requirement, specification, or imposed condition. It is often an internal process. Contrast
with validation."
• Verification is intended to check that a product, service, or system (or portion thereof, or set thereof)
meets a set of initial design specifications. In the development phase, verification procedures involve
performing special tests to model or simulate a portion, or the entirety, of a product, service or system,
then performing a review or analysis of the modelling results. In the post-development phase, verification
procedures involve regularly repeating tests devised specifically to ensure that the product, service, or
system continues to meet the initial design requirements, specifications, and regulations as time progresses.
It is a process that is used to evaluate whether a product, service, or system complies with
regulations, specifications, or conditions imposed at the start of a development phase. Verification can be
in development, scale-up, or production. This is often an internal process.
• Validation is intended to check that development and verification procedures for a product, service,
or system (or portion thereof, or set thereof) result in a product, service, or system (or portion
thereof, or set thereof) that meets initial requirements. For a new development flow or verification
flow, validation procedures may involve modelling either flow and using simulations to predict
faults or gaps that might lead to invalid or incomplete verification or development of a product,
service, or system (or portion thereof, or set thereof). A set of validation requirements,
specifications, and regulations may then be used as a basis for qualifying a development flow or
verification flow for a product, service, or system (or portion thereof, or set thereof). Additional
validation procedures also include those that are designed specifically to ensure that modifications
made to an existing qualified development flow or verification flow will have the effect of
producing a product, service, or system (or portion thereof, or set thereof) that meets the initial
design requirements, specifications, and regulations; these validations help to keep the flow
qualified. It is a process of establishing evidence that provides a high degree of assurance that a
product, service, or system accomplishes its intended requirements. This often involves acceptance
of fitness for purpose with end users and other product stakeholders. This is often an external
process.
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 

Recently uploaded (20)

Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 

Android Based Facemask Detection system report.pdf

• 7. CONTENTS
Abstract
Chapter 1: Overview, Introduction
Chapter 2: Literature Survey
Chapter 3: System Analysis
Chapter 4: Block Diagram
Chapter 5: Modules
Chapter 6: Software Requirements, Hardware Requirements
Chapter 7: UML Diagram (Class Diagram, Sequence Diagram)
Chapter 8: Software Description, Testing, Coding, Output
Reference
• 8. Android Based Face Mask Detection System

ABSTRACT:
In order to effectively prevent the spread of the COVID-19 virus, almost everyone wears a mask during the coronavirus pandemic. This makes conventional facial recognition technology ineffective in many cases, such as community access control, face-based access control, facial attendance, and facial security checks at train stations. It is therefore very urgent to improve the recognition performance of existing face recognition technology on masked faces. Most current advanced face recognition approaches are designed around deep learning and depend on a large number of face samples; however, at present there are no publicly available masked face recognition datasets. To this end, this work proposes three types of masked face datasets: the Masked Face Detection Dataset (MFDD), the Real-world Masked Face Recognition Dataset (RMFRD) and the Simulated Masked Face Recognition Dataset (SMFRD). Among them, to the best of our knowledge, RMFRD is currently the world's largest real-world masked face dataset. These datasets are freely available to industry and academia, and various applications on masked faces can be developed on top of them.

CHAPTER 1: OVERVIEW
Face mask detection is a simple model to detect whether a person is wearing a face mask. Due to COVID-19 there is a need for face mask detection in many places, such as malls and theatres, for safety. A system that detects whether a person is wearing a face mask and allows entry accordingly would be of great help to society. The face mask detection model is built using the deep learning technique called Convolutional Neural Networks (CNN). The CNN model is built using the Keras and TensorFlow frameworks and the OpenCV library, which is widely used for real-time applications.
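As a rough illustration of this setup, the sketch below builds a small Keras classifier of the kind described above. The input size, layer sizes and use of a pretrained MobileNetV2 backbone are assumptions made purely for illustration, not the exact architecture used in this project.

# Minimal sketch of a two-class (mask / no-mask) classifier in Keras.
# Architecture details are illustrative assumptions, not the project's code.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_mask_detector(input_shape=(224, 224, 3)):
    # Pretrained MobileNetV2 backbone used as a frozen feature extractor.
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights="imagenet")
    base.trainable = False

    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),  # probability that a mask is worn
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_mask_detector()
model.summary()

Such a model would be trained on labelled "mask" / "no mask" face crops and then applied to faces detected in a live camera stream.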
• 9. INTRODUCTION:
Face recognition is a promising area of applied computer vision. It is used to recognize a face or identify a person automatically from given images. In daily-life activities such as passport checking, smart doors, access control, voter verification and criminal investigation, face recognition is widely used to authenticate a person correctly and automatically. Face recognition has gained much attention as a unique and reliable biometric recognition technology, which makes it more popular than other biometric techniques such as passwords, PINs and fingerprints. Many governments across the world are also interested in face recognition systems to secure public places such as parks, airports, bus stations and railway stations. Face recognition is one of the well-studied real-life problems, and excellent progress has been made in face recognition technology.

CHAPTER 2: LITERATURE SURVEY
S.No | Title | Author | Year | Drawback
1 | Study of mask face detection | Gayathri Deore | 2016 | The eye line cannot be detected
• 10.
2 | Cascade mask face detection | Chengbin Peng | 2017 | It is difficult to detect the masked face
3 | Design of a deep face detector using R-CNN | Caner Ozer | 2019 | Low accuracy rate
4 | Face mask detection using MobileNet | Shree Prakash | 2020 | It has high accuracy

CHAPTER 3: SYSTEM ANALYSIS

EXISTING SYSTEM:
● Support vector machine
● Discrete wavelet transform

DRAWBACKS:
● Existing face recognition solutions are no longer reliable when a mask is worn.
● Time-consuming process
● Poor detection
• 11. PROPOSED SYSTEM:
● Convolutional neural network
● Caffe dataset/model

ADVANTAGES:
● High security
● Easily detects faces even when a mask is worn

CHAPTER 4: BLOCK DIAGRAM:
• 12. CHAPTER 5: MODULES
● Pre-processing
● Discrete wavelet transform
● Cropping face image
● Caffe MobileNet models
• 13. ● NN classifier

MODULES EXPLANATION:

PRE-PROCESSING:
Image pre-processing is a common name for operations on images at the lowest level of abstraction; its input and output are intensity images. The aim of pre-processing is to improve the image data by suppressing unwanted distortions or enhancing image features that are important for further processing. Image restoration is the operation of taking a corrupted/noisy image and estimating the clean original image. Corruption may come in many forms, such as motion blur, noise and camera misfocus. Image restoration differs from image enhancement in that the latter is designed to emphasize features of the image that make it more pleasing to the observer, but not necessarily to produce realistic data from a scientific point of view. Image enhancement techniques (like contrast stretching or de-blurring by a nearest-neighbour procedure) provided by imaging packages use no a priori model of the process that created the image. With image enhancement, noise can effectively be removed by sacrificing some resolution, but this is not acceptable in many applications. In a fluorescence microscope, for example, resolution in the z-direction is poor as it is, and more advanced image processing techniques must be applied to recover the object. De-convolution is an example of an image restoration method; it is capable of increasing resolution (especially in the axial direction), removing noise and increasing contrast.
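The report does not list the exact pre-processing operations, so the following is only a minimal sketch of a typical OpenCV pre-processing step (grayscale conversion, denoising, contrast enhancement, resizing and scaling); the parameter values are illustrative assumptions.

import cv2
import numpy as np

def preprocess(image_path, size=(224, 224)):
    # Read the image and convert BGR -> grayscale.
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Suppress noise (a simple restoration step) before further processing.
    denoised = cv2.fastNlMeansDenoising(gray, h=10)
    # Improve contrast with histogram equalization.
    equalized = cv2.equalizeHist(denoised)
    # Resize to the network input size and scale pixel values to [0, 1].
    resized = cv2.resize(equalized, size)
    return resized.astype(np.float32) / 255.0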
• 14. DISCRETE WAVELET TRANSFORM:
The CWT and the discrete wavelet transforms differ in how they discretize the scale parameter. The CWT typically uses exponential scales with a base smaller than 2, for example 2^(1/12). The discrete wavelet transform always uses exponential scales with base 2, so its scales are powers of 2. Keep in mind that the physical interpretation of scales for both the CWT and the discrete wavelet transforms requires including the signal's sampling interval if it is not equal to one. For example, assume you are using the CWT and you set your base to s0 = 2^(1/12). To attach physical significance to that scale, you must multiply by the sampling interval Δt, so a scale vector covering approximately four octaves with the sampling interval taken into account is s0^j · Δt, j = 1, 2, ..., 48. Note that the sampling interval multiplies the scales; it is not in the exponent. For discrete wavelet transforms the base scale is always 2.

The decimated and nondecimated discrete wavelet transforms differ in how they discretize the translation parameter. The decimated discrete wavelet transform (DWT) always translates by an integer multiple of the scale, 2^j · m, whereas the nondecimated discrete wavelet transform translates by integer shifts. These differences in how scale and translation are discretized result in advantages and disadvantages for the two classes of wavelet transforms, and they determine use cases where one transform is likely to provide superior results. Some important consequences of this discretization are the following.

The DWT provides a sparse representation for many natural signals. In other words, the important features of many natural signals are captured by a subset of DWT coefficients that is typically much smaller than the original signal. This "compresses" the signal. With the DWT, you always end up with the same number of coefficients as the original signal, but many of the coefficients may be close to zero in value; as a result, you can often throw those coefficients away and still maintain a high-quality signal approximation. With the CWT, you go from N samples for an N-length signal to an M-by-N matrix of coefficients, where M is the number of scales. The CWT is a highly redundant transform: there is significant overlap between wavelets at each scale and between scales, and the computational resources required to compute the CWT and store the coefficients are much larger than for the DWT. The nondecimated discrete wavelet transform is also redundant, but the redundancy factor is usually significantly less than for the CWT, because the scale parameter is not discretized so finely. For the nondecimated discrete wavelet transform, you go from N samples to an (L+1)-by-N matrix of coefficients, where L is the level of the transform.

The strict discretization of scale and translation in the DWT ensures that the DWT is an orthonormal transform (when an orthogonal wavelet is used). There are many benefits of orthonormal transforms in signal analysis. Many signal models consist of some deterministic signal plus white Gaussian noise. An orthonormal transform takes this kind of signal and outputs the transform applied to the signal plus white noise; in other words, it takes in white Gaussian noise and outputs white Gaussian noise, so the noise is uncorrelated at the input and the output. This is important in many statistical signal processing settings. In the case of the DWT, the signal of interest is typically captured by a few large-magnitude DWT coefficients, while the noise results in many small DWT coefficients that you can throw away. If you have studied linear algebra, you have no doubt learned many advantages of using orthonormal bases in the analysis and representation of vectors. The wavelets in the DWT are like orthonormal vectors. Neither the CWT nor the nondecimated discrete wavelet transform is an orthonormal transform; the wavelets in those transforms are technically called frames, and they are linearly dependent sets.

The DWT is not shift-invariant. Because the DWT downsamples, a shift in the input signal does not manifest itself as a simple equivalent shift in the DWT coefficients at all levels; a simple shift in a signal can cause a significant realignment of signal energy in the DWT coefficients by scale. The CWT and the nondecimated discrete wavelet transform are shift-invariant. There are some modifications of the DWT, such as the dual-tree complex discrete wavelet transform, that mitigate the lack of shift invariance in the DWT (see Critically Sampled and Oversampled Wavelet Filter Banks for conceptual material on this topic and Dual-Tree Wavelet Transforms for an example).

The discrete wavelet transforms are equivalent to discrete filter banks. Specifically, they are tree-structured discrete filter banks where the signal is first filtered by a lowpass and a highpass filter to yield lowpass and highpass subbands; subsequently, the lowpass subband is iteratively filtered by the same scheme to yield narrower octave-band lowpass and highpass subbands. In the DWT the filter outputs are downsampled at each successive stage; in the nondecimated discrete wavelet transform they are not. The filters that define the discrete wavelet transforms typically have only a small number of coefficients, so the transform can be implemented very efficiently. For both the DWT and the nondecimated discrete wavelet transform, you do not actually require an expression for the wavelet; the filters are sufficient. This is not the case with the CWT, whose most common implementation requires the wavelet to be explicitly defined. Even though the nondecimated discrete wavelet transform does not downsample the signal, the filter bank implementation still allows for good computational performance.

The discrete wavelet transforms provide perfect reconstruction of the signal upon inversion. This means that you can take the discrete wavelet transform of a signal and then use the coefficients to synthesize an exact reproduction of the signal to within numerical precision. You can implement an inverse CWT, but the reconstruction is often not perfect, and reconstructing a signal from CWT coefficients is a much less stable numerical operation. On the other hand, the finer sampling of scales in the CWT typically results in a higher-fidelity signal analysis: you can localize transients in your signal, or characterize oscillatory behaviour, better with the CWT than with the discrete wavelet transforms.
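As a small illustration of the decimated DWT's sparsity and perfect reconstruction discussed above, the sketch below uses the PyWavelets (pywt) package, which is not part of this project and is assumed here purely for demonstration.

import numpy as np
import pywt

# Toy signal: a smooth oscillation plus one sharp transient.
t = np.linspace(0, 1, 1024)
signal = np.sin(2 * np.pi * 5 * t)
signal[512] += 2.0

# Decimated DWT: a critically sampled, sparse representation.
coeffs = pywt.wavedec(signal, 'db4', level=4)

# Perfect reconstruction from the full coefficient set.
exact = pywt.waverec(coeffs, 'db4')[:len(signal)]

# Discard small coefficients to illustrate sparsity, then reconstruct.
compressed = [pywt.threshold(c, 0.1, mode='hard') for c in coeffs]
approx = pywt.waverec(compressed, 'db4')[:len(signal)]

print("exact reconstruction error:", np.max(np.abs(signal - exact)))
print("error after thresholding:  ", np.max(np.abs(signal - approx)))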
• 18. CROPPING FACE IMAGE:
This module crops the face region of the image, roughly from the nose to the jaw, and checks whether the user is wearing a mask or not.

Caffe MobileNet models:
This is a type of dataset model, mainly used to store the detection data in the database.
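A minimal sketch of the face-cropping step described above is shown below. It uses OpenCV's bundled Haar cascade detector as an illustrative stand-in for the project's Caffe/MobileNet detector, and cropping the lower half of the detected face box is an assumption made only for demonstration.

import cv2

def crop_faces(image_path):
    # Stand-in face detector: OpenCV's bundled Haar cascade.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    crops = []
    for (x, y, w, h) in faces:
        # Keep the lower half of the detected box (nose-to-jaw region)
        # so a mask classifier can inspect it.
        crops.append(img[y + h // 2 : y + h, x : x + w])
    return crops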
• 19. NEURAL NETWORK:
Neural networks are predictive models loosely based on the action of biological neurons. The selection of the name "neural network" was one of the great PR successes of the twentieth century; it certainly sounds more exciting than a technical description such as "a network of weighted, additive values with nonlinear transfer functions". However, despite the name, neural networks are far from "thinking machines" or "artificial brains". A typical artificial neural network might have a hundred neurons; in comparison, the human nervous system is believed to have about 3×10^10 neurons. We are still light years from "Data".

The original "perceptron" model was developed by Frank Rosenblatt in 1958. Rosenblatt's model consisted of three layers: (1) a "retina" that distributed inputs to the second layer, (2) "association units" that combine the inputs with weights and trigger a threshold step function which feeds the output layer, and (3) the output layer, which combines the values. Unfortunately, the use of a step function in the neurons made the perceptrons difficult or impossible to train. A critical analysis of perceptrons published in 1969 by Marvin Minsky and Seymour Papert pointed out a number of critical weaknesses, and, for a period of time, interest in perceptrons waned.

Interest in neural networks was revived in 1986 when David Rumelhart, Geoffrey Hinton and Ronald Williams published "Learning Internal Representations by Error Propagation". They proposed a multilayer neural network with nonlinear but differentiable transfer functions that avoided the pitfalls of the original perceptron's step functions, and they also provided a reasonably effective training algorithm for neural networks.

Types of neural networks:
1) Artificial Neural Network
2) Probabilistic Neural Networks
3) General Regression Neural Networks

DTREG implements the most widely used types of neural networks:
• 20. a) Multilayer Perceptron Networks (also known as multilayer feed-forward networks),
b) Cascade Correlation Neural Networks,
c) Probabilistic Neural Networks (NN),
d) General Regression Neural Networks (GRNN).

Radial Basis Function Networks:
a) Functional Link Networks,
b) Kohonen Networks,
c) Gram-Charlier Networks,
d) Hebb Networks,
e) Adaline Networks,
f) Hybrid Networks.

The Multilayer Perceptron Neural Network Model:
The following diagram illustrates a perceptron network with three layers:
• 21. This network has an input layer (on the left) with three neurons, one hidden layer (in the middle) with three neurons, and an output layer (on the right) with three neurons. There is one neuron in the input layer for each predictor variable; in the case of categorical variables, N−1 neurons are used to represent the N categories of the variable.

Input layer — A vector of predictor variable values (x1, ..., xp) is presented to the input layer. The input layer (or processing before the input layer) standardizes these values so that the range of each variable is −1 to 1. The input layer distributes the values to each of the neurons in the hidden layer. In addition to the predictor variables, there is a constant input of 1.0, called the bias, that is fed to each of the hidden-layer neurons; the bias is multiplied by a weight and added to the sum going into the neuron.

Hidden layer — Arriving at a neuron in the hidden layer, the value from each input neuron is multiplied by a weight (wji), and the resulting weighted values are added together, producing a combined value uj. The weighted sum uj is fed into a transfer function, σ, which outputs a value hj. The outputs from the hidden layer are distributed to the output layer.
• 22. Output layer — Arriving at a neuron in the output layer, the value from each hidden-layer neuron is multiplied by a weight (wkj), and the resulting weighted values are added together, producing a combined value vk. The weighted sum vk is fed into a transfer function, σ, which outputs a value yk. The y values are the outputs of the network. If a regression analysis is being performed with a continuous target variable, there is a single neuron in the output layer and it generates a single y value. For classification problems with categorical target variables, there are N neurons in the output layer producing N values, one for each of the N categories of the target variable.

Neural Networks (NN) and General Regression Neural Networks (GRNN):
NN and GRNN networks have similar architectures, but there is a fundamental difference: NN networks perform classification, where the target variable is categorical, whereas general regression neural networks perform regression, where the target variable is continuous. If you select an NN/GRNN network, DTREG will automatically select the correct type of network based on the type of target variable.
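To tie the three layers together, here is a minimal NumPy sketch of the multilayer perceptron forward pass described above; the tanh transfer function and the random weights are assumptions made only for illustration.

import numpy as np

def mlp_forward(x, W_hidden, b_hidden, W_out, b_out):
    # Hidden layer: weighted sum u_j plus bias, passed through the transfer function sigma.
    u = W_hidden @ x + b_hidden
    h = np.tanh(u)                    # sigma (tanh assumed for illustration)
    # Output layer: weighted sum v_k of hidden outputs plus bias.
    v = W_out @ h + b_out
    return np.tanh(v)                 # network outputs y_k

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=3)        # three standardized predictor values
W_hidden, b_hidden = rng.normal(size=(3, 3)), rng.normal(size=3)
W_out, b_out = rng.normal(size=(3, 3)), rng.normal(size=3)
print(mlp_forward(x, W_hidden, b_hidden, W_out, b_out))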
• 23. Architecture of a NN:
All NN networks have four layers:
1. Input layer — There is one neuron in the input layer for each predictor variable. In the case of categorical variables, N−1 neurons are used, where N is the number of categories. The input neurons (or processing before the input layer) standardize the range of the values by subtracting the median and dividing by the interquartile range. The input neurons then feed the values to each of the neurons in the hidden layer.
2. Hidden layer — This layer has one neuron for each case in the training data set. The neuron stores the values of the predictor variables for the case along with the target value. When presented with the x vector of input values from the input layer, a hidden neuron computes the Euclidean distance of the test case from the neuron's center point and then applies the RBF kernel function using the sigma value(s). The resulting value is passed to the neurons in the pattern layer.
• 24. 3. Pattern layer / Summation layer — The next layer in the network is different for NN networks and for GRNN networks. For NN networks there is one pattern neuron for each category of the target variable. The actual target category of each training case is stored with each hidden neuron; the weighted value coming out of a hidden neuron is fed only to the pattern neuron that corresponds to the hidden neuron's category. The pattern neurons add the values for the class they represent (hence, it is a weighted vote for that category). For GRNN networks, there are only two neurons in the pattern layer: one is the denominator summation unit and the other is the numerator summation unit. The denominator summation unit adds up the weight values coming from each of the hidden neurons. The numerator summation unit adds up the weight values multiplied by the actual target value for each hidden neuron.
• 25. 4. Decision layer — The decision layer is different for NN and GRNN networks. For NN networks, the decision layer compares the weighted votes for each target category accumulated in the pattern layer and uses the largest vote to predict the target category. For GRNN networks, the decision layer divides the value accumulated in the numerator summation unit by the value in the denominator summation unit and uses the result as the predicted target value.

The following diagram shows the actual network proposed and used in our project.

1) Input layer: The input vector, denoted as p, is presented as the black vertical bar. Its dimension is R × 1; in this work, R = 3.

2) Radial basis layer: In the radial basis layer, the vector distances between the input vector p and the weight vector formed by each row of the weight matrix W are calculated. Here, the vector distance is defined as the dot product between two vectors [8].
• 26. Assume the dimension of W is Q×R. The dot product between p and the i-th row of W produces the i-th element of the distance vector ‖W − p‖, whose dimension is Q×1. The minus symbol, "−", indicates that it is the distance between vectors. Then the bias vector b is combined with ‖W − p‖ by an element-by-element multiplication; the result is denoted as n = ‖W − p‖ · b. The transfer function in the NN has a built-in distance criterion with respect to a center; in this work it is defined as

radbas(n) = exp(−n²)    (1)

Each element of n is substituted into Eq. (1) and produces the corresponding element of a, the output vector of the radial basis layer. The i-th element of a can be represented as

ai = radbas(‖Wi − p‖ · bi)    (2)

where Wi is the vector formed by the i-th row of W and bi is the i-th element of the bias vector b. Some characteristics of the radial basis layer: the i-th element of a equals 1 if the input p is identical to the i-th row of the input weight matrix W. A radial basis neuron with a weight vector close to the input vector p produces a value near 1, and its output weights in the competitive layer will then pass their values to the competitive function. It is also possible that several elements of a are close to 1, since the input pattern may be close to several training patterns.

3) Competitive layer: There is no bias in the competitive layer. In the competitive layer, the vector a is first multiplied by the layer weight matrix M, producing an output vector d. The competitive function, denoted as C in Fig. 2, produces a 1 corresponding to the largest element of d, and 0s elsewhere. The output vector of the competitive function is denoted as c; the index of the 1 in c is the class that the system assigns to the input. The dimension of the output vector, K, is 5 in this paper.
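The following NumPy sketch illustrates Eqs. (1) and (2) and the competitive layer just described; the stored patterns, bias values and layer weight matrix are toy values chosen only for illustration.

import numpy as np

def radbas(n):
    # Eq. (1): radbas(n) = exp(-n^2)
    return np.exp(-n ** 2)

def pnn_forward(p, W, b, M):
    # Radial basis layer: distance of p to every stored pattern (row of W),
    # scaled element-by-element by the bias vector b, as in Eq. (2).
    dist = np.linalg.norm(W - p, axis=1)
    a = radbas(dist * b)
    # Competitive layer: weighted class votes, then a one-hot winner.
    d = M @ a
    c = np.zeros_like(d)
    c[np.argmax(d)] = 1.0
    return c

# Tiny illustration: 4 stored training patterns (R = 3 features), 2 classes.
W = np.array([[0.1, 0.2, 0.3],
              [0.9, 0.8, 0.7],
              [0.2, 0.1, 0.4],
              [0.8, 0.9, 0.6]])
b = np.ones(4)                       # spread-related bias values
M = np.array([[1, 0, 1, 0],          # rows = classes, columns = hidden neurons
              [0, 1, 0, 1]], dtype=float)
print(pnn_forward(np.array([0.15, 0.15, 0.35]), W, b, M))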
• 27. How NN networks work:
Although the implementation is very different, NN networks are conceptually similar to k-nearest-neighbor (k-NN) models. The basic idea is that the predicted target value of an item is likely to be about the same as for other items that have close values of the predictor variables. Consider this figure: assume that each case in the training set has two predictor variables, x and y, and that the cases are plotted using their (x, y) coordinates. Also assume that the target variable has two categories: positive, denoted by a square, and negative, denoted by a dash. Now suppose we are trying to predict the value of a new case, represented by the triangle, with predictor values x = 6, y = 5.1. Should we predict the target as positive or negative?
• 28. Notice that the triangle is positioned almost exactly on top of a dash representing a negative value, but that dash is in a fairly unusual position compared to the other dashes, which are clustered below the squares and left of center. So it could be that the underlying negative value is an odd case.

The nearest-neighbor classification performed for this example depends on how many neighboring points are considered. If 1-NN is used and only the closest point is considered, then clearly the new point should be classified as negative, since it is on top of a known negative point. On the other hand, if 9-NN classification is used and the closest 9 points are considered, then the effect of the surrounding 8 positive points may outweigh the close negative point.

A neural network builds on this foundation and generalizes it to consider all of the other points. The distance is computed from the point being evaluated to each of the other points, and a radial basis function (RBF), also called a kernel function, is applied to the distance to compute the weight (influence) of each point. The radial basis function is so named because the radius distance is the argument to the function:

weight = RBF(distance)

The further some other point is from the new point, the less influence it has.
• 30. Radial basis function:
Different types of radial basis functions could be used, but the most common is the Gaussian function.

Advantages and disadvantages of NN networks:
1. It is usually much faster to train an NN/GRNN network than a multilayer perceptron network.
2. NN/GRNN networks are often more accurate than multilayer perceptron networks.
3. NN/GRNN networks are relatively insensitive to outliers (wild points).
4. NN networks generate accurate predicted target probability scores.
5. NN networks approach Bayes-optimal classification.
6. NN/GRNN networks are slower than multilayer perceptron networks at classifying new cases.
7. NN/GRNN networks require more memory space to store the model.
• 31. Removing unnecessary neurons:
One of the disadvantages of NN models compared to multilayer perceptron networks is that NN models are large, because there is one neuron for each training row. This causes the model to run slower than multilayer perceptron networks when scoring is used to predict values for new rows. DTREG provides an option to remove unnecessary neurons from the model after the model has been constructed. Removing unnecessary neurons has three benefits:
1. The size of the stored model is reduced.
2. The time required to apply the model during scoring is reduced.
3. Removing neurons often improves the accuracy of the model.

The process of removing unnecessary neurons is iterative. Leave-one-out validation is used to measure the error of the model with each neuron removed. The neuron that causes the least increase in error (or possibly the largest reduction in error) is then removed from the model, and the process is repeated with the remaining neurons until the stopping criterion is reached. When unnecessary neurons are removed, the "Model Size" section of the analysis report shows how the error changes with different numbers of neurons.
• 32. You can see a graphical chart of this by clicking Chart/Model size. There are three criteria that can be selected to guide the removal of neurons:
1. Minimize error — If this option is selected, DTREG removes neurons as long as the leave-one-out error remains constant or decreases. It stops when it finds a neuron whose removal would cause the error to increase above the minimum found.
2. Minimize neurons — If this option is selected, DTREG removes neurons until the leave-one-out error would exceed the error for the model with all neurons.
3. Number of neurons — If this option is selected, DTREG removes the least significant neurons until only the specified number of neurons remains.

Document classification:
Document classification is the task of grouping documents into categories based upon their content, and never before has it been as important as it is today. The exponential growth of unstructured data, combined with a marked increase in litigation, security and privacy rules, has left organizations unable to cope with the conflicting demands of the business, lawyers and regulators. The net result is escalating costs and risks, with no end in sight. Without tools to facilitate automated, content-based classification, organizations have little hope of catching up, let alone getting ahead of the problem. Technology has created the problem, and technology will be needed to address it. Manual classification is out of the question due to the volume of data, while naive automatic approaches such as predefined search terms have performed poorly due to the complexity of human language.
• 33. Many advanced approaches have been proposed to solve this problem; over the last several years, however, Support Vector Machine (SVM) classification has come to the forefront. SVM's deep academic roots, accuracy, computational scalability, language independence and ease of implementation make it ideally suited to tackling the document classification challenges faced by today's large organizations.

SVM (Support Vector Machine):
SVM is a group of learning algorithms primarily used for classification tasks on complicated data such as image classification and protein structure analysis. SVM is used in countless fields in science and industry, including biotechnology, medicine, chemistry and computer science. It has also turned out to be ideally suited for categorization of large text repositories such as those housed in virtually all large, modern organizations. Introduced in 1992, SVM quickly became regarded as the state-of-the-art method for classification of complex, high-dimensional data. In particular, its ability to capture trends observed in a small training set and to generalize those trends against a broader corpus has made it useful across a large number of applications.

SVM uses a supervised learning approach, which means it learns to classify unseen data based on a set of labeled training data, such as corporate documents. The initial set of training data is typically identified by domain experts and is used to build a model that can be applied to any other data outside the training set. The effort required to construct a high-quality training set is quite modest, particularly when compared to the volume of data that may ultimately be classified against it. This means that learning algorithms such as SVM offer an exceptionally cost-effective method of text classification for the massive volumes of documents produced by modern organizations. The balance of this section covers the inner workings of SVM, its application in science and industry, the legal defensibility of the method, and its classification accuracy compared to manual classification.
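As a concrete illustration of this supervised workflow, the sketch below trains a linear SVM on a few hand-labeled toy documents using scikit-learn (a library not referenced in the report, assumed here only for demonstration) and then classifies unseen text.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# Toy labeled training documents standing in for expert-labeled data.
train_docs = [
    "quarterly meeting agenda and budget review",
    "project deadline and client deliverables",
    "weekend football match with friends",
    "movie night and board games",
]
train_labels = ["work", "work", "fun", "fun"]

# Vectorize the text and fit a linear SVM (a maximum-margin classifier).
classifier = make_pipeline(TfidfVectorizer(), LinearSVC())
classifier.fit(train_docs, train_labels)

# Classify unseen documents against the learned separating hyperplane.
print(classifier.predict(["team meeting about the budget",
                          "playing games this weekend"]))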
• 34. Overview of Support Vector Machines:
SVM is built upon a solid foundation of statistical learning theory. Early classifiers were proposed by Vladimir Vapnik and Alexey Chervonenkis more than 40 years ago. In 1992, Boser, Guyon and Vapnik proposed an improvement that considerably extended the applicability of SVM, and from this point on SVM began to establish its reputation as the state-of-the-art method for data categorization. Starting with handwriting recognition tasks, SVM showed results that were superior to all other methods of classification. It was quickly shown that SVM was able to beat even artificial neural networks, which were considered to be the strongest categorization algorithms at the time. Thousands of researchers have applied SVM to a large number of machine learning problems, and the results have contributed to the acceptance of this technology as the state of the art for machine classification. Numerous studies (Joachims 1998, Dumais et al. 1998, Drucker et al. 1999) have shown the superiority of SVM over other machine learning methods for text categorization problems. For example, Joachims reported 86% accuracy of SVM on classification of the Reuters news dataset, while the next best method, a significantly slower k-nearest-neighbor algorithm, was only able to achieve 82% accuracy. Today SVM is widely accepted in industry as well as in academia: for example, Health Discovery Corporation uses SVM in a medical image analysis tool currently licensed to Pfizer, Dow Chemical uses SVM in their research for outlier detection, and Reuters uses SVM for text classification.

Under the hood:
The typical approach for text classification using SVM is as follows. The model is trained using a set of documents labeled by domain experts. The validation procedure computes the expected accuracy of the model on unclassified data; the labeled data itself is used in the accuracy evaluation, so the error estimates take into account the specifics of the particular data. Once a model is constructed, it can then be used to efficiently and quickly classify new unseen documents in real time.
• 35. Model construction:
SVM is most commonly used to split a single input set of documents into two distinct subsets. For example, we could classify documents into privileged and non-privileged, or record and non-record, sets. The SVM algorithm learns to distinguish between the two categories based on a training set of documents that contains labeled examples from both categories. Internally, SVM manipulates documents as points in a vector space and finds a separating line between the two categories. The separating line (also called "the model") is recorded and used for classification of new documents: new documents are mapped and classified based on their position with respect to the model.

There are many ways to reduce a document to a vector representation that can be used for classification. For example, we could count the number of times particular terms, characters or substrings appear; we may also consider the lengths of sentences or the amount of white space.

Example: say we are trying to classify documents into "work" and "fun" categories. To represent a document as a vector, we could count the number of times the words "meeting" and "play" occur. In this case the document will be represented as a vector of size two, or simply as a point on a two-dimensional plane (as in Figure 2). Similarly, if we count the appearance of three or more different words, the document would be represented in three or more dimensions. The dimensions are also called features. There are many kinds of features we can compute on a document, including the frequency of a particular character, the ratio of upper-case to lower-case letters, the average size of a sentence, and so on. Some features are more useful than others, while some are simply noise, and within SVM there exist good algorithms that evaluate how well a particular feature helps in classification. Typically, when documents are prepared for classification, the features are extracted, analyzed, and the noisiest ones automatically removed. Once the data is preprocessed and a multi-dimensional representation of a document is generated, SVM then finds the optimum hyperplane to separate the data. As shown in Figure 3, there may be several possible separators; however, the SVM algorithm must pick the best one.
• 36. Figure 3: Often there are many possible separators for the data. Support Vector Machines choose the separator with the maximum margin, as it has the best generalization properties.

One separator is said to be better than another if it generalizes better, i.e. shows better performance on documents outside of the training set. It turns out that the generalization quality of the plane is related to the distance between the plane and the data points that lie on the boundary of the two data classes. These data points are called "support vectors", and the SVM algorithm determines the plane that is as far from all support vectors as possible. In other words, SVM finds the separator with a maximum margin and is often called a "maximum margin classifier".

Multi-class classification:
The examples above demonstrate classification into two categories; however, it is often necessary to group documents into three or more classes. There are established methods of using SVM for multi-class classification. Most commonly, an ensemble of binary (two-class) classifiers is used: each classifier in the ensemble is trained to recognize one particular category versus all others.

Non-linear classification:
When the data cannot be separated by a line, it is mathematically projected into higher dimensions where it can be more easily separated. For example, while the data shown in Figure 4 is not linearly separable in two dimensions, it can be separated in three dimensions. To do so, imagine that we loaded this picture into a slide projector and projected it onto a white cone attached to the wall in front of the projector, with the tip pointing away from the wall. We center the cone such that its tip is matched with the center of our plot, and then consider where the points appear on the surface of the cone: all blue points will be projected closer to the tip of the cone, and all the red points will appear closer to the other end.

Figure 5: Non-linear classification. The first figure shows a dataset that cannot be separated by a linear classifier in two dimensions but can be projected into three dimensions, where it can be separated by a plane.

Now we can take a plane (a linear separator in three dimensions) and cut the cone such that all blue points stay on one side of the plane and the green points on the other side. SVM will pick a plane that has good generalization properties to classify unseen data. The cone in this example is called a kernel. In other words, the kernel is the function that describes how to project data into higher dimensions.
• 37. There are a number of different types of kernels available to fit different data, and good guidelines exist to help in selecting a proper kernel.

Accuracy evaluation:
Understanding the accuracy, or the expected rate of success, of the algorithm is essential to its successful application in a commercial enterprise. Fortunately, solid testing procedures have been developed over the years to evaluate the performance of learning algorithms. Accuracy is a measure of how closely the results of the automatic classification match the true categories of the documents. Accuracy is estimated by applying the classifier to a testing dataset classified by domain experts. The documents for which the classification algorithm and the domain experts assigned the same label are said to be classified correctly, and the rest of the documents are classified incorrectly. The accuracy is computed as the number correct over the number correct plus the number incorrect; in other words, the accuracy is the percentage of documents that were classified correctly.

When evaluating the accuracy, it is essential to ensure that documents from the testing set were not used to train the classifier. This ensures that the classifier has no unfair information about the testing documents that would inflate the performance. Typically, the set of all documents labeled by domain experts is split into training and testing sets; the algorithm is trained using the training set and then applied to the testing set for accuracy evaluation. Since none of the testing documents appeared in the training set, the performance of the algorithm on the testing set will be a good estimator of the expected accuracy on unseen data.

In order to build up confidence in the estimated value of accuracy, it is beneficial to train the model on multiple training sets, evaluate it against multiple testing sets, and then compute the average accuracy. This approach is known as k-fold cross-validation and is accomplished by partitioning a single labeled dataset into multiple testing and training sets. As shown in Figure 6, the method prescribes that the labeled set be split into k parts, also called folds; commonly a 10-fold split is considered sufficient. For every fold, the training set is constructed from all but one part of the labeled data, and the single remaining part is used as the testing set. This approach results in k training/testing sets from the same labeled dataset.
• 38. The accuracy is evaluated on each of the splits and the average is computed.

Figure 6: Example of 5-fold cross-validation. Labeled data is split into five parts; for each fold the classifier is trained on four parts and validated on the one remaining part, and the average of the fold accuracies is computed.

When judging the quality of a classification algorithm, it is important to keep in mind that it is typically not possible to reach 100% accuracy; however, accuracies of 80–90% are commonly considered achievable. To gain a perspective on what 90% accuracy means in the real world, it is necessary to compare it to the accuracy of human reviewers. It turns out that, on average, human reviewers do not perform better than some of the best machine learning algorithms, and in many cases humans perform significantly worse. Godbole and Roy (2008) studied the quality of human classification of natural-language texts in the support industry. They found that when different groups of reviewers were asked to review the same set of documents, they disagreed on the categories for 47% of the documents. Furthermore, when the same reviewer was given the same document to review on different occasions, their labels only agreed in 64% of cases; that is, the same reviewer did not even agree with themselves about a third of the time.

It is now possible to train a machine algorithm that will outperform, or work on par with, manual classification. Wai Lam et al. (1999) observed this when comparing the quality of manual and automatic classification of medical literature with respect to text retrieval. Similar observations were reported in the litigation support industry by Anne Kershaw, the founder of a nationally recognized litigation management consulting firm. Her group compared the results of automatic and manual privilege coding over a population of 48,000 documents and found that automated classification was much more accurate than manual review, minimizing the chance of missing an important privileged document.
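A minimal sketch of the k-fold evaluation described above, again using scikit-learn on toy data; the fold count, documents and labels are illustrative assumptions rather than the project's actual evaluation.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Toy labeled corpus; a real evaluation would use the expert-labeled set.
docs = ["budget meeting notes", "client project deadline",
        "quarterly planning meeting", "status report for the manager",
        "football with friends", "weekend movie night",
        "board games and pizza", "holiday trip photos"]
labels = ["work", "work", "work", "work", "fun", "fun", "fun", "fun"]

pipeline = make_pipeline(TfidfVectorizer(), LinearSVC())

# 4-fold cross-validation: train on 3 folds, test on the held-out fold,
# and average the per-fold accuracies.
scores = cross_val_score(pipeline, docs, labels, cv=4, scoring="accuracy")
print(scores, scores.mean())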
• 39. Defensibility:
To analyze the defensibility of results obtained using SVM classification, consider the related standard for admitting expert scientific testimony in a federal trial. In Daubert vs. Merrell Dow Pharmaceuticals, Justice Blackmun suggested that the following four factors be considered:
• Whether the theory or technique can be and has been tested
• Whether the theory or technique has been subjected to peer review and publication
• The known or potential rate of error, or the existence of standards
• Whether the theory or technique used has been generally accepted

SVM satisfies all four requirements. The years of research in statistical learning theory, as well as the thousands of publications that study SVMs, satisfy the first and second requirements. The extensive testing methodologies available for SVM quantify the expected accuracy of the algorithm and thereby satisfy the third requirement: the data-centric error rate calculation described in the Accuracy Evaluation section above measures the accuracy of the algorithm as it specifically relates to particular data, and this approach to testing and quality evaluation meets the strictest requirements of modern science. The second and fourth requirements are also satisfied by the wide acceptance of SVM as a state-of-the-art method for classification that is broadly utilized in science and industry.

CONVOLUTIONAL NEURAL NETWORK:
CNNs, like neural networks, are made up of neurons with learnable weights and biases. Each neuron receives several inputs, takes a weighted sum over them, passes it through an activation function and responds with an output. The whole network has a loss function, and all the tips and tricks that we developed for neural networks still apply to CNNs. Pretty straightforward, right? So, how are convolutional neural networks different from ordinary neural networks?
• 40. CNNs operate over volumes. What do we mean by this?
1. Example of an RGB image (let's call it the 'input image').
2. Unlike neural networks, where the input is a vector, here the input is a multi-channeled image (3-channeled in this case).
Before we go any deeper, let us first understand what convolution means.
• 41. Convolution:
Convolving an image with a filter — we take a 5×5×3 filter and slide it over the complete image, and along the way take the dot product between the filter and chunks of the input image. The convolution layer is the main building block of a convolutional neural network.

Convolution layer:
The convolution layer comprises a set of independent filters (6 in the example shown). Each filter is independently convolved with the image, and we end up with 6 feature maps of shape 28×28×1.
• 42. 40 we end up with 6 feature maps of shape 28*28*1. (Figure: convolution layers in sequence.) All these filters are initialized randomly and become our parameters, which will subsequently be learned by the network. As an example, consider the filters in the very first layer of a trained network (these are our 5*5*3 filters): through back propagation they have tuned themselves to become blobs of coloured pieces and edges. As we go deeper into the network, the filters in later convolution layers compute dot products over the input coming from the previous convolution layers, so they take the smaller coloured pieces or edges and make larger pieces out of them. Now imagine the 28*28*1 grid as a grid of 28*28 neurons. For a particular feature map (the output obtained by convolving the image with a particular filter is called a feature map), each neuron is connected only to a small chunk of the input image, and all the neurons share the same connection weights. So, coming back to the differences between a CNN and a neural network: CNNs have a couple of concepts called parameter sharing and local connectivity. Parameter sharing is the sharing of weights by all neurons in a particular feature map. Local connectivity is the concept of each neuron being connected only to a subset of the input image (unlike a neural network where all the neurons are fully
• 43. 41 connected). This helps to reduce the number of parameters in the whole system and makes the computation more efficient. Pooling Layers A pooling layer is another building block of a CNN. Its function is to progressively reduce the spatial size of the representation, which reduces the number of parameters and the amount of computation in the network. The pooling layer operates on each feature map independently. The most common approach used in pooling is max pooling. Convolutional neural networks (CNN): Convolutional neural networks (CNN) and general regression neural networks (GRNN) have similar architectures, but there is a fundamental difference: probabilistic networks perform classification, where the target variable is categorical, whereas general regression neural networks perform regression, where the target variable is continuous. If you select a
• 44. 42 CNN/GRNN network, DTREG will automatically select the correct type of network based on the type of the target variable. CONVOLUTIONAL NEURAL NETWORK ALGORITHM Consider a network with a single real input x and network function F. The derivative F'(x) is computed in two phases: Feed-forward: the input x is fed into the network. The primitive functions at the nodes and their derivatives are evaluated at each node. The derivatives are stored. Back propagation: the constant 1 is fed into the output unit and the network is run backwards. Incoming information to a node is added, and the result is multiplied by the value stored in the left part of the unit. The result is transmitted to the left of the unit. The result collected at the input unit is the derivative of the network function with respect to x. STEPS OF THE ALGORITHM The convolutional neural network training algorithm is used to compute the necessary corrections after the weights of the network have been chosen randomly. It can be decomposed into the following four steps: i) Feed-forward computation ii) Back propagation to the output layer iii) Back propagation to the hidden layer iv) Weight updates
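To make the feed-forward computation in step (i) concrete, the following Java sketch implements the convolution operation described earlier: a small filter is slid over the image and a dot product is taken at every valid position. For simplicity it works on a single-channel (grayscale) image rather than the 5*5*3 RGB filter discussed above, and it applies a ReLU non-linearity to each output; the class name, the example values and the choice of activation are assumptions made for this illustration, not code from the project.

// Illustrative sketch of the feed-forward convolution step: a small filter is slid
// over the image and a dot product is taken at every valid position ("valid" padding).
public class ConvolutionDemo {

    static double[][] convolve(double[][] image, double[][] filter) {
        int outH = image.length - filter.length + 1;
        int outW = image[0].length - filter[0].length + 1;
        double[][] featureMap = new double[outH][outW];
        for (int y = 0; y < outH; y++) {
            for (int x = 0; x < outW; x++) {
                double sum = 0.0;
                for (int i = 0; i < filter.length; i++) {
                    for (int j = 0; j < filter[0].length; j++) {
                        sum += image[y + i][x + j] * filter[i][j];
                    }
                }
                featureMap[y][x] = Math.max(0.0, sum); // ReLU non-linearity
            }
        }
        return featureMap;
    }

    public static void main(String[] args) {
        double[][] image = {
            {1, 2, 0, 1},
            {0, 1, 3, 1},
            {2, 0, 1, 2},
            {1, 1, 0, 0}
        };
        double[][] edgeFilter = {
            {1, 0, -1},
            {1, 0, -1},
            {1, 0, -1}
        };
        double[][] map = convolve(image, edgeFilter);
        for (double[] row : map) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}

Sliding the 3*3 filter over the 4*4 image produces a 2*2 feature map, which is exactly the shrinking effect described for the 28*28*1 feature maps above.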
• 45. 43 Typical architecture of a CNN We have already discussed convolution layers (denoted by CONV) and pooling layers (denoted by POOL). ReLU is just a non-linearity which is applied in the same way as in ordinary neural networks. FC is the fully connected layer of neurons at the end of the CNN. Neurons in a fully connected layer have full connections to all activations in the previous layer, as in regular neural networks, and they work in a similar way. CNNs are especially tricky to train, as they add even more hyper-parameters than a standard MLP. While the usual rules of thumb for learning rates and regularization constants still apply, the following should be kept in mind when optimizing CNNs. Number of filters When choosing the number of filters per layer, keep in mind that computing the activations of a single convolutional filter is much more expensive than with traditional MLPs. Assume layer l-1 contains K1 feature maps and M*N pixel positions (i.e., the cost scales with the number of positions times the number of feature maps), and there are K2 filters at layer l, each of shape m*n. Then computing one feature map (applying an m*n filter at all the pixel positions where the filter can be applied) costs roughly (M-m)*(N-n)*m*n*K1 operations, and the total cost is K2 times that. Things may be more complicated if not all features at one level are connected to all features at the previous one.
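Before turning to the cost comparison with a standard MLP below, the POOL operation in the architecture above can be illustrated with a short Java sketch of 2x2 max pooling, the most common variant mentioned earlier. Each feature map is divided into non-overlapping 2x2 windows and only the maximum value in each window is kept, halving each spatial dimension. The class and example values are invented for this illustration.

// Illustrative sketch of 2x2 max pooling: each feature map is divided into
// non-overlapping 2x2 windows and only the maximum value in each window is kept.
public class MaxPoolingDemo {

    static double[][] maxPool2x2(double[][] featureMap) {
        int outH = featureMap.length / 2;
        int outW = featureMap[0].length / 2;
        double[][] pooled = new double[outH][outW];
        for (int y = 0; y < outH; y++) {
            for (int x = 0; x < outW; x++) {
                double max = Double.NEGATIVE_INFINITY;
                for (int i = 0; i < 2; i++) {
                    for (int j = 0; j < 2; j++) {
                        max = Math.max(max, featureMap[2 * y + i][2 * x + j]);
                    }
                }
                pooled[y][x] = max;
            }
        }
        return pooled;
    }

    public static void main(String[] args) {
        double[][] featureMap = {
            {1, 3, 2, 0},
            {4, 2, 1, 1},
            {0, 1, 5, 2},
            {2, 2, 1, 3}
        };
        for (double[] row : maxPool2x2(featureMap)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}

The 4*4 feature map is reduced to a 2*2 map, keeping only the strongest activation in each window, which is why pooling reduces both parameters and computation downstream.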
• 46. 44 For a standard MLP, the cost would only be K1*K2, where there are K2 different neurons at level l. As such, the number of filters used in CNNs is typically much smaller than the number of hidden units in MLPs and depends on the size of the feature maps (itself a function of the input image size and filter shapes). Since feature map size decreases with depth, layers near the input will tend to have fewer filters while layers higher up can have many more. In fact, to equalize computation at each layer, the product of the number of feature maps and the number of pixel positions is typically picked to be roughly constant across layers. Preserving the information about the input would require keeping the total number of activations (number of feature maps times number of pixel positions) non-decreasing from one layer to the next (of course we could hope to get away with less when doing supervised learning). The number of feature maps directly controls capacity and so depends on the number of available examples and the complexity of the task. Filter Shape Common filter shapes found in the literature vary greatly, usually based on the dataset. The best results on MNIST-sized images (28x28) are usually obtained with filters in the 5x5 range on the first layer, while natural image datasets (often with hundreds of pixels in each dimension) tend to use larger first-layer filters of shape 12x12 or 15x15. The trick is thus to find the right level of "granularity" (i.e. filter shape) in order to create abstractions at the proper scale, given a particular dataset. Max Pooling Shape Typical values are 2x2 or no max-pooling. Very large input images may
• 47. 45 warrant 4x4 pooling in the lower layers. Keep in mind, however, that this will reduce the dimension of the signal by a factor of 16 and may result in throwing away too much information. CHAPTER 6: SOFTWARE AND HARDWARE REQUIREMENTS Software: ● Android Studio ● Python IDLE 3.9 ● Anaconda Navigator 3.3 Hardware: ● RAM: 8 GB ● HDD: 1 TB
• 51. 49 (Figure: collaboration diagram.) CHAPTER 8: SOFTWARE DESCRIPTION Java (programming language) History The Java language was created by James Gosling in June 1991 for use in a set-top box project. The language was initially called Oak, after an oak tree that stood outside Gosling's office - and also went
• 52. 50 by the name Green - and ended up later being renamed to Java, from a list of random words. Gosling's goals were to implement a virtual machine and a language that had a familiar C/C++ style of notation. The first public implementation was Java 1.0 in 1995. It promised "Write Once, Run Anywhere" (WORA), providing no-cost runtimes on popular platforms. It was fairly secure and its security was configurable, allowing network and file access to be restricted. Major web browsers soon incorporated the ability to run secure Java applets within web pages. Java quickly became popular. With the advent of Java 2, new versions had multiple configurations built for different types of platforms. For example, J2EE was for enterprise applications and the greatly stripped-down version J2ME was for mobile applications; J2SE was the designation for the Standard Edition. In 2006, for marketing purposes, the new J2 versions were renamed Java EE, Java ME, and Java SE, respectively. In 1997, Sun Microsystems approached the ISO/IEC JTC1 standards body, and later Ecma International, to formalize Java, but it soon withdrew from the process. Java remains a de facto standard that is controlled through the Java Community Process. At one time, Sun made most of its Java implementations available without charge although they were proprietary software. Sun's revenue from Java was generated by the selling of licenses for specialized products such as the Java Enterprise System. Sun distinguishes between its Software Development Kit (SDK) and the Java Runtime Environment (JRE), which is a subset of the SDK; the primary distinction is that the JRE does not include the compiler, utility programs, and many necessary header files. On 13 November 2006, Sun released much of Java as free software under the terms of the GNU General Public License (GPL). On 8 May 2007, Sun finished the process, making all of Java's core code open source, aside from a small portion of code to which Sun did not hold the copyright. Primary goals There were five primary goals in the creation of the Java language: • It should use the object-oriented programming methodology. • It should allow the same program to be executed on multiple operating systems. • It should contain built-in support for using computer networks. • It should be designed to execute code from remote sources securely.
• 53. 51 • It should be easy to use, selecting what were considered the good parts of other object-oriented languages. The Java Programming Language: The Java programming language is a high-level language that can be characterized by all of the following buzzwords: • Simple • Architecture neutral • Object oriented • Portable • Distributed • High performance Each of the preceding buzzwords is explained in The Java Language Environment, a white paper written by James Gosling and Henry McGilton. In the Java programming language, all source code is first written in plain text files ending with the .java extension. Those source files are then compiled into .class files by the javac compiler. A .class file does not contain code that is native to your processor; it instead contains byte codes, the machine language of the Java Virtual Machine (Java VM). The java launcher tool then runs your application with an instance of the Java Virtual Machine.
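The compile-and-run cycle just described can be seen with a minimal one-class example; the file name and message are chosen only for illustration.

// HelloWorld.java - compiled with "javac HelloWorld.java", which produces
// HelloWorld.class (byte code); the class is then run on any JVM with "java HelloWorld".
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Write once, run anywhere.");
    }
}

Because the .class file contains byte code rather than native machine code, the same compiled file runs unchanged on any platform that provides a Java VM.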
• 54. 52 (Figure: an overview of the software development process.) Because the Java VM is available on many different operating systems, the same .class files are capable of running on Microsoft Windows, the Solaris Operating System (Solaris OS), Linux, or Mac OS. Some virtual machines, such as the Java HotSpot virtual machine, perform additional steps at runtime to give your application a performance boost. This includes various tasks such as finding performance bottlenecks and recompiling (to native code) frequently used sections of code.
• 55. 53 Through the Java VM, the same application is capable of running on multiple platforms. The Java Platform A platform is the hardware or software environment in which a program runs. We've already mentioned some of the most popular platforms, like Microsoft Windows, Linux, Solaris OS, and Mac OS. Most platforms can be described as a combination of the operating system and underlying hardware. The Java platform differs from most other platforms in that it's a software-only platform that runs on top of other hardware-based platforms. The Java platform has two components: the Java Virtual Machine and the Java Application Programming Interface (API). You've already been introduced to the Java Virtual Machine; it's the base for the Java platform and is ported onto various hardware-based platforms. The API is a large collection of ready-made software components that provide many useful capabilities. It is grouped into libraries of related classes and interfaces; these libraries are known as packages. The next section, What Can Java Technology Do?, highlights some of the functionality provided by the API.
• 56. 54 The API and Java Virtual Machine insulate the program from the underlying hardware. As a platform-independent environment, the Java platform can be a bit slower than native code. However, advances in compiler and virtual machine technologies are bringing performance close to that of native code without threatening portability. Java Runtime Environment The Java Runtime Environment, or JRE, is the software required to run any application deployed on the Java platform. End-users commonly use a JRE in software packages and Web browser plug-ins. Sun also distributes a superset of the JRE called the Java 2 SDK (more commonly known as the JDK), which includes development tools such as the Java compiler, Javadoc, Jar and a debugger. One of the unique advantages of the concept of a runtime engine is that errors (exceptions) should not 'crash' the system. Moreover, in runtime engine environments such as Java there exist tools that attach to the runtime engine and, every time an exception of interest occurs, record the debugging information that existed in memory at the time the exception was thrown (stack and heap values). These automated exception handling tools provide 'root-cause' information for exceptions in Java programs that run in production, testing or development environments. Uses of Java Blue is a smart card enabled with the secure, cross-platform, object-oriented Java Card API and
• 57. 55 technology. Blue contains an actual on-card processing chip, allowing for enhanceable and multiple functionality within a single card. Applets that comply with the Java Card API specification can run on any third-party vendor card that provides the necessary Java Card Application Environment (JCAE). Not only can multiple applet programs run on a single card, but new applets and functionality can be added after the card is issued to the customer. • Java can be used in chemistry. • Java is also used at NASA. • Java is used in 2D and 3D applications. • Java is used in graphics programming. • Java is used in animations. • Java is used in online and web applications. JSP: JavaServer Pages (JSP) is a Java technology that allows software developers to dynamically generate HTML, XML or other types of documents in response to a Web client request. The technology allows Java code and certain pre-defined actions to be embedded into static content. The JSP syntax adds additional XML-like tags, called JSP actions, to be used to invoke built-in functionality. Additionally, the technology allows for the creation of JSP tag libraries that act as extensions to the standard HTML or XML tags. Tag libraries provide a platform-independent way of extending the capabilities of a Web server. JSPs are compiled into Java Servlets by a JSP compiler. A JSP compiler may generate a servlet in Java code that is then compiled by the Java compiler, or it may generate byte code for the servlet directly. JSPs can also be interpreted on the fly, reducing the time taken to reload changes. JavaServer Pages (JSP) technology provides a simplified, fast way to create dynamic web content. JSP technology enables rapid development of web-based applications that are server- and platform-independent.
  • 58. 56 Architecture OF JSP The Advantages of JSP Active Server Pages (ASP). ASP is a similar technology from Microsoft. The advantages of JSP are twofold. First, the dynamic part is written in Java, not Visual Basic or other MS-specific language, so it is more powerful and easier to use. Second, it is portable to other operating systems and non-Microsoft Web servers. Pure Servlet. JSP doesn't give you anything that you couldn't in principle do with a Servlet. But it is more convenient to write (and to modify!) regular HTML than to have a zillion println statements that generate the HTML. Plus, by separating the look from the content you can put different people on different
  • 59. 57 tasks: your Web page design experts can build the HTML, leaving places for your Servlet programmers to insert the dynamic content. Server-Side Includes (SSI). SSI is a widely-supported technology for including externally-defined pieces into a static Web page. JSP is better because it lets you use Servlet instead of a separate program to generate that dynamic part. Besides, SSI is really only intended for simple inclusions, not for "real" programs that use form data, make database connections, and the like. JavaScript. JavaScript can generate HTML dynamically on the client. This is a useful capability, but only handles situations where the dynamic information is based on the client's environment. With the exception of cookies, HTTP and form submission data is not available to JavaScript. And, since it runs on the client, JavaScript can't access server-side resources like databases, catalogs, pricing information, and the like. Static HTML. Regular HTML, of course, cannot contain dynamic information. JSP is so easy and convenient that it is quite feasible to augment HTML pages that only benefit marginally by the insertion of small amounts of dynamic data. Previously, the cost of using dynamic data would preclude its use in all but the most valuable instances. ARCHITECTURE OF JSP • The browser sends a request to a JSP page. • The JSP page communicates with a Java bean.
  • 60. 58 • The Java bean is connected to a database. • The JSP page responds to the browser. SERVLETS – FRONT END The Java Servlet API allows a software developer to add dynamic content to a Web server using the Java platform. The generated content is commonly HTML, but may be other data such as XML. Servlet are the Java counterpart to non-Java dynamic Web content technologies such as PHP, CGI and ASP.NET. Servlet can maintain state across many server transactions by using HTTP cookies, session variables or URL rewriting. The Servlet API, contained in the Java package hierarchy javax. Servlet, defines the expected interactions of a Web container and a Servlet. A Web container is essentially the component of a Web server that interacts with the Servlet. The Web container is responsible for managing the lifecycle of Servlet, mapping a URL to a particular Servlet and ensuring that the URL requester has the correct access rights. A Servlet is an object that receives a request and generates a response based on that request. The basic Servlet package defines Java objects to represent Servlet requests and responses, as well as objects to reflect the Servlet configuration parameters and execution environment. The package javax .Servlet. Http defines HTTP-specific subclasses of the generic Servlet elements, including session management objects that track multiple requests and responses between the Web server and a client. Servlet may be packaged in a WAR file as a Web application. Servlet can be generated automatically by Java Server Pages(JSP), or alternately by template engines such as Web Macro. Often Servlet are used in conjunction with JSPs in a pattern called "Model 2”, which is a flavour of the model-view-controller pattern. Servlet are Java technology's answer to CGI programming. They are programs that run on a Web server and build Web pages. Building Web pages on the fly is useful (and commonly done) for a number of reasons:.
  • 61. 59 The Web page is based on data submitted by the user. For example the results pages from search engines are generated this way, and programs that process orders for e-commerce sites do this as well. The data changes frequently. For example, a weather-report or news headlines page might build the page dynamically, perhaps returning a previously built page if it is still up to date. The Web page uses information from corporate databases or other such sources. For example, you would use this for making a Web page at an on-line store that lists current prices and number of items in stock. The Servlet Run-time Environment A Servlet is a Java class and therefore needs to be executed in a Java VM by a service we call a Servlet engine. The Servlet engine loads the servlet class the first time the Servlet is requested, or optionally already when the Servlet engine is started. The Servlet then stays loaded to handle multiple requests until it is explicitly unloaded or the Servlet engine is shut down. Some Web servers, such as Sun's Java Web Server (JWS), W3C's Jigsaw and Gefion Software's Lite Web Server (LWS) are implemented in Java and have a built-in Servlet engine. Other Web servers, such as Netscape's Enterprise Server, Microsoft's Internet Information Server (IIS) and the Apache Group's Apache, require a Servlet engine add-on module. The add-on intercepts all requests for Servlet, executes them and returns the response through the Web server to the client. Examples of Servlet engine add-ons are Gefion Software's WAI Cool Runner, IBM's Web Sphere, Live Software's JRun and New Atlanta's Servlet Exec. All Servlet API classes and a simple Servlet-enabled Web server are combined into the Java Servlet Development Kit (JSDK), available for download at Sun's official Servlet site .To get started with Servlet I recommend that you download the JSDK and play around with the sample Servlet. Life Cycle OF Servlet • The Servlet lifecycle consists of the following steps:
• 62. 60 • The Servlet class is loaded by the container during start-up. The container calls the init() method. This method initializes the Servlet and must be called before the Servlet can service any requests. In the entire life of a Servlet, the init() method is called only once. After initialization, the Servlet can service client requests. Each request is serviced in its own separate thread. The container calls the service() method of the Servlet for every request. The service() method determines the kind of request being made and dispatches it to an appropriate method to handle the request. The developer of the Servlet must provide an implementation for these methods. If a request for a method that is not implemented by the Servlet is made, the method of the parent class is called, typically resulting in an error being returned to the requester. Finally, the container calls the destroy() method, which takes the Servlet out of service. The destroy() method, like init(), is called only once in the lifecycle of a Servlet. Request and Response Objects The doGet() method has two interesting parameters: HttpServletRequest and HttpServletResponse. These two objects give you full access to all information about the request and let you control the output sent to the client as the response to the request. With CGI you read environment variables and stdin to get information about the request, but the names of the environment variables may vary between implementations and some are not provided by all Web servers. The HttpServletRequest object provides the same information as the CGI environment variables, plus more, in a standardized way. It also provides methods for extracting HTTP parameters from the query string or the request body depending on the type of request (GET or POST). As a Servlet developer you access parameters the same way for both types of requests. Other methods give you access to all request headers and help you parse date and cookie headers. Instead of writing the response to stdout as you do with CGI, you get an OutputStream or a PrintWriter from the HttpServletResponse. The OutputStream is intended for binary data, such as a GIF or JPEG image, and the PrintWriter for text output. You can also set all response headers and the status code, without having to rely on special Web server CGI configurations such as Non Parsed Headers (NPH). This makes your Servlet easier to install.
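As an illustration of this life cycle and of the doGet() parameters discussed above, a minimal servlet might look as follows. The class name, the produced HTML and the request parameter are invented for this sketch; the javax.servlet API calls shown (init(), doGet(), getParameter(), setContentType(), getWriter(), destroy()) are the standard ones described in this section.

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Minimal servlet sketch: the container calls init() once, service()/doGet() per request,
// and destroy() once when the servlet is taken out of service.
public class HelloServlet extends HttpServlet {

    @Override
    public void init() throws ServletException {
        // One-time initialization (open resources, read init parameters, etc.).
    }

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String name = request.getParameter("name");   // HTTP parameter from the query string
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();        // PrintWriter for text output
        out.println("<html><body>Hello, " + (name != null ? name : "world") + "</body></html>");
    }

    @Override
    public void destroy() {
        // Release resources before the servlet is unloaded.
    }
}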
  • 63. 61 ServletConfig and ServletContext There is only one ServletContext in every application. This object can be used by all the Servlet to obtain application level information or container details. Every Servlet, on the other hand, gets its own ServletConfig object. This object provides initialization parameters for a servlet. A developer can obtain the reference to ServletContext using either the ServletConfig object or ServletRequest object. All servlets belong to one servlet context. In implementations of the 1.0 and 2.0 versions of the Servlet API all servlets on one host belongs to the same context, but with the 2.1 version of the API the context becomes more powerful and can be seen as the humble beginnings of an Application concept. Future versions of the API will make this even more pronounced. Many servlet engines implementing the Servlet 2.1 API let you group a set of servlets into one context and support more than one context on the same host. The ServletContext in the 2.1 API is responsible for the state of its servlets and knows about resources and attributes available to the servlets in the context. Here we will only look at how ServletContext attributes can be used to share information among a group of servlets. There are three ServletContext methods dealing with context attributes: getAttribute, setAttribute and removeAttribute. In addition the servlet engine may provide ways to configure a servlet context with initial attribute values. This serves as a welcome addition to the servlet initialization arguments for configuration information used by a group of servlets, for instance the database identifier we talked about above, a style sheet URL for an application, the name of a mail server, etc. JDBC Java Database Connectivity (JDBC) is a programming framework for Java developers writing programs that access information stored in databases, spreadsheets, and flat files. JDBC is commonly used to connect a user program to a "behind the scenes" database, regardless of what database management software is used to control the database. In this way, JDBC is cross- platform . This article will provide an introduction and sample code that demonstrates database access from Java programs that use the classes of the JDBC API, which is available for free download from Sun's site .
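A short, illustrative use of the three ServletContext attribute methods mentioned above might look like the following; the attribute name and value (a mail server address) are invented for this sketch.

import javax.servlet.ServletContext;
import javax.servlet.http.HttpServlet;

// Illustrative use of the ServletContext attribute methods: the context is shared by
// all servlets in the same web application, so attributes set here are visible to them.
public class ContextAttributeDemo extends HttpServlet {

    @Override
    public void init() {
        ServletContext context = getServletContext();            // one context per application
        context.setAttribute("mailServer", "smtp.example.org");  // publish a value for other servlets
        String mailServer = (String) context.getAttribute("mailServer");
        log("Configured mail server: " + mailServer);
        // context.removeAttribute("mailServer");                // remove it when no longer needed
    }
}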
  • 64. 62 A database that another program links to is called a data source. Many data sources, including products produced by Microsoft and Oracle, already use a standard called Open Database Connectivity (ODBC). Many legacy C and Perl programs use ODBC to connect to data sources. ODBC consolidated much of the commonality between database management systems. JDBC builds on this feature, and increases the level of abstraction. JDBC-ODBC bridges have been created to allow Java programs to connect to ODBC-enabled database software . JDBC Architecture Two-tier and Three-tier Processing Models The JDBC API supports both two-tier and three-tier processing models for database access. In the two-tier model, a Java applet or application talks directly to the data source. This requires a JDBC driver that can communicate with the particular data source being accessed. A user's commands are delivered to the database or other data source, and the results of those statements are sent back to the user. The data source may be located on another machine to which the user is connected via a network. This is referred to as a client/server configuration, with the user's machine as the client, and the machine housing the data source as the server. The network can be an intranet, which, for example, connects employees within a corporation, or it can be the Internet. In the three-tier model, commands are sent to a "middle tier" of services, which then sends the commands to the data source. The data source processes the commands and sends the results back to the middle tier, which then sends them to the user.
  • 65. 63 MIS directors find the three-tier model very attractive because the middle tier makes it possible to maintain control over access and the kinds of updates that can be made to corporate data. Another advantage is that it simplifies the deployment of applications. Finally, in many cases, the three-tier architecture can provide performance advantages. Until recently, the middle tier has often been written in languages such as C or C++, which offer fast performance. However, with the introduction of optimizing compilers that translate Java byte code into efficient machine-specific code and technologies such as Enterprise JavaBeans™, the Java platform is fast becoming the standard platform for middle-tier development. This is a big plus, making it possible to take advantage of Java's robustness, multithreading, and security features. With enterprises increasingly using the Java programming language for writing server code, the JDBC API is being used more and more in the middle tier of a three-tier architecture. Some of the features that make JDBC a server technology are its support for connection pooling, distributed transactions, and disconnected rowsets. The JDBC API is also what allows access to a data source from a Java middle tier.
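A minimal two-tier JDBC interaction using the standard java.sql classes might look like the sketch below. The driver URL, credentials, table and column names are placeholders invented for this illustration, and a suitable JDBC driver would need to be on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Illustrative two-tier JDBC access: connect to a data source, run a query,
// and iterate over the result set. Resources are closed by try-with-resources.
public class JdbcDemo {
    public static void main(String[] args) {
        String url = "jdbc:mysql://localhost:3306/demo";   // placeholder data source URL
        String user = "appUser";                           // placeholder credentials
        String password = "secret";
        String sql = "SELECT id, label FROM detections WHERE label = ?";

        try (Connection con = DriverManager.getConnection(url, user, password);
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, "mask");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getInt("id") + " -> " + rs.getString("label"));
                }
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}

In the three-tier model described above, the same JDBC calls would simply be issued from the middle tier rather than directly from the client program.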
• 66. 64 Testing The various levels of testing are: 1. White Box Testing 2. Black Box Testing 3. Unit Testing 4. Functional Testing 5. Performance Testing 6. Integration Testing 7. Validation Testing 8. System Testing 9. Structure Testing 10. Output Testing 11. User Acceptance Testing White Box Testing White-box testing (also known as clear box testing, glass box testing, transparent box testing, and structural testing) is a method of testing software that tests the internal structures or workings of an application, as opposed to its functionality (i.e. black-box testing). In white-box testing an internal perspective of the system, as well as programming skills, are used to design test cases. The tester chooses inputs to exercise paths through the code and determines the appropriate outputs. This is analogous to testing nodes in a circuit, e.g. in-circuit testing (ICT). While white-box testing can be applied at the unit, integration and system levels of the software testing process, it is usually done at the unit level. It can test paths within a unit, paths between units during integration, and between subsystems during a system-level test. Though this method of test design can uncover many errors or problems, it might not detect unimplemented parts of the specification or missing requirements. White-box test design techniques include: • Control flow testing • Data flow testing
  • 67. 65 • Branch testing • Path testing • Statement coverage • Decision coverage White-box testing is a method of testing the application at the level of the source code. The test cases are derived through the use of the design techniques mentioned above: control flow testing, data flow testing, branch testing, path testing, statement coverage and decision coverage as well as modified condition/decision coverage. White-box testing is the use of these techniques as guidelines to create an error free environment by examining any fragile code. These White-box testing techniques are the building blocks of white-box testing, whose essence is the careful testing of the application at the source code level to prevent any hidden errors later on. These different techniques exercise every visible path of the source code to minimize errors and create an error- free environment. The whole point of white-box testing is the ability to know which line of the code is being executed and being able to identify what the correct output should be. Levels 1. Unit testing. White-box testing is done during unit testing to ensure that the code is working as intended, before any integration happens with previously tested code. White-box testing during unit testing catches any defects early on and aids in any defects that happen later on after the code is integrated with the rest of the application and therefore prevents any type of errors later on. 2. Integration testing. White-box testing at this level are written to test the interactions of each interface with each other. The Unit level testing made sure that each code was tested and working accordingly in an isolated environment and integration examines the correctness of the behaviour in an open environment through the use of white-box testing for any interactions of interfaces that are known to the programmer. 3. Regression testing. White-box testing during regression testing is the use of recycled white-box test cases at the unit and integration testing levels.
  • 68. 66 White-box testing's basic procedures involve the understanding of the source code that you are testing at a deep level to be able to test them. The programmer must have a deep understanding of the application to know what kinds of test cases to create so that every visible path is exercised for testing. Once the source code is understood then the source code can be analysed for test cases to be created. These are the three basic steps that white-box testing takes in order to create test cases: 1. Input, involves different types of requirements, functional specifications, detailed designing of documents, proper source code, security specifications. This is the preparation stage of white-box testing to layout all of the basic information. 2. Processing Unit, involves performing risk analysis to guide whole testing process, proper test plan, execute test cases and communicate results. This is the phase of building test cases to make sure they thoroughly test the application the given results are recorded accordingly. 3. Output, prepare final report that encompasses all of the above preparations and results. Black Box Testing Black-box testing is a method of software testing that examines the functionality of an application (e.g. what the software does) without peering into its internal structures or workings (see white-box testing). This method of test can be applied to virtually every level of software testing: unit, integration,system and acceptance. It typically comprises most if not all higher level testing, but can also dominate unit testing as well Test procedures Specific knowledge of the application's code/internal structure and programming knowledge in general is not required. The tester is aware of what the software is supposed to do but is not aware of how it does it. For instance, the tester is aware that a particular input returns a certain, invariable output but is not aware of how the software produces the output in the first place.
  • 69. 67 Test cases Test cases are built around specifications and requirements, i.e., what the application is supposed to do. Test cases are generally derived from external descriptions of the software, including specifications, requirements and design parameters. Although the tests used are primarily functional in nature, non- functional tests may also be used. The test designer selects both valid and invalid inputs and determines the correct output without any knowledge of the test object's internal structure. Test design techniques Typical black-box test design techniques include: • Decision table testing • All-pairs testing • State transition tables • Equivalence partitioning • Boundary value analysis Unit testing In computer programming, unit testing is a method by which individual units of source code, sets of one or more computer program modules together with associated control data, usage procedures, and operating procedures are tested to determine if they are fit for use. Intuitively, one can view a unit as the smallest testable part of an application. In procedural programming, a unit could be an entire module, but is more commonly an individual function or procedure. In object-oriented programming, a unit is often an entire interface, such as a class, but could be an individual method. Unit tests are created by programmers or occasionally by white box testers during the development process. Ideally, each test case is independent from the others. Substitutes such as method stubs, mock objects, fakes, and test harnesses can be used to assist testing a module in isolation. Unit tests are typically written and run by software developers to ensure that code meets its design and behaves as intended. Its implementation can vary from being very manual (pencil and paper)to being formalized as part of build automation. Testing will not catch every error in the program, since it cannot evaluate every execution path in any but the most trivial programs. The same is true for unit testing.
  • 70. 68 Additionally, unit testing by definition only tests the functionality of the units themselves. Therefore, it will not catch integration errors or broader system-level errors (such as functions performed across multiple units, or non-functional test areas such as performance). Unit testing should be done in conjunction with other software testing activities, as they can only show the presence or absence of particular errors; they cannot prove a complete absence of errors. In order to guarantee correct behaviour for every execution path and every possible input, and ensure the absence of errors, other techniques are required, namely the application of formal methods to proving that a software component has no unexpected behaviour. Software testing is a combinatorial problem. For example, every Boolean decision statement requires at least two tests: one with an outcome of "true" and one with an outcome of "false". As a result, for every line of code written, programmers often need 3 to 5 lines of test code. This obviously takes time and its investment may not be worth the effort. There are also many problems that cannot easily be tested at all – for example those that are nondeterministic or involve multiple threads. In addition, code for a unit test is likely to be at least as buggy as the code it is testing. Fred Brooks in The Mythical Man-Month quotes: never take two chronometers to sea. Always take one or three. Meaning, if two chronometers contradict, how do you know which one is correct? Another challenge related to writing the unit tests is the difficulty of setting up realistic and useful tests. It is necessary to create relevant initial conditions so the part of the application being tested behaves like part of the complete system. If these initial conditions are not set correctly, the test will not be exercising the code in a realistic context, which diminishes the value and accuracy of unit test results. To obtain the intended benefits from unit testing, rigorous discipline is needed throughout the software development process. It is essential to keep careful records not only of the tests that have been performed, but also of all changes that have been made to the source code of this or any other unit in the software. Use of a version control system is essential. If a later version of the unit fails a particular test that it had previously passed, the version-control software can provide a list of the source code changes (if any) that have been applied to the unit since that time.
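As a small illustration of a unit test, the following JUnit-style test exercises one tiny method in isolation with known inputs and expected outputs. JUnit is not named in this report, and the class under test is invented here purely for the example.

import static org.junit.Assert.assertEquals;
import org.junit.Test;

// Unit under test: a tiny utility class invented for this illustration.
class ThresholdClassifier {
    static String classify(double maskProbability) {
        return maskProbability >= 0.5 ? "MASK" : "NO_MASK";
    }
}

// A unit test checks one unit in isolation with known inputs and expected outputs.
public class ThresholdClassifierTest {

    @Test
    public void probabilityAboveThresholdIsMask() {
        assertEquals("MASK", ThresholdClassifier.classify(0.9));
    }

    @Test
    public void probabilityBelowThresholdIsNoMask() {
        assertEquals("NO_MASK", ThresholdClassifier.classify(0.2));
    }
}

Each test method runs independently, so a failure points directly at the behaviour that broke, which is the property of unit tests described above.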
• 71. 69 It is also essential to implement a sustainable process for ensuring that test case failures are reviewed daily and addressed immediately. If such a process is not implemented and ingrained into the team's workflow, the application will evolve out of sync with the unit test suite, increasing false positives and reducing the effectiveness of the test suite. Unit testing embedded system software presents a unique challenge: since the software is being developed on a different platform than the one it will eventually run on, you cannot readily run a test program in the actual deployment environment, as is possible with desktop programs. Functional testing Functional testing is a quality assurance (QA) process and a type of black box testing that bases its test cases on the specifications of the software component under test. Functions are tested by feeding them input and examining the output; internal program structure is rarely considered (unlike in white-box testing). Functional testing usually describes what the system does. Functional testing differs from system testing in that functional testing "verifies a program by checking it against ... design document(s) or specification(s)", while system testing "validates a program by checking it against the published user or system requirements" (Kane, Falk, Nguyen 1999, p. 52). Functional testing typically involves five steps: 1. The identification of the functions that the software is expected to perform 2. The creation of input data based on the function's specifications 3. The determination of output based on the function's specifications 4. The execution of the test case 5. The comparison of actual and expected outputs Performance testing In software engineering, performance testing is in general testing performed to determine how a system performs in terms of responsiveness and stability under a particular workload. It can also serve to
  • 72. 70 investigate, measure, validate or verify other quality attributes of the system, such as scalability, reliability and resource usage. Performance testing is a subset of performance engineering, an emerging computer science practice which strives to build performance into the implementation, design and architecture of a system. Testing types Load testing Load testing is the simplest form of performance testing. A load test is usually conducted to understand the behaviour of the system under a specific expected load. This load can be the expected concurrent number of users on the application performing a specific number of transactions within the set duration. This test will give out the response times of all the important business critical transactions. If the database, application server, etc. are also monitored, then this simple test can itself point towards bottlenecks in the application software. Stress testing Stress testing is normally used to understand the upper limits of capacity within the system. This kind of test is done to determine the system's robustness in terms of extreme load and helps application administrators to determine if the system will perform sufficiently if the current load goes well above the expected maximum. Soak testing Soak testing, also known as endurance testing, is usually done to determine if the system can sustain the continuous expected load. During soak tests, memory utilization is monitored to detect potential leaks. Also important, but often overlooked is performance degradation. That is, to ensure that the throughput and/or response times after some long period of sustained activity are as good as or better than at the beginning of the test. It essentially involves applying a significant load to a system for an extended, significant period of time. The goal is to discover how the system behaves under sustained use.
  • 73. 71 Spike testing Spike testing is done by suddenly increasing the number of or load generated by, users by a very large amount and observing the behaviour of the system. The goal is to determine whether performance will suffer, the system will fail, or it will be able to handle dramatic changes in load. Configuration testing Rather than testing for performance from the perspective of load, tests are created to determine the effects of configuration changes to the system's components on the system's performance and behaviour. A common example would be experimenting with different methods of load-balancing. Isolation testing Isolation testing is not unique to performance testing but involves repeating a test execution that resulted in a system problem. Often used to isolate and confirm the fault domain. Integration testing Integration testing (sometimes called integration and testing, abbreviated I&T) is the phase in software testing in which individual software modules are combined and tested as a group. It occurs after unit testing and before validation testing. Integration testing takes as its input modules that have been unit tested, groups them in larger aggregates, applies tests defined in an integration test plan to those aggregates, and delivers as its output the integrated system ready for system testing. Purpose The purpose of integration testing is to verify functional, performance, and reliability requirements placed on major design items. These "design items", i.e. assemblages (or groups of units), are exercised through their interfaces using black box testing, success and error cases being simulated via appropriate parameter and data inputs. Simulated usage of shared data areas and inter-process communication is tested and individual subsystems are exercised through their input interface.
  • 74. 72 Test cases are constructed to test whether all the components within assemblages interact correctly, for example across procedure calls or process activations, and this is done after testing individual modules, i.e. unit testing. The overall idea is a "building block" approach, in which verified assemblages are added to a verified base which is then used to support the integration testing of further assemblages. Some different types of integration testing are big bang, top-down, and bottom-up. Other Integration Patterns are: Collaboration Integration, Backbone Integration, Layer Integration, Client/Server Integration, Distributed Services Integration and High-frequency Integration. Big Bang In this approach, all or most of the developed modules are coupled together to form a complete software system or major part of the system and then used for integration testing. The Big Bang method is very effective for saving time in the integration testing process. However, if the test cases and their results are not recorded properly, the entire integration process will be more complicated and may prevent the testing team from achieving the goal of integration testing. A type of Big Bang Integration testing is called Usage Model testing. Usage Model Testing can be used in both software and hardware integration testing. The basis behind this type of integration testing is to run user-like workloads in integrated user-like environments. In doing the testing in this manner, the environment is proofed, while the individual components are proofed indirectly through their use. Usage Model testing takes an optimistic approach to testing, because it expects to have few problems with the individual components. The strategy relies heavily on the component developers to do the isolated unit testing for their product. The goal of the strategy is to avoid redoing the testing done by the developers, and instead flesh-out problems caused by the interaction of the components in the environment. For integration testing, Usage Model testing can be more efficient and provides better test coverage than traditional focused functional integration testing. To be more efficient and accurate, care must be used in defining the user-like workloads for creating realistic scenarios in exercising the environment. This gives confidence that the integrated environment will work as expected for the target customers.
  • 75. 73 Top-down and Bottom-up Bottom Up Testing is an approach to integrated testing where the lowest level components are tested first, then used to facilitate the testing of higher level components. The process is repeated until the component at the top of the hierarchy is tested. All the bottom or low-level modules, procedures or functions are integrated and then tested. After the integration testing of lower level integrated modules, the next level of modules will be formed and can be used for integration testing. This approach is helpful only when all or most of the modules of the same development level are ready. This method also helps to determine the levels of software developed and makes it easier to report testing progress in the form of a percentage. Top Down Testing is an approach to integrated testing where the top integrated modules are tested and the branch of the module is tested step by step until the end of the related module. Sandwich Testing is an approach to combine top down testing with bottom up testing. The main advantage of the Bottom-Up approach is that bugs are more easily found. With Top-Down, it is easier to find a missing branch link Verification and validation Verification and Validation are independent procedures that are used together for checking that a product, service, or system meets requirements and specifications and that it full fills its intended purpose. These are critical components of a quality management system such as ISO 9000. The words "verification" and "validation" are sometimes preceded with "Independent" (or IV&V), indicating that the verification and validation is to be performed by a disinterested third party. It is sometimes said that validation can be expressed by the query "Are you building the right thing?" and verification by "Are you building it right?"In practice, the usage of these terms varies. Sometimes they are even used interchangeably. The PMBOK guide, an IEEE standard, defines them as follows in its 4th edition • "Validation. The assurance that a product, service, or system meets the needs of the customer and other identified stakeholders. It often involves acceptance and suitability with external customers. Contrast with verification."
  • 76. 74 • "Verification. The evaluation of whether or not a product, service, or system complies with a regulation, requirement, specification, or imposed condition. It is often an internal process. Contrast with validation." .Verification is intended to check that a product, service, or system (or portion thereof, or set thereof) meets a set of initial design specifications. In the development phase, verification procedures involve performing special tests to model or simulate a portion, or the entirety, of a product, service or system, then performing a review or analysis of the modelling results. In the post-development phase, verification procedures involve regularly repeating tests devised specifically to ensure that the product, service, or system continues to meet the initial design requirements, specifications, and regulations as time progresses. It is a process that is used to evaluate whether a product, service, or system complies with regulations, specifications, or conditions imposed at the start of a development phase. Verification can be in development, scale-up, or production. This is often an internal process. • Validation is intended to check that development and verification procedures for a product, service, or system (or portion thereof, or set thereof) result in a product, service, or system (or portion thereof, or set thereof) that meets initial requirements. For a new development flow or verification flow, validation procedures may involve modelling either flow and using simulations to predict faults or gaps that might lead to invalid or incomplete verification or development of a product, service, or system (or portion thereof, or set thereof). A set of validation requirements, specifications, and regulations may then be used as a basis for qualifying a development flow or verification flow for a product, service, or system (or portion thereof, or set thereof). Additional validation procedures also include those that are designed specifically to ensure that modifications made to an existing qualified development flow or verification flow will have the effect of producing a product, service, or system (or portion thereof, or set thereof) that meets the initial design requirements, specifications, and regulations; these validations help to keep the flow qualified. It is a process of establishing evidence that provides a high degree of assurance that a product, service, or system accomplishes its intended requirements. This often involves acceptance of fitness for purpose with end users and other product stakeholders. This is often an external process.