Detection of medical instruments project- PART 1

HEALTHCARE
MEDICAL INSTRUMENT
DETECTION SYSTEM

AI is one of the fastest-growing data-driven technologies and plays a growing role in the
healthcare industry. Several ML, image processing and DL algorithms have taken a vital role
in performing clinical diagnoses and suggesting treatments. By 2021, the artificial
intelligence (AI) market in the healthcare industry is expected to grow by $6.6 billion.
This project involves a system that helps in detecting various biomedical devices or
instruments using digital image processing techniques. It consists of:
• Datasets (images) obtained from various sources.
• A large number of output classes.
• Various categories, parts or components related to biomedical instruments.
• Various types of scans that provide a lot of information.
• A simple interface to submit inputs and get the required output information.
This is a basic implementation of AI in the healthcare industry and associated sectors that
classifies images into several categories of medical devices.

OBJECTIVES
• The prime objective is to identify and classify a given image as one of the 20 most
popular medical devices and to provide a brief overview of the detected instrument.
• To develop a dataset containing various images of biomedical devices or instruments.
Trusted sources are used to obtain practical images of the instruments.
• To develop an algorithm for digital image processing. This can include any pre-trained
neural network.
• To train, test and evaluate the performance of the algorithm on a large number of features.
• To use this algorithm to classify various images of biomedical instruments.
• To build a website, portal or app through which input images are uploaded and
processed to get the required output.
• To provide some information along with the output, so that the user gains a basic understanding of
the device.

COMPONENTS USED
• Convolutional Neural Networks (CNNs) are a specific application of deep learning in the field of
images. Such a network is used to classify images or perform other operations on them. The primary
function of a CNN is to extract features from images: it converts multi-dimensional
images into 1-dimensional vectors. The convolutional layers are combined with fully connected layers that
process the vector.
• VGG-16 is one such deep CNN, developed by the Visual Geometry Group at the University of Oxford. In this project this
model has been used for the detection of medical devices.
• The training dataset includes about 40 images per class, and the images were sourced from
shopping websites such as alibaba.com and indiamart.com.
• TensorFlow Keras is a Python library well suited to CNNs. The pre-trained model
is available in this library. It also provides ImageDataGenerator, a class that
generates batches of samples from images.
• Streamlit was used to turn the developed code into a web application.
• ngrok was used to host the website and create a URL for the web application.

WORKING CONCEPT
• A convolutional neural network mainly works on image data and is used for feature
extraction from the image. It is a partially connected neural network. An image can be
interpreted by us but not directly by machines. Instead, machines interpret an image as an array whose
values represent the colour intensity of each pixel. Every colour can be expressed as a
3-dimensional vector known as RGB (Red, Green, Blue). The size of the array is determined by the
dimensions of the image.
• Convolution in mathematics refers to the process of combining two functions. In a
CNN, convolution occurs between the image and a filter, or kernel. It is a mathematical
operation on two arrays. The input image is converted into an array based on
colour and dimension. The kernel or filter is a small array of values (hand-picked in classical
image processing, learned during training in a CNN) that performs a particular function on the image.
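
The convolution described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's code: the 5*5 image, the 3*3 kernel values and the function name `convolve2d_valid` are all assumptions chosen for the example. Like most deep learning libraries, the sketch slides the kernel without flipping it.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image; at each position, sum the element-wise products."""
    image_h, image_w = image.shape
    kernel_h, kernel_w = kernel.shape
    out_h = image_h - kernel_h + 1
    out_w = image_w - kernel_w + 1
    output = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            patch = image[row:row + kernel_h, col:col + kernel_w]
            output[row, col] = np.sum(patch * kernel)
    return output

# Hypothetical 5*5 single-channel image with pixel values 0..24.
image = np.arange(25, dtype=float).reshape(5, 5)
# Illustrative 3*3 kernel that responds to left-to-right changes in intensity.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

feature_map = convolve2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3): smaller than the 5*5 input
```

Because the pixel values in this toy image rise by 1 from column to column, every entry of the feature map has the same value. The output is 3*3 rather than 5*5 because no padding is used.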

Convolution without padding reduces the dimensions of the output: a 3*3 kernel applied to
an n*n input gives an (n-2)*(n-2) output. A technique
known as padding is used to preserve the original dimensions in the output. The only
change in the process is that a boundary of 0s is added around the input before
the convolution is performed.
The filter or kernel need not be applied at every cell. The pattern of
applying the kernel to the input is determined by the stride, which sets the
shift, or gap, between the cells where the filter is applied.
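
The effect of padding and stride on the output size can be checked with a short calculation. The sketch below uses the standard output-size formula for a square input; the function name `conv_output_size` and the example sizes are assumptions for illustration.

```python
import numpy as np

def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Output width (or height) of a convolution over a square input."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# Zero padding: add a one-cell border of 0s around a hypothetical 5*5 input.
image = np.ones((5, 5))
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

no_padding = conv_output_size(5, 3)                          # 5*5 shrinks to 3*3
same_padding = conv_output_size(5, 3, padding=1)             # padding keeps 5*5
strided = conv_output_size(5, 3, padding=1, stride=2)        # stride 2 skips cells
vgg_sized = conv_output_size(224, 3, padding=1)              # a 224*224 input stays 224*224

print(padded.shape, no_padding, same_padding, strided, vgg_sized)
```

With a 3*3 kernel, one ring of zeros is enough to keep the output the same size as the input, while a stride of 2 roughly halves it.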
Pooling is another aspect of the CNN. There are different types of pooling, such as min pooling, max
pooling and average pooling. The process is the same as before, i.e. a window slides over the
input. If a 3*3 window is considered, it is applied over a 3*3 region of the input.
In convolution, the dot product of the region and the kernel is computed. In pooling,
a single value (the minimum, maximum or average of the region) is picked and substituted into the
output. The operation applied to the window decides the type of pooling.
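
The three pooling types differ only in the function applied to each window, as the following sketch shows. The 4*4 feature map, the 2*2 window and the helper name `pool2d` are assumptions for the example.

```python
import numpy as np

def pool2d(feature_map, size=2, stride=2, reducer=np.max):
    """Slide a size*size window over the input and keep one value per window."""
    height, width = feature_map.shape
    out_h = (height - size) // stride + 1
    out_w = (width - size) // stride + 1
    output = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            top, left = row * stride, col * stride
            window = feature_map[top:top + size, left:left + size]
            output[row, col] = reducer(window)
    return output

# Hypothetical 4*4 feature map.
feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 7, 8],
                        [9, 2, 1, 0],
                        [3, 4, 5, 6]], dtype=float)

max_pooled = pool2d(feature_map, reducer=np.max)   # [[6, 8], [9, 6]]
avg_pooled = pool2d(feature_map, reducer=np.mean)  # [[3.75, 5.25], [4.5, 3.0]]
min_pooled = pool2d(feature_map, reducer=np.min)   # [[1, 2], [2, 0]]
```

Unlike convolution, pooling has no values to learn: it only summarises each region.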
Convolution and pooling are the basis of feature extraction. The vector obtained from
this step is fed into a feed-forward network (FFN), which then performs the required task on the image.
The FFN consists of neurons connected to each other. The last layer of the FFN has as many
neurons as there are output classes (in this case, 20).
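
To make the final layer concrete, here is a minimal sketch of a 20-neuron output layer followed by softmax. The 512-element feature vector and the random weights are placeholders, not values from the project; a trained network would supply both.

```python
import numpy as np

NUM_CLASSES = 20  # one output neuron per medical device class

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the maximum for numerical stability
    exponentials = np.exp(shifted)
    return exponentials / np.sum(exponentials)

rng = np.random.default_rng(seed=0)
features = rng.normal(size=512)                            # hypothetical flattened feature vector
weights = rng.normal(scale=0.01, size=(512, NUM_CLASSES))  # placeholder weights
biases = np.zeros(NUM_CLASSES)

scores = features @ weights + biases     # one score per class
probabilities = softmax(scores)
predicted_class = int(np.argmax(probabilities))
```

The class with the highest probability is reported as the detected instrument.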

This project uses the VGG-16 network. This neural network consists of 16 layers in total: 13
layers pertaining to feature extraction and 3 layers pertaining to classification.
The output dimension is changed to 1*1*20, and the input images must be resized to
224*224, since this dimension is compatible with VGGNet. The table below shows the total
number of parameters. About 120 million parameters come from the FC layers and 16M
parameters from the CNN.

Layer          Number of parameters
Convolution    16M
FF1            102M
FF2            16M
Total          134M
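
The parameter counts can be reproduced by hand from the standard VGG-16 layer sizes (3*3 kernels, a 7*7*512 feature map before the first fully connected layer, and two 4096-neuron fully connected layers). The sketch below assumes the third fully connected layer is simply resized to 20 outputs, which the source does not spell out.

```python
NUM_CLASSES = 20  # assumed size of the replaced output layer

def conv_params(in_channels, out_channels, kernel_size=3):
    """Weights plus one bias per output channel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels

def dense_params(inputs, outputs):
    """Weights plus one bias per output neuron."""
    return inputs * outputs + outputs

# (input channels, output channels) of the 13 convolutional layers of VGG-16.
conv_channels = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

conv_total = sum(conv_params(i, o) for i, o in conv_channels)
fc1 = dense_params(7 * 7 * 512, 4096)
fc2 = dense_params(4096, 4096)
fc3 = dense_params(4096, NUM_CLASSES)
total = conv_total + fc1 + fc2 + fc3

print(f"Convolution: {conv_total:,}")
print(f"FF1: {fc1:,}  FF2: {fc2:,}  FF3: {fc3:,}")
print(f"Total: {total:,}")
```

Under these assumptions the fully connected layers match the 102M and 16M figures and the total rounds to 134M, while the convolutional subtotal comes to roughly 14.7M rather than the 16M quoted above.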

Training about 134 million parameters takes days. Hence a GPU (Graphics Processing
Unit) is used to cut the training time to hours.
To train even faster, pre-trained CNNs are used. Researchers have
already trained VGGNet, and the trained weights are available in the Keras library. As a result the training time
drops significantly, to minutes, at the cost of some accuracy. To get the best accuracy, the
whole network has to be trained, which can take hours.
Streamlit is a Python library used to create and design a web
application from the code. The design of the website is done here; it includes a dialog
box for uploading images and a submit button which, when pressed, produces the output.
When the application is opened, the training of the model runs in the
background, and when submit is pressed the evaluation occurs.
ngrok is used to host the website by providing a URL. With the free version, the URL is
temporary, it cannot withstand heavy traffic (many users at a time) and the session can
expire on reloading. Upgrading to a
premium version is recommended to overcome these problems.

1. Uploading the dataset. The image dataset was stored in Google Drive, so Google Drive
had to be mounted in Google Colab.
2. Providing the paths for the training and testing image datasets.
3. Resizing the images to (224, 224), the input size appropriate for VGGNet.
4. Importing VGGNet from the TensorFlow Keras library.
5. Setting layer.trainable = False, which significantly reduces the number of
parameters to be trained. As a result of this step, only 3.2% of all the parameters
have to be trained.
6. Changing the number of output classes to 20 and using the softmax activation function
(softmax provides a probability distribution and is recommended for multiclass
classification).
7. Compiling the model with the cross-entropy loss function and the Adam optimisation
algorithm.
8. Using ImageDataGenerator to obtain the samples for training and testing.
9. Training the model for the desired number of steps and epochs.
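
Steps 3 to 9 could look roughly like the sketch below. It is not the project's actual code: the Flatten + Dense head, the augmentation settings, the batch size, the number of epochs and the directory paths are all assumptions, and the function names `build_model` and `make_generators` are made up for the example.

```python
IMAGE_SIZE = (224, 224)  # input size expected by VGGNet
NUM_CLASSES = 20         # number of medical device classes

def build_model(num_classes=NUM_CLASSES, weights="imagenet"):
    """Frozen pre-trained VGG-16 with a new softmax output layer (assumed head)."""
    # Imported inside the function so the constants above can be used without TensorFlow.
    import tensorflow as tf

    base = tf.keras.applications.VGG16(
        include_top=False, weights=weights, input_shape=IMAGE_SIZE + (3,))
    for layer in base.layers:
        layer.trainable = False  # step 5: keep the pre-trained weights fixed

    flat = tf.keras.layers.Flatten()(base.output)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(flat)  # step 6
    model = tf.keras.Model(inputs=base.input, outputs=outputs)
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])  # step 7
    return model

def make_generators(train_dir, test_dir, batch_size=16):
    """Step 8: batches of resized images read from one sub-folder per class."""
    import tensorflow as tf
    ImageDataGenerator = tf.keras.preprocessing.image.ImageDataGenerator

    # The augmentation settings here are illustrative assumptions.
    train_gen = ImageDataGenerator(rescale=1 / 255, shear_range=0.2,
                                   zoom_range=0.2, horizontal_flip=True)
    test_gen = ImageDataGenerator(rescale=1 / 255)
    train_data = train_gen.flow_from_directory(
        train_dir, target_size=IMAGE_SIZE, batch_size=batch_size,
        class_mode="categorical")
    test_data = test_gen.flow_from_directory(
        test_dir, target_size=IMAGE_SIZE, batch_size=batch_size,
        class_mode="categorical")
    return train_data, test_data

# Example usage in Colab (the paths are placeholders):
# from google.colab import drive; drive.mount("/content/drive")   # step 1
# model = build_model()
# train_data, test_data = make_generators("path/to/train", "path/to/test")
# model.fit(train_data, validation_data=test_data, epochs=5)      # step 9
```

Freezing the convolutional layers means only the new output layer is trained, which is why training takes minutes rather than hours.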

11. Importing Streamlit and the required libraries.
12. Creating the design for the website.
13. Importing the stable ngrok zip file.
14. Unzipping the zip file.
15. Importing the https URL from ngrok.
16. Combining the https from the obtained URL.
17. Running the application on the website.
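
The Streamlit part of these steps (upload box, submit button, prediction) might be sketched as follows. The layout, the function names and the 1/255 pixel scaling are assumptions; `model` stands for a trained Keras model and `class_names` for the list of 20 device names, neither of which is defined here. The ngrok tunnelling steps are not shown.

```python
import numpy as np

IMAGE_SIZE = (224, 224)  # input size expected by VGGNet

def preprocess(image):
    """Resize a PIL image and scale its pixels to [0, 1] (assumed to match training)."""
    resized = image.convert("RGB").resize(IMAGE_SIZE)
    pixels = np.asarray(resized, dtype="float32") / 255.0
    return pixels[np.newaxis, ...]  # add the batch dimension

def top_prediction(probabilities, class_names):
    """Return the most probable class name and its probability."""
    index = int(np.argmax(probabilities))
    return class_names[index], float(probabilities[index])

def run_app(model, class_names):
    """Hypothetical page layout: upload box, submit button, result text."""
    # Imported here so the helpers above can be used without Streamlit installed.
    import streamlit as st
    from PIL import Image

    st.title("Medical Instrument Detection")
    uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
    submitted = st.button("Submit")
    if uploaded is not None and submitted:
        image = Image.open(uploaded)
        st.image(image, caption="Uploaded image")
        probabilities = model.predict(preprocess(image))[0]
        label, confidence = top_prediction(probabilities, class_names)
        st.write(f"Detected instrument: {label} ({confidence:.1%})")
```

A script that calls `run_app` would be started with `streamlit run`, and the local address it serves is what ngrok then exposes through a public URL.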

• https://pro.panopto.com/Panopto/Pages/Viewer.aspx?tid=84c4e28d-b089-495b-b75a-ad510081cda9

• A website, portal or app could be developed for better deployment of the
trained algorithm.
• Various images can be quickly labelled with their respective category of medical
device.
• This model can be used for educational purposes. Students on an IV
to a hospital or industry can use it to learn about biomedical devices.
• It can be used by the general public to learn some facts about, and the basic working of,
common biomedical devices.
• At an advanced level, the algorithm could be deployed in industries or PCBs with a
mounted camera to classify real biomedical instruments.
• If trained on medical datasets, this could help in categorizing images (or scans)
related to a particular organ or organ system.
• If trained on medication datasets, this could help in better drug classification
via image analysis.

ADVANTAGES
People could use this system (as a website or an app) to gain information about the
medical devices around them.
The required output information is obtained from the
corresponding input image in a short period of time.
Scans can be classified into various categories depending on the recording
instrument or technique.
Good accuracy (about 93%) has been obtained on the datasets collected from online
shopping websites. Even better accuracy may be achievable with images from hospitals.
Several platforms or sites can use digital image processing as a feature to label various
medical instruments.
This idea could help the healthcare industry by keeping a timeline or track of
advancements in biomedical instruments, devices and techniques.

DISADVANTAGES
Without proper and practical datasets, the prediction or classification system
cannot detect with high accuracy.
The designs of the ECG, EEG and EMG machines, in both printer and video
versions, are mostly similar; the difference lies in the point of application. Hence
there is some lack of clarity in detecting these instruments.
The designs of the MRI and CT scan machines are similar, and most CT
scan machines also perform PET scans. These similarities lead to some
misclassification.
There is some confusion in detecting the pulse oximeter and the blood pressure machine
due to the similarities in the display monitor.
If most of the training data shows low-end or older versions of medical devices
or instruments (without updating to the portable and advanced ones), the model will
have trouble detecting advanced and higher-end devices.

The dataset does not contain enough samples for ideal predictions. The model
has to be trained with many more images.
We have taken images from shopping websites. It is recommended to take pictures
of devices directly at the hospital or manufacturing industry, from different angles,
positions, views and distances. Such images can improve
the model’s performance.
Taking images directly at a hospital or manufacturing industry is practically
not possible in the current pandemic situation.
The URL obtained from ngrok is not permanent and has some issues since we
are using the free version. There are also some errors while uploading the image
that could not be rectified, and the website stops working when reloaded.

With respect to future developments, the image processing technology market is expected to
grow from 2020 to 2027 and to reach USD 25,702 million by 2027
(a growth rate of 21.8%). Healthcare is one of the prime industries that requires AI and image
processing techniques to provide better classification and diagnostic results. Latest data shows
that:
• The manufacturing of ventilators to treat COVID-19 patients has increased.
• The production of pulse oximeters and oxygen concentrators has increased in response to COVID-19.
• Several startups (e.g. AgVa Healthcare) have used their platforms to promote portable and advanced versions of
medical devices.
• Detection, scanning and testing techniques are being modified to cope with the increasing risk
of chronic diseases.
• Deep learning methods, in particular CNNs, are being incorporated for the detection of diseases and
other applications in the medical field.

CONCLUSION
We have successfully implemented VGGNet to detect and classify a given image
among 20 common biomedical instruments, achieving about
93% accuracy. The model was able to correctly detect the uploaded images of biomedical
instruments of the 20 classes. The model is also able to detect the machine even when only an image
of one of its parts is given (it correctly predicted endoscopy when an image of the camera was
uploaded and catheter when an image of the tube was uploaded).
Accuracy: 93.4%
Time taken: 5 mins and 12 secs (depends on the number of epochs)

PROFILE OF THE TEAM
SUBMITTED BY:-
NAME- Arjun Bhattacharya
Dept.- Biomedical Engineering
Year- II Year
e-mail- arjun.bhattacharya.2019.bme@Rajalakshmi.edu.in
Contact No.- 7091832747
SUBMITTED BY:-
NAME- V. A. Sairam
Dept.- Biomedical Engineering
Year- II Year
e-mail- sairam.va.2019.bme@Rajalakshmi.edu.in
Contact No.- 7010127706
COLLEGE→ Rajalakshmi Engineering
