In many industrial, medical, and scientific image processing applications, feature and pattern recognition
techniques are used to match specific features in an image against a known template. Despite the capabilities of
these techniques, some applications require simultaneous analysis of multiple, complex, and irregular features
within an image, as in semiconductor wafer inspection. In wafer inspection, discovered defects are often complex
and irregular and demand more human-like inspection techniques to recognize irregularities. By incorporating
neural network techniques, such image processing systems can be trained on a large number of images until the
system eventually learns to recognize irregularities. The aim of this project is to develop the framework of a
machine-learning system that can classify objects of different categories. The framework utilizes MATLAB
toolboxes such as the Computer Vision Toolbox and the Neural Network Toolbox.
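As a toy illustration of the classical template-matching baseline that the abstract contrasts with neural approaches, the sketch below slides a small template over an image and scores each position with normalized cross-correlation. It is a minimal pure-Python sketch on toy grids, not the project's MATLAB implementation.

```python
# Toy template matching via normalized cross-correlation (NCC).
# Illustrative only: real inspection pipelines operate on full images.

def ncc(patch, template):
    """Normalized cross-correlation between two equal-sized 2-D patches."""
    n = len(template) * len(template[0])
    pm = sum(sum(r) for r in patch) / n
    tm = sum(sum(r) for r in template) / n
    num = den_p = den_t = 0.0
    for pr, tr in zip(patch, template):
        for p, t in zip(pr, tr):
            num += (p - pm) * (t - tm)
            den_p += (p - pm) ** 2
            den_t += (t - tm) ** 2
    den = (den_p * den_t) ** 0.5
    return num / den if den else 0.0

def best_match(image, template):
    """Slide the template over the image; return ((row, col), best score)."""
    th, tw = len(template), len(template[0])
    best, best_pos = -2.0, None
    for i in range(len(image) - th + 1):
        for j in range(len(image[0]) - tw + 1):
            patch = [row[j:j + tw] for row in image[i:i + th]]
            s = ncc(patch, template)
            if s > best:
                best, best_pos = s, (i, j)
    return best_pos, best
```

An exact copy of the template embedded in the image scores 1.0 at its location, which is why NCC struggles with the irregular, non-template-like defects described above.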
New Research Articles, October 2019 Issue, Signal & Image Processing: An International Journal (SIPIJ)
Signal & Image Processing: An International Journal (SIPIJ)
ISSN: 0976-710X (Online); 2229-3922 (Print)
http://www.airccse.org/journal/sipij/index.html
Current Issue: October 2019, Volume 10, Number 5
Free-Reference Image Quality Assessment Framework Using Metrics Fusion and Dimensionality Reduction
Besma Sadou1, Atidel Lahoulou2, Toufik Bouden1, Anderson R. Avila3, Tiago H. Falk3 and Zahid Akhtar4, 1Non Destructive Testing Laboratory, University of Jijel, Algeria, 2LAOTI laboratory, University of Jijel, Algeria, 3University of Québec, Canada and 4University of Memphis, USA
Test-cost-sensitive Convolutional Neural Networks with Expert Branches
Mahdi Naghibi1, Reza Anvari1, Ali Forghani1 and Behrouz Minaei2, 1Malek-Ashtar University of Technology, Iran and 2Iran University of Science and Technology, Iran
Robust Image Watermarking Method using Wavelet Transform
Omar Adwan, The University of Jordan, Jordan
Improvements of the Analysis of Human Activity Using Acceleration Record of Electrocardiographs
Itaru Kaneko1, Yutaka Yoshida2 and Emi Yuda3, 1&2Nagoya City University, Japan and 3Tohoku University, Japan
http://www.airccse.org/journal/sipij/vol10.html
TOP 5 Most Viewed Articles From Academia in 2019
Discovering Anomalies Based on Saliency Detection and Segmentation in Surveillance System
This paper proposes extracting salient objects from motion fields. Salient object detection is an important technique for many content-based applications, but it becomes challenging when handling cluttered saliency maps, which cannot completely highlight salient object regions or suppress background regions. We present algorithms for recognizing activity in monocular video sequences based on a discriminative gradient random field. Surveillance videos capture the behavioral activities of the objects accessing the surveillance system. Some behaviors are frequent sequences of events, while others deviate from the known frequent sequences. These deviating events are termed anomalies and may indicate criminal activity. In the past, work was based on discovering known abnormal events. Here, unknown abnormal activities are detected and alerted so that early action can be taken.
K. Shankar | Dr. S. Srinivasan | Dr. T. S. Sivakumaran | K. Madhavi Priya, "Discovering Anomalies Based on Saliency Detection and Segmentation in Surveillance System", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2, Issue-1, December 2017. URL: http://www.ijtsrd.com/papers/ijtsrd5871.pdf
Paper URL: http://www.ijtsrd.com/engineering/computer-engineering/5871/discovering-anomalies-based-on-saliency-detection-and-segmentation-in-surveillance-system/k-shankar
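The frequent-versus-deviant event idea in the abstract can be sketched as a simple frequency model: event n-grams seen often in training are treated as normal, and rare or unseen ones are flagged. The n-gram length and support threshold below are illustrative assumptions, not the paper's parameters.

```python
# Frequency-based anomaly flagging over event sequences (toy sketch).
from collections import Counter

def train_normal_model(event_logs, n=2):
    """Count n-grams of events across training logs of 'normal' behavior."""
    counts = Counter()
    for log in event_logs:
        for i in range(len(log) - n + 1):
            counts[tuple(log[i:i + n])] += 1
    return counts

def flag_anomalies(log, counts, n=2, min_support=2):
    """Return the n-grams in `log` whose training support is below threshold."""
    return [tuple(log[i:i + n])
            for i in range(len(log) - n + 1)
            if counts[tuple(log[i:i + n])] < min_support]
```

For example, training on repeated "enter, walk, exit" logs makes any "run" transition at test time an anomaly, mirroring the known-frequent versus deviant distinction above.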
Neural Network Based Numerical Digit Recognition Using NNT in MATLAB
Artificial neural networks are models inspired by the human nervous system that are capable of learning. One of
the important applications of artificial neural networks is character recognition, which finds application in a
number of areas such as banking, security products, hospitals, and robotics. This paper is based on a system that
recognizes an English numeral, given by the user, after being trained on the features of the numbers to be
recognized using the Neural Network Toolbox (NNT). The system has a neural network at its core, which is first
trained on a database. The training of the neural network extracts the features of the English numbers and stores
them in the database. The next phase of the system is to recognize the number given by the user: the features of
that number are extracted, compared with the feature database, and the recognized number is displayed.
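The recognition phase described above, in which features of the input are compared against a stored feature database and the closest stored digit wins, can be sketched as a nearest-neighbor lookup. The four-value feature vectors below are invented toy values, not the paper's NNT-extracted features.

```python
# Nearest-neighbor matching against a stored feature database (toy sketch).

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def recognize(features, database):
    """Return the digit label whose stored feature vector is closest."""
    return min(database, key=lambda d: euclidean(features, database[d]))

# Hypothetical feature database: digit label -> stored feature vector.
db = {0: [1.0, 1.0, 1.0, 1.0], 1: [0.0, 1.0, 0.0, 1.0], 7: [1.0, 1.0, 0.0, 0.5]}
```

A query vector close to the stored entry for "1" is recognized as 1; a trained network's learned features replace these toy vectors in the actual system.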
A Comprehensive Study on Handwritten Character Recognition System
IOSR Journal of Computer Engineering (IOSR-JCE) is a double-blind peer-reviewed international journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publication of high-quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high-quality technical notes are invited for publication.
Deep Learning for X-ray Image to Text Generation
Motivated by the recent success of supervised and weakly supervised common object discovery, in this work we move one step further to tackle common object discovery in a fully unsupervised way. Object co-localization aims at simultaneously localizing objects of the same class across a group of images. Traditional object localization and detection usually trains specific object detectors, which require bounding-box annotations of object instances, or at least image-level labels to indicate the presence or absence of objects in an image. Given a collection of images without any annotations, our proposed fully unsupervised method simultaneously discovers images that contain common objects and localizes the common objects in the corresponding images. It has long been envisioned that machines will one day understand the visual world at a human level of intelligence. We can now build very deep convolutional neural networks (CNNs) and achieve impressively low error rates for tasks like large-scale image classification. However, in tasks like image classification the content of an image is usually simple, containing a predominant object to be classified. The situation is much more challenging when we want computers to understand complex scenes, and image captioning is one such task. The first step in training a model to predict the category of a given X-ray image is to annotate each X-ray image in a training set with a label from a predefined set of categories. Through such fully supervised training, the computer learns how to classify an X-ray image and convert it into text.
Mahima Chaddha | Sneha Kashid | Snehal Bhosale | Prof. Radha Deoghare, "Deep Learning for X-ray Image to Text Generation", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3, Issue-3, April 2019. URL: https://www.ijtsrd.com/papers/ijtsrd23168.pdf
Paper URL: https://www.ijtsrd.com/engineering/information-technology/23168/deep-learning-for-x-ray-image-to-text-generation/mahima-chaddha
MediaEval 2017 - Satellite Task: Flood Detection Using Social Media Data and ...
Presenter: Muhammad Atif Tahir, National University of Computer and Emerging Sciences, Karachi Campus, Pakistan
Paper: http://ceur-ws.org/Vol-1984/Mediaeval_2017_paper_43.pdf
Video: https://youtu.be/n8PlWwtzWXo
Authors: Muhammad Hanif, Muhammad Atif Tahir, Mahrukh Khan, Muhammad Rafi
Abstract: Natural disasters destroy valuable resources, and it is necessary to recognize them so that appropriate strategies can be designed. In the recent past, social networks have been a very good source of event-specific information. This working-notes paper is based on the Disaster Image Retrieval from Social Media (DIRSM) task of MediaEval 2017. The dataset of images and their relevant metadata is taken from various social networks, including Twitter and Flickr. An ensemble approach is adopted in this paper, in which different visual and metadata features are integrated. Kernel discriminant analysis using spectral regression is then used as a dimensionality reduction technique. Mean Average Precision (MAP) at various cutoffs is reported.
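The MAP-at-cutoff metric that the abstract reports can be sketched as follows; the relevance judgments used in the test are toy values, not the task's ground truth.

```python
# Mean Average Precision at a rank cutoff k (toy sketch).

def average_precision_at_k(relevant, ranked, k):
    """AP@k: mean of the precision values at each relevant hit in the top k."""
    hits, precisions = 0, []
    for i, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / i)
    return sum(precisions) / min(len(relevant), k) if relevant else 0.0

def mean_average_precision(queries, k):
    """MAP@k averaged over (relevant_set, ranked_list) pairs."""
    return sum(average_precision_at_k(r, ranked, k)
               for r, ranked in queries) / len(queries)
```

Reporting MAP at several cutoffs, as the paper does, just means calling this with different values of k.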
Rule Based Algorithm for Handwritten Characters Recognition - Randa Elanwar
Presentation of a Master's Dissertation
Content:
Rule-based Algorithm for Off-line Isolated Handwritten character recognition
Rule-based Algorithm for On-line Arabic Cursive Handwriting Segmentation and Recognition
Artificial Neural Network / Handwritten Character Recognition - Dr. Uday Saikia
1. Overview
2. Development of System
3. GCR Model
4. Proposed Model
5. Background Information
6. Preprocessing
7. Architecture
8. ANN (Artificial Neural Network)
9. How the Human Brain Learns?
10. Synapse
11. The Neuron Model
12. A Typical Feed-Forward Neural Network Model
13. The Neural Network
14. Training of Characters Using Neural Networks
15. Regression of Trained Neural Networks
16. Training State of Neural Networks
17. Graphical User Interface
Deep Convolutional Neural Network for Hand Sign Language Recognition Using Mo...
Image processing systems based on computer vision have received much attention from science and technology experts. Research on image processing is needed in the development of human-computer interaction, such as hand recognition or gesture recognition for people with hearing impairments and deaf people. In this research we collected hand gesture data and used a simple deep neural network architecture, which we call model E, to recognize the actual hand gesture. The dataset we used was collected from kaggle.com in the form of ASL (American Sign Language) datasets. We compare accuracy with existing models such as AlexNet to see how robust our model is. We find that adjusting the kernel size and the number of epochs for each model also gives different results. After comparing with the AlexNet model, we find that our model E performs better, with 96.82% accuracy.
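One reason kernel size matters, as the abstract observes, is that it changes the shape (and receptive field) of every convolutional output map: an h-by-w input convolved with a k-by-k kernel in "valid" mode yields an (h-k+1)-by-(w-k+1) map. A minimal pure-Python convolution makes this concrete; it is a sketch, not the paper's model E.

```python
# Toy 2-D "valid" cross-correlation showing how kernel size shapes the output.

def conv2d_valid(image, kernel):
    """Slide a k-by-k kernel over a 2-D list-of-lists image (valid mode)."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(k) for b in range(k)))
        out.append(row)
    return out
```

A 4x4 input with a 3x3 kernel yields a 2x2 map, while a 2x2 kernel would yield 3x3, which is why kernel size interacts with the rest of the architecture during tuning.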
This slide deck is about pedestrian re-identification (re-ID) based on deep learning methods. I mainly review some prevalent methods and try to give some insights into future work. Zhedong Zheng, 2017.7.27
Two Level Data Security Using Steganography and 2D Cellular Automata - eSAT Publishing House
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of engineering and technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of engineering and technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of engineering and technology.
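As a hedged sketch of the steganographic level of a two-level scheme like the one titled above, the following hides message bits in pixel least-significant bits (LSBs); the second level (2D cellular automata scrambling) is omitted, so this is one layer of such a design, not the paper's method.

```python
# LSB steganography (toy sketch): one message bit per pixel value.

def embed(pixels, bits):
    """Overwrite each pixel's least-significant bit with the next message bit."""
    return [(p & ~1) | b for p, b in zip(pixels, bits)] + pixels[len(bits):]

def extract(pixels, n):
    """Read back the first n least-significant bits."""
    return [p & 1 for p in pixels[:n]]
```

Because only the lowest bit changes, each pixel value moves by at most 1, which is what keeps the embedding visually imperceptible; the cellular-automaton stage would then scramble where the bits land.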
MULTI-LEVEL FEATURE FUSION BASED TRANSFER LEARNING FOR PERSON RE-IDENTIFICATION
Most currently known methods treat the person re-identification task as a classification problem and commonly use neural networks. However, these methods use only high-level convolutional features to express the feature representation of pedestrians. Moreover, the current data sets for person re-identification are relatively small. Under the limitation of the training set size, deep convolutional networks are difficult to train adequately, so it is worthwhile to introduce auxiliary data sets to help training. To solve this problem, this paper proposes a novel deep transfer learning method that combines the comparison model with the classification model and multi-level fusion of convolutional features on the basis of transfer learning. In a multi-layer convolutional network, the features of each layer are a dimensionality reduction of the previous layer's results, but the information in multi-level features is not only inclusive but also complementary. We can use the information gap between different layers of a convolutional neural network to extract a better feature expression. Finally, the proposed algorithm is fully tested on four data sets (VIPeR, CUHK01, GRID and PRID450S). The obtained re-identification results prove the effectiveness of the algorithm.
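The multi-level fusion idea, pooling a descriptor from each convolutional level and concatenating them so that shallow and deep information complement each other, can be sketched as below. The per-layer feature maps are toy lists, not real CNN activations, and the single-scalar pooling is a deliberate simplification.

```python
# Multi-level feature fusion (toy sketch): pool each layer, then concatenate.

def global_average_pool(feature_map):
    """Collapse a 2-D feature map into one scalar descriptor."""
    vals = [v for row in feature_map for v in row]
    return sum(vals) / len(vals)

def fuse(layer_maps):
    """One pooled value per layer, concatenated into a fused descriptor."""
    return [global_average_pool(m) for m in layer_maps]
```

In a real network each layer would contribute a vector rather than a scalar, but the principle is the same: the fused descriptor carries complementary information from every depth.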
CONTENT RECOVERY AND IMAGE RETRIEVAL IN IMAGE DATABASE CONTENT RETRIEVING IN TE...
Digital images are used in magazines, blogs, websites, television and more. Digital image processing
techniques are used for feature selection, pattern extraction, classification and retrieval requirements. Color,
texture and shape features are used in image processing, which also supports the computer graphics and
computer vision domains. Scene text recognition is performed with two schemes: character recognizer and
binary character classifier models. A character recognizer is trained to predict the category of a character in an
image patch. A binary character classifier is trained for each character class to predict the existence of that
category in an image patch. Scene text recognition is performed on detected text regions. A pixel-based layout
analysis method is adopted to extract text regions and segment text characters in images. Text character
segmentation is carried out with color uniformity and horizontal alignment of text characters. A discriminative
character descriptor is designed by combining several feature detectors and descriptors. The Histogram of
Oriented Gradients (HOG) is used to identify the character descriptors. Character structure is modeled for each
character class by designing stroke configuration maps. The scene text extraction scheme also supports smart
mobile devices. Text recognition methods are used with text understanding and text retrieval applications. The
text recognition scheme is enhanced with a content-based image retrieval process. The system is integrated with
additional representative and discriminative features for the text structure modeling process. The system is
enhanced to perform text- and word-level recognition using lexicon analysis. The training process includes a
word database update task.
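The core of the HOG descriptor mentioned above is a magnitude-weighted histogram of gradient orientations. A minimal sketch of that binning step, omitting HOG's cells, blocks and block normalization, looks like:

```python
# Magnitude-weighted gradient orientation histogram (core of HOG, toy sketch).
import math

def orientation_histogram(gx, gy, bins=8):
    """gx, gy: lists of per-pixel gradient components. Returns a
    magnitude-weighted histogram over unsigned orientations [0, 180)."""
    hist = [0.0] * bins
    for dx, dy in zip(gx, gy):
        mag = math.hypot(dx, dy)
        ang = math.degrees(math.atan2(dy, dx)) % 180.0
        hist[int(ang // (180.0 / bins)) % bins] += mag
    return hist
```

A purely horizontal gradient lands in the first bin and a vertical one in the middle bin, which is how the descriptor captures stroke direction in character patches.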
Image Captioning Generator using Deep Machine Learning
Technology's scope has evolved into one of the most powerful tools for human development in a variety of fields. AI and machine learning have become among the most powerful tools for completing tasks quickly and accurately without the need for human intervention. This project demonstrates how deep machine learning can be used to create a caption or a sentence for a given picture. This can be used for visually impaired persons, as well as in automobiles for self-identification, and in various applications for quick and easy verification. A Convolutional Neural Network (CNN) is used to describe the image, and Long Short-Term Memory (LSTM) is used to organize the right meaningful sentences in this model. The Flickr8k and Flickr30k datasets were used for training.
Sreejith S P | Vijayakumar A, "Image Captioning Generator using Deep Machine Learning", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5, Issue-4, June 2021. URL: https://www.ijtsrd.com/papers/ijtsrd42344.pdf
Paper URL: https://www.ijtsrd.com/computer-science/artificial-intelligence/42344/image-captioning-generator-using-deep-machine-learning/sreejith-s-p
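The LSTM's word-by-word sentence construction can be illustrated with a toy greedy decoder: at each step, emit the highest-scoring next word given the previous one until an end token appears. The bigram score table below is invented for illustration and stands in for the trained CNN+LSTM model.

```python
# Toy greedy decoding loop, standing in for an LSTM captioner's word-by-word
# generation. The score table is a hypothetical stand-in for the model.

scores = {
    "<start>": {"a": 0.9, "the": 0.8},
    "a":       {"dog": 0.7, "cat": 0.6},
    "the":     {"dog": 0.5},
    "dog":     {"runs": 0.8, "<end>": 0.3},
    "cat":     {"<end>": 0.9},
    "runs":    {"<end>": 0.9},
}

def greedy_caption(scores, max_len=10):
    """Follow the highest-scoring next word until <end> or max_len."""
    word, out = "<start>", []
    for _ in range(max_len):
        word = max(scores[word], key=scores[word].get)
        if word == "<end>":
            break
        out.append(word)
    return " ".join(out)
```

Real captioners condition these scores on the CNN's image encoding and often use beam search instead of this single greedy path.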
Real Time Object Detection with Audio Feedback using YOLOv3
In this paper, we propose a system that combines real-time object detection using the YOLOv3 algorithm with audio feedback to assist visually impaired individuals in locating and identifying objects in their surroundings. The YOLOv3 algorithm is a state-of-the-art object detection algorithm that has been used in numerous studies for various applications. Audio feedback has also been studied in previous research as a useful tool for assisting visually impaired individuals. Our proposed system builds on the effectiveness of both of these technologies to provide a valuable tool for improving the independence and quality of life of visually impaired individuals. We present the architecture of our proposed system, which includes a YOLOv3 model for object detection and a text-to-speech engine for providing audio feedback. We also present the results of our experiments, which demonstrate the effectiveness of our system in detecting and identifying objects in real time. Our proposed system can be used in various settings, such as indoor and outdoor environments, and can assist visually impaired individuals in activities such as navigation and object identification.
Dr. K. Nagi Reddy | K. Sreeja | M. Sreenivasulu Reddy | K. Sireesha | M. Triveni, "Real Time Object Detection with Audio Feedback using Yolo_v3", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-7, Issue-2, April 2023. URL: https://www.ijtsrd.com/papers/ijtsrd55158.pdf
Paper URL: https://www.ijtsrd.com/engineering/electronics-and-communication-engineering/55158/real-time-object-detection-with-audio-feedback-using-yolov3/dr-k-nagi-reddy
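Detectors in the YOLO family score predicted boxes against objects with intersection over union (IoU), both for matching and for non-maximum suppression. A minimal sketch with (x1, y1, x2, y2) corner boxes:

```python
# Intersection over union for axis-aligned boxes (x1, y1, x2, y2).

def iou(a, b):
    """Overlap area divided by union area; 0.0 for disjoint boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0
```

Identical boxes score 1.0 and disjoint boxes 0.0; a detection is typically kept when its IoU with a ground-truth box clears a threshold such as 0.5.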
Real-time Eyeglass Detection Using Transfer Learning for Non-standard Facial...
The aim of this paper is to build a real-time eyeglass detection framework based on deep features present in facial or ocular images, which serve as a prime factor in forensic analysis, authentication systems and more. Generally, eyeglass detection methods have been executed using cleaned and fine-tuned facial datasets; this results in well-developed models, but the slightest deviation can affect the performance of the model, giving poor results on real-time non-standard facial images. Therefore, a robust model is introduced that is trained on custom non-standard facial data. An Inception V3 architecture based pre-trained convolutional neural network (CNN) is used and fine-tuned using model hyper-parameters to achieve high accuracy and good precision on non-standard facial images in real time. This resulted in accuracy scores of about 99.2% and 99.9% for the training and testing datasets respectively, in a short amount of time, thereby showing the robustness of the model in all conditions.
End-to-end deep auto-encoder for segmenting a moving object with limited tra...IJECEIAES
Deep learning-based approaches have been widely used in various applications, including segmentation and classification. However, a large amount of data is required to train such techniques. Indeed, in the surveillance video domain, there are few accessible data due to acquisition and experiment complexity. In this paper, we propose an end-to-end deep auto-encoder system for object segmenting from surveillance videos. Our main purpose is to enhance the process of distinguishing the foreground object when only limited data are available. To this end, we propose two approaches based on transfer learning and multi-depth auto-encoders to avoid over-fitting by combining classical data augmentation and principal component analysis (PCA) techniques to improve the quality of training data. Our approach achieves good results outperforming other popular models, which used the same principle of training with limited data. In addition, a detailed explanation of these techniques and some recommendations are provided. Our methodology constitutes a useful strategy for increasing samples in the deep learning domain and can be applied to improve segmentation accuracy. We believe that our strategy has a considerable interest in various applications such as medical and biological fields, especially in the early stages of experiments where there are few samples.
Semantic Concept Detection in Video Using Hybrid Model of CNN and SVM Classif...CSCJournals
In today's era of digitization and fast internet, many video are uploaded on websites, a mechanism is required to access this video accurately and efficiently. Semantic concept detection achieve this task accurately and is used in many application like multimedia annotation, video summarization, annotation, indexing and retrieval. Video retrieval based on semantic concept is efficient and challenging research area. Semantic concept detection bridges the semantic gap between low level extraction of features from key-frame or shot of video and high level interpretation of the same as semantics. Semantic Concept detection automatically assigns labels to video from predefined vocabulary. This task is considered as supervised machine learning problem. Support vector machine (SVM) emerged as default classifier choice for this task. But recently Deep Convolutional Neural Network (CNN) has shown exceptional performance in this area. CNN requires large dataset for training. In this paper, we present framework for semantic concept detection using hybrid model of SVM and CNN. Global features like color moment, HSV histogram, wavelet transform, grey level co-occurrence matrix and edge orientation histogram are selected as low level features extracted from annotated groundtruth video dataset of TRECVID. In second pipeline, deep features are extracted using pretrained CNN. Dataset is partitioned in three segments to deal with data imbalance issue. Two classifiers are separately trained on all segments and fusion of scores is performed to detect the concepts in test dataset. The system performance is evaluated using Mean Average Precision for multi-label dataset. The performance of the proposed framework using hybrid model of SVM and CNN is comparable to existing approaches.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Event Management System Vb Net Project Report.pdfKamal Acharya
In present era, the scopes of information technology growing with a very fast .We do not see any are untouched from this industry. The scope of information technology has become wider includes: Business and industry. Household Business, Communication, Education, Entertainment, Science, Medicine, Engineering, Distance Learning, Weather Forecasting. Carrier Searching and so on.
My project named “Event Management System” is software that store and maintained all events coordinated in college. It also helpful to print related reports. My project will help to record the events coordinated by faculties with their Name, Event subject, date & details in an efficient & effective ways.
In my system we have to make a system by which a user can record all events coordinated by a particular faculty. In our proposed system some more featured are added which differs it from the existing system such as security.
Courier management system project report.pdfKamal Acharya
It is now-a-days very important for the people to send or receive articles like imported furniture, electronic items, gifts, business goods and the like. People depend vastly on different transport systems which mostly use the manual way of receiving and delivering the articles. There is no way to track the articles till they are received and there is no way to let the customer know what happened in transit, once he booked some articles. In such a situation, we need a system which completely computerizes the cargo activities including time to time tracking of the articles sent. This need is fulfilled by Courier Management System software which is online software for the cargo management people that enables them to receive the goods from a source and send them to a required destination and track their status from time to time.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Automobile Management System Project Report.pdfKamal Acharya
The proposed project is developed to manage the automobile in the automobile dealer company. The main module in this project is login, automobile management, customer management, sales, complaints and reports. The first module is the login. The automobile showroom owner should login to the project for usage. The username and password are verified and if it is correct, next form opens. If the username and password are not correct, it shows the error message.
When a customer search for a automobile, if the automobile is available, they will be taken to a page that shows the details of the automobile including automobile name, automobile ID, quantity, price etc. “Automobile Management System” is useful for maintaining automobiles, customers effectively and hence helps for establishing good relation between customer and automobile organization. It contains various customized modules for effectively maintaining automobiles and stock information accurately and safely.
When the automobile is sold to the customer, stock will be reduced automatically. When a new purchase is made, stock will be increased automatically. While selecting automobiles for sale, the proposed software will automatically check for total number of available stock of that particular item, if the total stock of that particular item is less than 5, software will notify the user to purchase the particular item.
Also when the user tries to sale items which are not in stock, the system will prompt the user that the stock is not enough. Customers of this system can search for a automobile; can purchase a automobile easily by selecting fast. On the other hand the stock of automobiles can be maintained perfectly by the automobile shop manager overcoming the drawbacks of existing system.
Explore the innovative world of trenchless pipe repair with our comprehensive guide, "The Benefits and Techniques of Trenchless Pipe Repair." This document delves into the modern methods of repairing underground pipes without the need for extensive excavation, highlighting the numerous advantages and the latest techniques used in the industry.
Learn about the cost savings, reduced environmental impact, and minimal disruption associated with trenchless technology. Discover detailed explanations of popular techniques such as pipe bursting, cured-in-place pipe (CIPP) lining, and directional drilling. Understand how these methods can be applied to various types of infrastructure, from residential plumbing to large-scale municipal systems.
Ideal for homeowners, contractors, engineers, and anyone interested in modern plumbing solutions, this guide provides valuable insights into why trenchless pipe repair is becoming the preferred choice for pipe rehabilitation. Stay informed about the latest advancements and best practices in the field.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Water scarcity is the lack of fresh water resources to meet the standard water demand. There are two type of water scarcity. One is physical. The other is economic water scarcity.
Aswathy K S, Int. Journal of Engineering Research and Applications, www.ijera.com, ISSN: 2248-9622, Vol. 6, Issue 2 (Part 4), February 2016, pp. 87-93
Modelling Framework of a Neural Object Recognition
Aswathy K S*, Prof. (Dr.) Gnana Sheela K**
*(Department of Electronics and Communication, Toc H Institute of Science and Technology, Kerala, India)
** (Department of Electronics and Communication, Toc H Institute of Science and Technology, Kerala, India)
ABSTRACT
In many industrial, medical and scientific image processing applications, various feature and pattern recognition
techniques are used to match specific features in an image with a known template. Despite the capabilities of
these techniques, some applications require simultaneous analysis of multiple, complex, and irregular features
within an image, as in semiconductor wafer inspection. In wafer inspection, discovered defects are often complex
and irregular and demand more human-like inspection techniques to recognize irregularities. By incorporating
neural network techniques, such image processing systems can be trained on a large number of images until the
system eventually learns to recognize irregularities. The aim of this project is to develop a framework for a
machine-learning system that can classify objects of different categories. The framework utilizes MATLAB
toolboxes such as the Computer Vision Toolbox and the Neural Network Toolbox.
Keywords: Artificial Intelligence, Neural Networks, Computer Vision, Learning, Bag of words, Scale Invariant
Feature Transform.
I. INTRODUCTION
Today, machine vision applications crop up in
many industries, including semiconductor,
electronics, pharmaceuticals, packaging, medical
devices, automotive and consumer goods. Machine
vision systems offer a non-contact means of
inspecting and identifying parts, accurately
measuring dimensions, or guiding robots or other
machines during pick-and-place and other assembly
operations. In the near term, computer vision systems
that can discern the story in a picture will enable
people to search photo or video archives and find
highly specific images. Eventually, these advances
will lead to robotic systems able to navigate unknown
situations. Driverless cars would also be made safer.
However, it also raises the prospect of even greater
levels of government surveillance. Two important
specifications in any vision system are the sensitivity
and the resolution. The better the resolution, the more
confined the field of vision. Sensitivity and resolution
are interdependent. All other factors held constant,
increasing the sensitivity reduces the resolution, and
improving the resolution reduces the sensitivity. In
many industrial, medical and scientific image
processing applications, various feature and pattern
recognition techniques are used to match specific
features in an image with a known template. Despite
the capabilities of these techniques, some
applications require simultaneous analysis of
multiple, complex, and irregular features within an
image as in semiconductor wafer inspection. In wafer
inspection discovered defects are often complex and
irregular and demand more human-like inspection
techniques to recognize irregularities. By
incorporating neural network techniques, such image
processing systems can be trained on a large number
of images until the system eventually learns to
recognize irregularities. Object recognition is the
task of finding and identifying objects in an image or
video sequence. Humans recognize a multitude of
objects in images with little effort, despite the fact
that the image of the objects may vary somewhat in
different viewpoints, in many different sizes and
scales or even when they are translated or rotated.
Objects can even be recognized when they are
partially obstructed from view. This task is still a
challenge for computer vision systems. Many
approaches to the task have been implemented over
multiple decades.
Fig 1.1 Vision - Human vs. Machine
II. LITERATURE SURVEY
Automatically generating captions for an image is
a task very close to the heart of scene understanding.
This requires identifying and detecting objects,
people, scenes, etc., reasoning about spatial
relationships and properties of objects, and combining
several sources of information into a coherent
sentence. Describing an image or a scene is therefore
a complex task and an important problem in the field
of computer vision. Challenging as it is, a great deal
of ongoing research explores the capability of
computer vision in image processing, helping to
narrow the gap between computers and human beings
in scene understanding. The purpose of this survey is
to analyze various techniques used for image caption
generation based on neural network concepts.
Table 2.1 Comparative analysis of various methods

- Kelvin Xu et al. (2015). Method: hard attention mechanism and soft attention mechanism. Remarks: evaluated on three benchmark datasets (Flickr8k, Flickr30k and MS COCO); obtained much better performance than the other methods.
- Oriol Vinyals et al. (2015). Method: a generative model based on a deep recurrent architecture. Remarks: accurate when verified both qualitatively and quantitatively; yields a score of 59, to be compared to human performance around 69, far better than the previous method's score of 25.
- Jimmy Lei Ba et al. (2015). Method: an attention-based model for recognizing multiple objects in images. Remarks: uses a deep recurrent neural network; more accurate than state-of-the-art convolutional networks while using fewer parameters and less computation.
- Dzmitry Bahdanau et al. (2015). Method: soft-attention-based encoder-decoder architecture. Remarks: qualitatively good performance, but lacking in quantitative analysis.
- Kyunghyun Cho et al. (2014). Method: RNN model. Remarks: qualitatively, the proposed model learns a semantically and syntactically meaningful representation of linguistic phrases; maximizes the conditional probability of a target sequence given a source sequence.
- Jeff Donahue et al. (2014). Method: Long-term Recurrent Convolutional Networks for visual recognition and description. Remarks: evaluated on various datasets such as Flickr30k and COCO 2014; the architecture is not restricted to deep NN inputs but can be cleanly integrated with other fixed- or variable-length inputs from other vision systems.
- Junhua Mao et al. (2014). Method: deep captioning with multimodal recurrent neural networks. Remarks: validated on four benchmark datasets (IAPR TC-12, Flickr8k, Flickr30k and MS COCO); improved performance over previous methods.
- Andrej Karpathy et al. (2014). Method: deep visual-semantic alignments for generating image descriptions. Remarks: experimented on the Flickr8k, Flickr30k and MS COCO datasets; good performance.
- Razvan Pascanu et al. (2014). Method: deep recurrent neural networks. Remarks: evaluated on polyphonic music prediction and language modeling; higher performance than conventional RNNs.
- Bharathi S et al. (2014). Method: BoF framework for remote sensing image classification using RANSAC and SVM. Remarks: the time complexity of the classification is not very high (about 3 minutes for a dataset); one of the best methods for content-based image classification.
- Chih-Fong Tsai et al. (2012). Method: bag-of-words representation in image annotation. Remarks: one of the most widely used feature representation methods; good performance.
- Misha Denil et al. (2011). Method: learning where to attend with deep architectures for image tracking. Remarks: good performance in the presence of partial information.
- Siming Li et al. (2011). Method: composing simple image descriptions using web-scale n-grams. Remarks: viable for generating simple textual descriptions pertinent to the specific content of an image.
- Yezhou Yang et al. (2011). Method: corpus-guided sentence generation of natural images. Remarks: the strategy of combining vision and language produces readable and descriptive sentences compared to naive strategies that use vision alone; the sentences are the closest in agreement with the human-annotated ones; more relevant and readable output.
- Stephen O'Hara et al. (2011). Method: bag-of-features paradigm for image classification and retrieval. Remarks: fewer quantization errors; improved feature detection and faster image retrieval.
- Xiaoli Yuan et al. (2011). Method: a SIFT-LBP image retrieval model based on bag-of-features. Remarks: better image retrieval even with noisy backgrounds and ambiguous objects; average performance is lower than the BoF model.
- Ahmet Aker et al. (2010). Method: generating image descriptions using dependency relational patterns. Remarks: higher scores than former n-gram language models; more readable summaries on output.
- Juan C. Caicedo et al. (2009). Method: histopathology image classification using bag of features and kernel functions. Remarks: tested six different codebook sizes from 50 up to 1000 code blocks; classification performance decreases as the codebook size increases; the performance of the SIFT points decreases faster than the performance of raw blocks; a SIFT-based codebook requires fewer code blocks to express all the different patterns in the image collection, while a block-based codebook requires a larger size because it represents the same visual patterns using different code blocks.
- Eric Nowak et al. (2006). Method: sampling strategies for bag-of-features image classification. Remarks: interest-point-based samplers such as Harris-Laplace and Laplacian of Gaussian each work well in some databases for small numbers of sampled patches.
- Jim Mutch et al. (2006). Method: biologically inspired model of visual object recognition applied to multiclass object categorization. Remarks: utilizes neural network concepts; better performance than any model without NN concepts.
III. METHODOLOGY
In order to decide on the apt feature to be
extracted from the input image, I started off with
various types of image features and experimented
with and analyzed various methods used to obtain
those features. From these experiments, the bounding
box algorithm and the bag-of-features functions were
found to be useful for the purpose of this project.
3.1 BOUNDING BOX METHOD
In an image, an edge is a curve that follows a path of
rapid change in image intensity. Edges are often
associated with the boundaries of objects in a scene.
The edge function looks for places in the image where
the intensity changes rapidly, using one of these two
criteria:
- places where the first derivative of the intensity is larger in magnitude than some threshold;
- places where the second derivative of the intensity has a zero crossing.
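These two criteria can be sketched in plain Python on a 1-D intensity profile. This is an illustrative toy, not the MATLAB edge function; the profile values and threshold are invented for the example.

```python
# Toy illustration of the two edge criteria on a 1-D intensity profile.

def first_derivative(signal):
    # Forward difference approximates the first derivative.
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]

def edges_by_gradient(signal, threshold):
    # Criterion 1: |first derivative| larger in magnitude than a threshold.
    d = first_derivative(signal)
    return [i for i, g in enumerate(d) if abs(g) > threshold]

def edges_by_zero_crossing(signal):
    # Criterion 2: zero crossing of the second derivative.
    d2 = first_derivative(first_derivative(signal))
    return [i for i in range(len(d2) - 1) if d2[i] * d2[i + 1] < 0]

profile = [10, 12, 15, 60, 110, 113, 115]   # a ramp edge around index 3
print(edges_by_gradient(profile, threshold=30))   # -> [2, 3]
print(edges_by_zero_crossing(profile))            # -> [2]
```

On a 2-D image the same tests are applied to directional derivatives; the Canny method described next additionally applies two thresholds with hysteresis.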
Fig 3.1 Flowchart of bounding box algorithm
The most powerful edge detection method that
edge provides is the Canny method. The Canny
method differs from the other edge detection methods
in that it uses two different thresholds (to detect
strong and weak edges), and includes the weak edges
in the output only if they are connected to strong
edges. This method is therefore less likely than the
others to be fooled by noise, and more likely to detect
true weak edges. Dilation, a morphological operation,
adds pixels to the boundaries of objects in an image;
the number of pixels added depends on the size and
shape of the structuring element used to process the
image. The regional properties of the objects in the
binary image are then obtained. The properties
include three fields:
Area: A = Σ(r,c)∈R 1 (1)
Centroid: r̄ = (1/A) Σ(r,c)∈R r (2)
         c̄ = (1/A) Σ(r,c)∈R c (3)
Box dimensions: the smallest rectangle containing the
region, specified by the location of its upper left
corner together with its width and height.
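The area, centroid and bounding-box measurements above can be sketched as follows; this is a plain-Python stand-in for MATLAB's regionprops, applied to a toy binary image invented for the example.

```python
# Sketch of regionprops-style measurements: area (Eq. 1), centroid
# (Eqs. 2-3) and the smallest enclosing rectangle of a binary region,
# where 1 marks an object pixel.

def region_properties(image):
    pixels = [(r, c) for r, row in enumerate(image)
              for c, v in enumerate(row) if v == 1]
    area = len(pixels)                          # Eq. (1): count of pixels in R
    r_bar = sum(r for r, _ in pixels) / area    # Eq. (2): mean row index
    c_bar = sum(c for _, c in pixels) / area    # Eq. (3): mean column index
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    # Smallest rectangle containing the region: upper-left corner, width, height.
    box = (min(rows), min(cols),
           max(cols) - min(cols) + 1, max(rows) - min(rows) + 1)
    return area, (r_bar, c_bar), box

binary = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
print(region_properties(binary))  # -> (4, (1.5, 1.5), (1, 1, 2, 2))
```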
3.2 BAG-OF-FEATURES METHOD
The bag-of-features (BoF) method is largely
inspired by the bag-of-words (BoW) model. In the
BoW model, each word is assumed to be independent.
In the BoF model, each image is described by a set of
orderless local features; recent research has
demonstrated its effectiveness in image processing.
Extracting the BoW feature from images involves the
following steps:
- automatically detect regions/points of interest;
- compute local descriptors over those regions/points;
- quantize the descriptors into words to form the visual vocabulary;
- find the occurrences in the image of each specific word in the vocabulary to construct the BoW feature (a histogram of word frequencies).
Fig 3.2 Flowchart of BOF algorithm
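A minimal sketch of the last two steps, quantizing descriptors against a vocabulary by Euclidean distance and accumulating the word histogram, might look like this in Python. The 2-D descriptors and the three-word vocabulary are toy stand-ins for real local descriptors and a learned codebook.

```python
# Toy sketch of BoW quantization: assign each local descriptor to its
# nearest visual word and count occurrences into a histogram.
import math

def nearest_word(descriptor, vocabulary):
    # Index of the closest vocabulary entry by Euclidean distance.
    return min(range(len(vocabulary)),
               key=lambda w: math.dist(descriptor, vocabulary[w]))

def bow_histogram(descriptors, vocabulary):
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1
    return hist

vocabulary = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]   # 3 visual words
descriptors = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.1), (0.1, 0.9)]
print(bow_histogram(descriptors, vocabulary))  # -> [2, 1, 1]
```

The resulting histogram is the fixed-length image representation handed to a classifier, regardless of how many local features the image produced.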
3.3 Combination of Bounding Box and BoF Method
The bounding box method was used to segment
the objects in an image, and those objects were then
provided to the bag-of-features functions to recognize
each object. Figure 3.3 shows the flowchart for this
combination of the bounding box method and the BoF
method. Using this combined method, I was able to
recognize different objects in the same image. Again,
the degree of correctness of the output depends
entirely on the images provided to the algorithm.
Fig 3.3 Flowchart of the combination of the bounding box method and the BoF method
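The combined pipeline can be sketched as: segment the binary image into connected regions (the role of the bounding box step), then hand each region to a recognizer. The classify function below is a hypothetical stand-in for the BoF recognizer; it labels regions by size purely for illustration.

```python
# Sketch of the combined pipeline: connected-component segmentation
# followed by per-region classification (the classifier is a stub).

def connected_regions(image):
    # Flood-fill labelling with 4-connectivity over a binary grid.
    rows, cols = len(image), len(image[0])
    seen, regions = set(), []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == 1 and (r, c) not in seen:
                stack, region = [(r, c)], []
                seen.add((r, c))
                while stack:
                    y, x = stack.pop()
                    region.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                regions.append(region)
    return regions

def classify(region):
    # Hypothetical recognizer: size-based labels, for illustration only.
    return "large object" if len(region) > 2 else "small object"

binary = [[1, 1, 0, 0],
          [1, 0, 0, 1],
          [0, 0, 0, 1]]
labels = [classify(reg) for reg in connected_regions(binary)]
print(labels)  # -> ['large object', 'small object']
```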
3.4 Scale Invariant Feature Transform (SIFT)
The scale-invariant feature transform (SIFT) is an
algorithm in computer vision to detect and describe
local features in images, published by David Lowe.
SIFT key points of objects are first extracted from a
set of reference images and stored in a database. An
object is recognized in a new image by individually
comparing each feature from the new image to this
database and finding candidate matching features
based on the Euclidean distance of their feature
vectors. From the full set of matches, subsets of key
points that agree on the object and its location, scale,
and orientation in the new image are identified to
filter out good matches. Once the features are
obtained, they are provided to the Neural Network
Toolbox, which utilizes a gradient descent training
network with momentum and an adaptive learning
rate.
Fig 3.4 Flowchart of SIFT algorithm
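The matching step can be sketched as a nearest-neighbour search over descriptor vectors. The distance-ratio check (nearest versus second-nearest neighbour) is Lowe's usual criterion for discarding ambiguous matches; the 3-D vectors below are toy stand-ins for 128-D SIFT descriptors.

```python
# Sketch of descriptor matching by Euclidean distance with a
# distance-ratio check to keep only distinctive matches.
import math

def match_descriptors(query, database, ratio=0.8):
    matches = []
    for qi, q in enumerate(query):
        # Sort database descriptors by distance to the query descriptor.
        dists = sorted((math.dist(q, d), di) for di, d in enumerate(database))
        best, second = dists[0], dists[1]
        if best[0] < ratio * second[0]:      # distinctive enough to keep
            matches.append((qi, best[1]))
    return matches

database = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
query = [(0.1, 0.0, 0.9),    # close to database[0]
         (0.5, 0.5, 0.0)]    # ambiguous between database[1] and database[2]
print(match_descriptors(query, database))  # -> [(0, 0)]
```

The ambiguous query descriptor is dropped because its two nearest database entries are almost equally far away, which is exactly the behaviour the ratio check is meant to produce.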
IV. EXPERIMENT ANALYSIS
The various methods tried are summarized in
Table 4.1. Among these methods, the combination of
the bounding box and BoF was better, but the feature
vector it produces is not constant: it changes on each
run. Hence it was decided to use another method,
SIFT, which extracts the local features of the image;
this feature vector was then provided to the NN
toolbox for recognition of new objects.
Table 4.1 Various methods to analyze feature extraction
V. SIMULATION
In order to analyze the performance of the various
methods experimented with, a specific set of datasets
was utilized. One dataset includes different varieties
of fruits. First of all, an individual object, a single
fruit such as an apple, was provided as input. The
neural network recognized the given image of the
fruit correctly. Then different categories of objects
were utilized, such as chairs, cars, flowers, books,
etc., and each of them was recognized correctly. The
model even correctly identifies objects that are not
present within the dataset. Afterwards, my aim was to
recognize objects of different categories present in a
single image. Thus an image with different kinds of
fruits, flowers, etc. was given as input, and each of
them was recognized correctly. Some simulation
results and plots are included below.
Fig 5.1 Boxed input image
[Fig 3.4 steps, recovered from the flowchart: create scale-space and DoG; extrema detection; noise elimination; orientation assignment; descriptor computation; keypoint matching]
Table 4.1 (various methods to analyze feature extraction):
- Analysis of different features: color, texture, edge, corner and shape analysis performed on an image set.
- Bounding box: to separate objects in an input image.
- Harris method: corner detection based on intensity variations.
- SURF method: a comparison method to detect an intended portion of a given image.
- Bag-of-features method: a technique adapted to computer vision from the world of natural language processing.
- Combination of bounding box and bag of features: recognizes different categories of objects in the same image.
- SIFT method: upon training, the NN was given the extracted SIFT feature vectors and recognized objects on a blank background correctly, generating correct output.
Fig 5.2 The network used
Fig 5.3 Output obtained
Fig 5.4 Performance Curve
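The training scheme named in Section 3.4, gradient descent with momentum and an adaptive learning rate (the rule behind MATLAB's traingdx), can be sketched on a 1-D quadratic loss. The growth/shrink factors below are illustrative constants, not the toolbox defaults.

```python
# Sketch of gradient descent with momentum and an adaptive learning rate,
# demonstrated on the 1-D quadratic loss 0.5 * w^2 (minimum at w = 0).

def train(grad, w=5.0, lr=0.1, momentum=0.9, steps=200):
    velocity, prev_loss = 0.0, float("inf")
    for _ in range(steps):
        loss = 0.5 * w * w
        # Adaptive LR: grow the step while the loss keeps falling,
        # shrink it when the loss goes up.
        lr = lr * 1.05 if loss < prev_loss else lr * 0.7
        prev_loss = loss
        velocity = momentum * velocity - lr * grad(w)   # momentum term
        w += velocity
    return w

final_w = train(grad=lambda w: w)   # d/dw of 0.5*w^2 is w
print(round(final_w, 4))            # w has moved from 5.0 toward the minimum
```

The momentum term carries the update through shallow gradients, while the learning-rate adaptation damps the oscillation once the weight overshoots the minimum.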
VI. CONCLUSION
The major part of the work lies in the extraction
of the correct features from the given input image.
Various methods of feature extraction are available.
The survey found that most of the previous methods
concentrate on a single feature alone, which would
not serve my purpose. After working with the various
available methods, the SURF features were found to
be better, as they are independent of the scale and
orientation of an image; but they still did not serve
my purpose. I therefore chose another feature
extraction process, bag-of-visual-words, which
performed better. Finally, the bounding box method
was used to identify the objects in a single image, and
the BoF method was applied to recognize each of
them. However, neural networks could not yet be
brought in at this stage, because the feature matrix
obtained from BoF is not stable. Hence the SIFT
method was utilized, which, as the name indicates, is
independent of scale and rotation changes.
REFERENCES
[1] Jim Mutch, David G. Lowe, "Multiclass Object Recognition with Sparse, Localized Features", in Proc. of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), 2006.
[2] Eric Nowak, Frederic Jurie, Bill Triggs, "Sampling Strategies for Bag-of-Features Image Classification", Springer-Verlag Berlin Heidelberg, ECCV Part IV, LNCS 3954, pp. 490-503, 2006.
[3] Juan C. Caicedo, Angel Cruz, Fabio A. Gonzalez, "Histopathology Image Classification using Bag of Features and Kernel Functions", Bioingenium Research Group, National University of Colombia, 2009.
[4] Ahmet Aker, Robert Gaizauskas, "Generating image descriptions using dependency relational patterns", in Proc. of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 1250-1258, Uppsala, Sweden, July 2010.
[5] Xiaoli Yuan, Jing Yu, Zengchang Qin, "A SIFT-LBP image retrieval model based on Bag-of-Features", 18th IEEE International Conference on Image Processing, 2011.
[6] Stephen O'Hara, Bruce A. Draper, "Introduction to the bag of features paradigm for image classification and retrieval", arXiv:1101.3354v1 [cs.CV], January 2011.
[7] Yezhou Yang, Ching Lik Teo, Hal Daume, Yiannis Aloimonos, "Corpus-Guided Sentence Generation of Natural Images", in Proc. of the 2011 Conference on Empirical Methods in Natural Language Processing, pp. 444-454, Scotland, UK, July 2011.
[8] Siming Li, Girish Kulkarni, Tamara L. Berg, Alexander C. Berg, Yejin Choi, "Composing Simple Image Descriptions using Web-scale N-grams", in Proc. CoNLL, 2011.
[9] Misha Denil, Loris Bazzani, Hugo Larochelle, Nando de Freitas, "Learning where to Attend with Deep Architectures for Image Tracking", arXiv:1109.3737, September 2011.
[10] Chih-Fong Tsai, F. Camastra, "Bag-of-Words Representation in Image Annotation: A Review", International Scholarly Research Network (ISRN) Artificial Intelligence, 2012.
[11] Bharathi S, Karthik Kumar S, P Deepa Shenoy, Venugopal K R, L M Patnaik, "Bag of Features Based Remote Sensing Image Classification Using RANSAC And SVM", in Proc. of the International MultiConference of Engineers and Computer Scientists, Vol. I, IMECS, March 2014.
[12] Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, Yoshua Bengio, "How to Construct Deep Recurrent Neural Networks", arXiv:1312.6026v5 [cs.NE], April 2014.
[13] Andrej Karpathy, Li Fei-Fei, "Deep Visual-Semantic Alignments for Generating Image Descriptions", cs.stanford.edu/people/karpathy/deepimagesent/, Stanford University, 2014.
[14] Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, Alan L. Yuille, "Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)", published as a conference paper at ICLR 2015, July 2014.
[15] Jeffrey Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, Trevor Darrell, "Long-term Recurrent Convolutional Networks for Visual Recognition and Description", University of California at Berkeley, Technical Report No. UCB/EECS-2014-180, November 2014.
[16] Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio, "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation", arXiv:1406.1078v3 [cs.CL], September 2014.
[17] Dzmitry Bahdanau, Kyung Hyun Cho, Yoshua Bengio, "Neural machine translation by jointly learning to align and translate", arXiv:1409.0473v6 [cs.CL], published at ICLR, April 2015.
[18] Jimmy Lei Ba, Volodymyr Mnih, Koray Kavukcuoglu, "Multiple object recognition with visual attention", arXiv:1412.7755v2 [cs.LG], Google, April 2015.
[19] Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan, "Show and Tell: A Neural Image Caption Generator", arXiv:1411.4555v2 [cs.CV], Google, 2015.
[20] Kelvin Xu, Jimmy Lei Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard S. Zemel, Yoshua Bengio, "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention", in Proc. of the 32nd International Conference on Machine Learning, France, JMLR: W&CP volume 37, February 2015.