A Framework for Human Action Detection via Extraction of Multimodal Features (CSCJournals)
This work discusses the application of an Artificial Intelligence technique called data extraction, together with a process-based ontology, to constructing experimental qualitative models for video retrieval and detection. We present a framework architecture that uses multimodal features as the knowledge representation scheme to model the behaviors of a number of human actions in video scenes. The main focus of this paper is the design of two main components (model classifier and inference engine) for a tool abbreviated VASD (Video Action Scene Detector) for retrieving and detecting human actions from video scenes. The discussion starts by presenting the workflow of the retrieval and detection process and the automated model classifier construction logic. We then demonstrate how the constructed classifiers can be used with multimodal features to detect human actions. Finally, behavioral explanation manifestation is discussed. The simulator is implemented bilingually: MATLAB and C++ form the backend, supplying data and theories, while Java handles the front-end GUI and action pattern updating. To assess the usefulness of the proposed framework, several experiments were conducted; results were obtained using visual features only (77.89% precision; 72.10% recall), audio features only (62.52% precision; 48.93% recall), and combined audiovisual features (90.35% precision; 90.65% recall).
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B... (csandit)
Motion detection and object segmentation are important research areas in image and video processing and computer vision. The techniques and mathematical models used to detect and segment region-of-interest (ROI) objects form the algorithmic building blocks of various high-level methods in video analysis, object extraction, classification, and recognition. Detecting moving objects is significant in many tasks, such as video surveillance and moving object tracking. The design of a video surveillance system is directed at automatic identification of events of interest, especially the tracking and classification of moving objects. An entropy-based, real-time, adaptive, non-parametric window thresholding algorithm for change detection is proposed in this research. Based on an estimate of the scatter of change regions in a difference image, a threshold for every image block is calculated discriminatively using an entropy measure, and the global threshold is then obtained by averaging the thresholds of all image blocks in the frame. The block threshold is calculated differently for regions of change and for background. Experimental results show the proposed thresholding algorithm performs well for change detection with high efficiency.
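The block-then-average scheme above can be sketched in a few lines. The per-block rule used here (block mean scaled by normalized entropy) is an illustrative stand-in, not the paper's exact formulation:

```python
import math

def block_entropy(block):
    """Shannon entropy of an intensity block (values 0..255)."""
    hist = {}
    for v in block:
        hist[v] = hist.get(v, 0) + 1
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in hist.values())

def global_threshold(diff_image, rows, cols, block_size):
    """Per-block entropy-scaled thresholds, averaged into a global threshold.

    diff_image is a flat list of absolute frame-difference values (0..255).
    """
    thresholds = []
    for r0 in range(0, rows, block_size):
        for c0 in range(0, cols, block_size):
            block = [diff_image[r * cols + c]
                     for r in range(r0, min(r0 + block_size, rows))
                     for c in range(c0, min(c0 + block_size, cols))]
            h = block_entropy(block)
            h_max = math.log2(len(block)) or 1.0
            mean = sum(block) / len(block)
            # High-entropy (busy) blocks keep more of their mean difference
            thresholds.append(mean * (h / h_max))
    return sum(thresholds) / len(thresholds)
```

A uniform difference image (no change anywhere) yields a zero global threshold, while blocks of mixed values contribute proportionally to their entropy.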
Human motion is fundamental to understanding behaviour. In spite of advances in 3D pose and shape estimation from single images, current video-based state-of-the-art methods fail to produce accurate, natural motion sequences due to the scarcity of ground-truth 3D motion data for training. Recognizing human actions for automated video surveillance applications is an interesting but forbidding task, especially when the videos are captured in poor lighting conditions. We describe a spatio-temporal feature-based correlation filter for simultaneous detection and recognition of multiple human actions in low-light environments. The performance of the proposed filter was evaluated through extensive experiments on night-time action datasets. Experimental results demonstrate the effectiveness of the fusion schemes for robust action recognition in significantly low-light environments.
Discovering Anomalies Based on Saliency Detection and Segmentation in Surveil... (ijtsrd)
This paper proposes extracting salient objects from motion fields. Salient object detection is an important technique for many content-based applications, but it becomes a challenging task when handling cluttered saliency maps, which cannot completely highlight salient object regions or suppress background regions. We present algorithms for recognizing activity in monocular video sequences based on a discriminative gradient random field. Surveillance videos capture the behavioral activities of the objects accessing the monitored area. Some behaviors are frequent sequences of events, while others deviate from the known frequent sequences; such deviating events are termed anomalies and may indicate criminal activity. Past work focused on discovering known abnormal events; here, unknown abnormal activities are to be detected and alerted so that early action can be taken. K. Shankar, Dr. S. Srinivasan, Dr. T. S. Sivakumaran, and K. Madhavi Priya, "Discovering Anomalies Based on Saliency Detection and Segmentation in Surveillance System", published in the International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN 2456-6470, Volume 2, Issue 1, December 2017. URL: http://www.ijtsrd.com/papers/ijtsrd5871.pdf http://www.ijtsrd.com/engineering/computer-engineering/5871/discovering-anomalies-based-on-saliency-detection-and-segmentation-in-surveillance-system/k-shankar
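The notion of "deviation from known frequent sequences" can be made concrete with a tiny n-gram model over an event stream: learn which short event sequences are frequent in normal footage, then flag windows whose sequence was never (or rarely) seen. This is a minimal sketch of the idea, not the paper's saliency-based method:

```python
from collections import Counter

def learn_frequent(events, n=2, min_count=2):
    """Count sliding n-grams of events; keep those seen at least min_count times."""
    grams = Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))
    return {g for g, c in grams.items() if c >= min_count}

def find_anomalies(events, frequent, n=2):
    """Return start indices of n-grams absent from the learned frequent set."""
    return [i for i in range(len(events) - n + 1)
            if tuple(events[i:i + n]) not in frequent]

# Hypothetical normal footage: people enter, walk, and exit
normal = ["enter", "walk", "exit", "enter", "walk", "exit", "enter", "walk", "exit"]
freq = learn_frequent(normal)
# A stream containing an unusual transition ("walk" -> "run") raises alerts
alerts = find_anomalies(["enter", "walk", "run", "exit"], freq)
```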
Faro: Visual Attention For Implicit Relevance Feedback In A Content Based Imag... (Kalle)
In this paper we propose an implicit relevance feedback method that aims to improve the performance of existing Content Based Image Retrieval (CBIR) systems by re-ranking the retrieved images according to users' eye gaze data. This is a new mechanism for implicit relevance feedback: the sources usually taken into account for image retrieval are based on the user's natural behavior in his/her environment, estimated by analyzing mouse and keyboard interactions. In detail, after images are retrieved by querying a CBIR with a keyword, our system computes the most salient regions of the retrieved images (where users look with greater interest) by gathering data from an unobtrusive eye tracker, such as the Tobii T60. According to the color and texture features of these relevant regions, our system re-ranks the images initially retrieved by the CBIR. A performance evaluation, carried out on a set of 30 users with Google Images and the keyword "pyramid", shows that about 87% of the users are more satisfied with the output images when the re-ranking is applied.
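The re-ranking step can be illustrated with a simplified stand-in for the paper's salient-region analysis: build a dwell-time-weighted prototype feature from what the user looked at, then sort all retrieved images by distance to it. Feature vectors and dwell times below are hypothetical:

```python
def reorder_by_gaze(features, fixations):
    """Re-rank retrieved images using gaze dwell time as implicit feedback.

    features:  {image_id: feature vector (e.g. color/texture descriptor)}
    fixations: {image_id: total gaze dwell time in ms}
    """
    total = sum(fixations.values())
    dim = len(next(iter(features.values())))
    # Dwell-time-weighted mean feature of the images the user attended to
    proto = [sum(fixations.get(i, 0) * f[d] for i, f in features.items()) / total
             for d in range(dim)]
    dist = lambda f: sum((a - b) ** 2 for a, b in zip(f, proto)) ** 0.5
    # Most prototype-like (most gaze-relevant) images come first
    return sorted(features, key=lambda i: dist(features[i]))

feats = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
ranked = reorder_by_gaze(feats, {"a": 900, "c": 100})
```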
META-HEURISTICS BASED ARF OPTIMIZATION FOR IMAGE RETRIEVAL (IJCSEIT Journal)
The proposed approach addresses the semantic gap in image retrieval by combining automatic relevance feedback and a modified stochastic algorithm. A visual feature database is constructed from the image database using a combined feature vector; only a few fast-computable features are included in this step. The user selects the query image, and based on it the system ranks the whole dataset. The nearest images are retrieved and the first automatic relevance feedback is generated. The combined similarity of the textual and visual feature spaces is evaluated using Latent Semantic Indexing, and the images are labelled as relevant or irrelevant. The feedback drives a feature re-weighting process and is routed to a particle swarm optimizer. Instead of the classical swarm update approach, the swarm is split so that each sub-swarm performs the search in parallel, thereby increasing the performance of the system. This provides a powerful optimization tool and an effective space exploration mechanism. The approach aims to achieve the following goals without any human interaction: to cluster relevant images using meta-heuristics, and to dynamically modify the feature space by feeding back automatic relevance feedback.
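The split-swarm idea can be sketched with a minimal particle swarm optimizer in which independent sub-swarms each keep their own best (a sequential stand-in for the parallel search described above). The objective, bounds, and coefficients below are illustrative, not the paper's configuration:

```python
import random

def split_swarm_pso(f, dim, n_particles=20, n_swarms=2, iters=60, seed=0):
    """Minimal PSO with the swarm split into independent sub-swarms.

    f is the objective to minimize over [-5, 5]^dim; the best solution
    found by any sub-swarm is returned.
    """
    rng = random.Random(seed)
    best_x, best_val = None, float("inf")
    per = n_particles // n_swarms
    for _ in range(n_swarms):
        xs = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(per)]
        vs = [[0.0] * dim for _ in range(per)]
        pb = [x[:] for x in xs]                      # personal bests
        pb_val = [f(x) for x in xs]
        g = min(range(per), key=lambda i: pb_val[i])
        gb, gb_val = pb[g][:], pb_val[g]             # sub-swarm best
        for _ in range(iters):
            for i in range(per):
                for d in range(dim):
                    vs[i][d] = (0.7 * vs[i][d]
                                + 1.5 * rng.random() * (pb[i][d] - xs[i][d])
                                + 1.5 * rng.random() * (gb[d] - xs[i][d]))
                    xs[i][d] += vs[i][d]
                val = f(xs[i])
                if val < pb_val[i]:
                    pb[i], pb_val[i] = xs[i][:], val
                    if val < gb_val:
                        gb, gb_val = xs[i][:], val
        if gb_val < best_val:                        # merge sub-swarm results
            best_x, best_val = gb, gb_val
    return best_x, best_val

x, v = split_swarm_pso(lambda x: sum(c * c for c in x), dim=3)
```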
In today's world, the quest to get rich at all costs without working for one's money has led some youths into crimes such as robbery and kidnapping. Because of this, and because vehicles are now very expensive to buy, people need to safeguard their vehicles against criminals to avoid the loss of their precious assets. Tracking is a technology used by many companies and individuals to track a vehicle, an individual, or an asset, for example via GPS, which operates using satellites and ground-based stations, or via our approach, which depends on cellular mobile towers. A vehicle tracking system can be used to monitor and locate a vehicle, prevent theft or recover a stolen vehicle, monitor vehicle routes to ensure strict compliance with predefined routes, monitor driver behavior, predict bus arrival, and manage fleets. The Internet of Things has made it possible for devices to communicate among themselves and exchange information, helping to acquire and analyze information faster than in the past; this has helped especially in vehicle monitoring, letting vehicle owners feel safe about their investments. In this paper, we propose a vehicle monitoring system based on IoT technology that uses 4G/LTE to obtain the coordinates, speed, and overall condition of the vehicle, process them, and send them to a remote server to be analyzed and used to locate the vehicle and monitor its other configured parameters. This is realized using a Raspberry Pi, a 4G/LTE module, GPS, an accelerometer, and other sensors that communicate among themselves to gather environmental parameters, which are processed and sent to a remote server where they are analyzed and represented on a map to locate the vehicle and monitor the other configured parameters.
4G/LTE provides fast internet connectivity, which overcomes the delay usually experienced in sending the acquired signals to be processed. The vehicle's true position is represented using the Google geolocation service, and the actual position is triangulated in real time.
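The on-board unit's report and the server-side distance math can be sketched as follows. The JSON field names are illustrative, not a fixed protocol, and the haversine formula is the standard great-circle distance used for comparing GPS fixes:

```python
import json
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS fixes, in kilometres."""
    r = 6371.0                                   # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def make_report(vehicle_id, lat, lon, speed_kmh):
    """JSON payload the on-board unit would POST to the remote server."""
    return json.dumps({"id": vehicle_id, "lat": lat, "lon": lon,
                       "speed_kmh": speed_kmh})

report = make_report("bus-1", 6.5244, 3.3792, 42.0)
```

On the server, successive fixes can be fed to `haversine_km` to check route compliance or estimate arrival times.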
Yoga has seen a marked rise in popularity over the past few years. Much of the literature claims yoga is beneficial for improving overall lifestyle and health, especially in rehabilitation, mental health, and more. Given the fast-paced lives individuals lead, people usually prefer to exercise or work out from the comfort of their homes, and with that a need for an instructor arises. Hence, we have developed a self-assisted system that can detect and classify yoga asanas, which is discussed in depth in this paper. Especially now, when the pandemic has taken over the world, it is not feasible to attend physical classes or have an instructor over. Using computer vision, a computer-assisted system such as the one discussed comes in very handy. Technologies such as ml5.js, PoseNet, and neural networks are used for human pose estimation and classification. The proposed system uses these technologies to take in a real-time video input, analyze the pose of an individual, and classify the pose into a yoga asana. It also displays the name of the detected yoga asana along with a confidence score.
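The classification step after pose estimation can be illustrated with a nearest-template classifier over normalized keypoints. This is a deliberately simple stand-in for the neural classifier trained on PoseNet output, and the template poses are made up:

```python
def normalize(pose):
    """Center a pose (list of (x, y) keypoints) and scale it to unit size,
    so classification ignores where and how large the person is in frame."""
    cx = sum(x for x, _ in pose) / len(pose)
    cy = sum(y for _, y in pose) / len(pose)
    pts = [(x - cx, y - cy) for x, y in pose]
    scale = max(max(abs(x), abs(y)) for x, y in pts) or 1.0
    return [(x / scale, y / scale) for x, y in pts]

def classify(pose, templates):
    """Nearest-template classifier; returns the asana name and a crude
    confidence that decays with distance to the best-matching template."""
    p = normalize(pose)
    def dist(t):
        q = normalize(t)
        return sum((a - c) ** 2 + (b - d) ** 2
                   for (a, b), (c, d) in zip(p, q)) ** 0.5
    name = min(templates, key=lambda k: dist(templates[k]))
    return name, 1.0 / (1.0 + dist(templates[name]))

# Hypothetical 3-keypoint templates: an upright pose vs. a horizontal one
templates = {"mountain": [(0, 0), (0, 1), (0, 2)],
             "tee": [(0, 0), (1, 0), (2, 0)]}
name, conf = classify([(5, 5), (5, 7), (5, 9)], templates)
```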
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
A multi-task learning based hybrid prediction algorithm for privacy preservin... (journalBEEI)
There is an ever-increasing need to use computer vision devices to capture video as part of many real-world applications. However, invading people's privacy is a cause for concern, so there is a need to protect privacy while videos are used purposefully, driven by objective functions. One such use case is human activity recognition without disclosing human identity. In this paper, we propose a multi-task learning based hybrid prediction algorithm (MTL-HPA) to realise a privacy-preserving human activity recognition framework (PPHARF). It recognizes human activities from videos while preserving the identity of the humans present in the multimedia object: the face of any person in the video is anonymized to preserve privacy, while the actions of the person remain exposed so they can be extracted. Anonymization is achieved without losing the utility of human activity recognition, and human and face detection methods fail to reveal the identity of the persons in the video. We experimentally confirm on the joint-annotated human motion database (JHMDB) and the Daily Action Localization in YouTube (DALY) dataset that the framework recognises human activities while ensuring non-disclosure of private information. Our approach outperforms many traditional anonymization techniques such as noise addition, blurring, and masking.
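The blur-only-the-face idea can be sketched on a grayscale frame represented as a list of lists: box-blur the detected face region and leave the rest of the frame untouched so body motion stays recognizable. The face box here is given directly; in the framework it would come from a face detector:

```python
def anonymize_region(img, top, left, h, w, k=1):
    """Box-blur a rectangular face region of a grayscale image (list of
    lists of 0..255 values), leaving all other pixels unchanged.
    k is the blur radius."""
    out = [row[:] for row in img]
    rows, cols = len(img), len(img[0])
    for r in range(top, min(top + h, rows)):
        for c in range(left, min(left + w, cols)):
            # Average the (2k+1)x(2k+1) neighborhood, clipped at borders
            vals = [img[rr][cc]
                    for rr in range(max(0, r - k), min(rows, r + k + 1))
                    for cc in range(max(0, c - k), min(cols, c + k + 1))]
            out[r][c] = sum(vals) // len(vals)
    return out

frame = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]        # one bright "face" pixel
blurred = anonymize_region(frame, 1, 1, 1, 1)     # blur only the center
```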
Human activity recognition is an important task in computer vision because it has many application areas, such as healthcare, security, entertainment, and tactical scenarios. This paper presents a methodology to automatically recognize human activity from an input video stream using Histogram of Oriented Gradient Pattern History (HOGPH) features and an SVM classifier. For this purpose, the proposed system extracts HOG features from a sequence of consecutive video frames and analyzes them to construct a HOGPH feature vector. The HOGPH feature vectors are used to train a multi-class SVM classifier for different human activities. In test mode, we use the classifier with the HOGPH feature vector to recognize human activity. We have experimented with video data of human activity in real environments for three different tasks (browsing, reading, and writing). The experimental results and their accuracy show that the proposed system is applicable to recognizing human activity in real-life settings.
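The HOG building block can be sketched for a single cell: gradients via central differences, with each interior pixel voting for an orientation bin weighted by gradient magnitude. The temporal "pattern history" accumulation of HOGPH is omitted here:

```python
import math

def hog_histogram(img, bins=8):
    """Histogram of oriented gradients for one cell of a grayscale image
    (list of lists). Returns an L1-normalized histogram over unsigned
    orientations in [0, pi)."""
    hist = [0.0] * bins
    rows, cols = len(img), len(img[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = img[r][c + 1] - img[r][c - 1]     # horizontal gradient
            gy = img[r + 1][c] - img[r - 1][c]     # vertical gradient
            mag = math.hypot(gx, gy)
            ang = math.atan2(gy, gx) % math.pi     # fold to unsigned
            hist[min(int(ang / math.pi * bins), bins - 1)] += mag
    total = sum(hist) or 1.0
    return [v / total for v in hist]
```

A vertical edge produces purely horizontal gradients, so all the histogram mass lands in the 0-radian bin; stacking such histograms over frames and feeding them to a multi-class SVM mirrors the pipeline above.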
Hardoon: Image Ranking With Implicit Feedback From Eye Movements (Kalle)
To help users navigate an image search system, one could provide explicit information on a small set of images as to which of them are relevant to the task. These rankings are learned in order to present the user with a new set of images that are relevant to their task. Requiring such explicit information may not be feasible in a number of cases, so we consider the setting where the user provides implicit feedback, namely eye movements, to assist in performing such a task. This paper explores the idea of implicitly incorporating eye movement features in an image ranking task where only images are available during testing. Previous work demonstrated that combining eye movement and image features improved retrieval accuracy compared with using either source independently. Despite these encouraging results, that approach is unrealistic, as no eye movements are available a priori for new images (only after the ranked images are presented could one measure a user's eye movements on them). We propose a novel search methodology which combines image features with implicit feedback from users' eye movements in a tensor ranking Support Vector Machine and show that it is possible to extract the individual source-specific weight vectors. Furthermore, we demonstrate that the decomposed image weight vector can construct a new image-based semantic space that achieves better retrieval accuracy than using the image features alone.
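The pairwise-ranking idea underlying this can be illustrated with a linear ranking perceptron, a much-simplified stand-in for the tensor ranking SVM: for each (better, worse) pair of feature vectors, nudge the weights so the better item scores higher. With image and gaze features concatenated, the learned weight vector splits back into per-source slices, mirroring the decomposition described above:

```python
def train_ranker(pairs, dim, epochs=20, lr=0.1):
    """Pairwise ranking perceptron: for each (better, worse) feature pair,
    adjust w so that w . better > w . worse."""
    w = [0.0] * dim
    for _ in range(epochs):
        for better, worse in pairs:
            # Update only when the pair is mis-ranked (margin <= 0)
            if sum(wi * (b - c) for wi, b, c in zip(w, better, worse)) <= 0:
                w = [wi + lr * (b - c) for wi, b, c in zip(w, better, worse)]
    return w

def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

# Hypothetical 2-dim features where the first dimension signals relevance
pairs = [([1.0, 0.0], [0.0, 1.0]), ([0.9, 0.1], [0.2, 0.8])]
w = train_ranker(pairs, dim=2)
```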
Background Subtraction Algorithm Based Human Behavior Detection (IJERA Editor)
Considering all the features of subset information in video streams, real-time applications involve tremendous processing. In this paper we introduce and develop a new video surveillance system. Using this technique we detect normal and abnormal human behaviors in realistic settings, and we also categorize the event data generated by human tracking in real-time applications. The technique applies frame differencing, threshold segmentation, morphological operations, and object tracking. The experimental results show efficient human tracking in streaming video.
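The differencing / thresholding / morphology pipeline can be sketched on frames represented as lists of lists: threshold the absolute frame difference, then erode isolated pixels (a crude morphological opening) so only coherent moving regions survive. Frame values below are hypothetical:

```python
def detect_motion(prev, curr, thresh=20):
    """Frame differencing + threshold segmentation + isolated-pixel erosion.

    Returns a binary mask marking coherent regions of motion between two
    grayscale frames (lists of lists of 0..255 values).
    """
    rows, cols = len(curr), len(curr[0])
    mask = [[1 if abs(curr[r][c] - prev[r][c]) > thresh else 0
             for c in range(cols)] for r in range(rows)]
    def neighbors(r, c):
        # Count set pixels in the 8-neighborhood
        return sum(mask[rr][cc]
                   for rr in range(max(0, r - 1), min(rows, r + 2))
                   for cc in range(max(0, c - 1), min(cols, c + 2))) - mask[r][c]
    # Keep a pixel only if at least two neighbors also changed
    return [[1 if mask[r][c] and neighbors(r, c) >= 2 else 0
             for c in range(cols)] for r in range(rows)]

prev = [[0] * 4 for _ in range(4)]
curr = [row[:] for row in prev]
curr[0][0] = 100                      # isolated noise pixel
for r in (1, 2):
    for c in (1, 2):
        curr[r][c] = 100              # a coherent 2x2 moving region
motion = detect_motion(prev, curr)
```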
System analysis and design for multimedia retrieval systemsijma
Due to the extensive use of information technology and the recent developments in multimedia systems, the
amount of multimedia data available to users has increased exponentially. Video is an example of
multimedia data as it contains several kinds of data such as text, image, meta-data, visual and audio.
Content-based video retrieval is an approach for facilitating the searching and browsing of large multimedia collections over the WWW. In order to create an effective video retrieval system, visual perception must be taken into account. We conjectured that a technique which employs multiple features for indexing and retrieval would be more effective in the discrimination and search tasks of videos. To validate this, content-based indexing and retrieval systems were implemented using a color histogram, a texture feature (GLCM), edge density and motion.
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B...cscpconf
Motion detection and object segmentation are important research areas of image-video processing and computer vision. The techniques and mathematical modeling used to detect and segment region-of-interest (ROI) objects comprise the algorithmic modules of various high-level techniques in video analysis, object extraction, classification, and recognition. The detection of moving objects is significant in many tasks, such as video surveillance and moving-object tracking. The design of a video surveillance system is directed at automatic identification of events of interest, especially at tracking and classification of moving objects. An entropy-based real-time adaptive non-parametric window thresholding algorithm for change detection is proposed in this research. Based on an approximation of the scatter of regions of change in a difference image, a threshold for every image block is calculated discriminatively using an entropy measure, and the global threshold is then obtained by averaging the thresholds of all image blocks of the frame. The block threshold is calculated differently for regions of change and for background. Experimental results show the proposed thresholding algorithm performs well for change detection with high efficiency.
Automatic video censoring system using deep learningIJECEIAES
Due to the extensive use of video-sharing platforms and services, the amount of all kinds of such content on the web has become massive. This abundance makes it a problem to control the kind of content that may be present in a video. Beyond telling whether the content is suitable for children and sensitive viewers, it is also important to figure out which parts of it contain such content, in order to preserve parts that would be discarded in a simple broad analysis. To tackle this problem, popular image deep learning models were compared: MobileNetV2, Xception, InceptionV3, VGG16, VGG19, ResNet101 and ResNet50, to find the one most suitable for the required application. A system was also developed that automatically censors inappropriate content, such as violent scenes, with the help of deep learning. The system uses a transfer learning mechanism based on the VGG16 model. The experiments suggested that the model showed excellent performance for the automatic censoring application and could also be used in other similar applications.
Secure IoT Systems Monitor Framework using Probabilistic Image EncryptionIJAEMSJORNAL
In recent years, the modeling of human behaviors and patterns of activity for the recognition or detection of special events has attracted considerable research interest. Various methods abound for building intelligent vision systems aimed at understanding the scene and making correct semantic inferences from the observed dynamics of moving targets. Many systems include detection, storage of video information, and human-computer interfaces. Here we present not only an update that expands previous similar surveys but also an emphasis on contextual detection of abnormal human activity, especially in video surveillance applications. The main purpose of this survey is to identify existing methods extensively and to characterize the literature in a manner that brings key challenges to attention.
Tag based image retrieval (tbir) using automatic image annotationeSAT Publishing House
Tag based image retrieval (tbir) using automatic image annotationeSAT Journals
Abstract In recent years, several social networking sites have become popular for digitized images, which comprise the major portion of their databases and make searching difficult for search engines. We present a proficient image retrieval technique which achieves eminent retrieval efficiency. Most images are annotated manually, so the visual content and tags may be mismatched, which leads to poor performance in Tag Based Image Retrieval (TBIR). Automatic Image Annotation (AIA) analyzes the missing and noisy tags and refines them to increase the performance of TBIR; AIA can be achieved using the Tag Completion algorithm. The images retrieved by TBIR are ranked based on the relevancy of the tags and the visual content of the images; the relevancy can be evaluated using a Content Based Image Retrieval (CBIR) technique. Based on the ranks, the images are indexed in the tag matrix, so the images that match the search query can be retrieved in an optimal way. Keywords: Image Retrieval, Automatic Image Annotation, Tag Based Image Retrieval (TBIR), Tag Completion Algorithm, Content Based Image Retrieval (CBIR), Tag Matrix
Electrically small antennas: The art of miniaturizationEditor IJARCET
We are living in the technological era, where we prefer portable devices over immovable ones. We are isolating ourselves from wires and becoming habituated to a wireless world. What makes a device portable? Primarily its physical (mechanical) dimensions, but along with this the electrical dimension of the device is also of great importance. Reducing the physical dimension of an antenna results in a small antenna, but not necessarily an electrically small antenna. There are different definitions of the electrically small antenna; the most appropriate involves the product ka, where k is the wave number, equal to 2π/λ, and a is the radius of the imaginary sphere circumscribing the maximum dimension of the antenna. As present-day electronic devices continue to diminish in size, technocrats have become increasingly focused on electrically small antenna (ESA) designs to reduce the size of the antenna in the overall electronic system. Researchers in many fields, including RF and microwave, biomedical technology and national intelligence, can benefit from electrically small antennas as long as the performance of the designed ESA meets the system requirements.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices want to take full advantage of the features available on those devices, but many features provide convenience and capability at the expense of security. This best practices guide outlines steps users can take to better protect personal devices and information.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work and a knack for helping others understand how things work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don't know what they don't know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Communications Mining Series - Zero to Hero - Session 1DianaGray10
This session provides an introduction to UiPath Communication Mining: its importance and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
DevOps and Testing slides at DASA ConnectKari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference, 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We finished with a lovely workshop in which the participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on.
GridMate - End to end testing is a critical piece to ensure quality and avoid...ThomasParaiso2
End to end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
Discovering the Most Influential Human Action Using Web Based Classifier
Soumya R, R.Gnanakumari
ISSN: 2278-1323
International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), Volume 2, Issue 6, June 2013
www.ijarcet.org
Abstract- Vision-based human action recognition is the process of tagging image sequences with action labels. The identification of movement can be performed at various levels of abstraction. In the existing system, after collecting a preliminary image set for each action by querying the Web, a logistic regression classifier is fitted to distinguish the foreground features of the correlated action from the background. In the action recognition process, PbHOG features can be used, which are more robust to background clutter and to variance of the domain. Using the initial classifier, more action images are collected incrementally while, at the same time, the model is improved. Nonnegative matrix factorization is used on this set to find the diverse pose clusters for each action, and separate local action classifiers are trained for each cluster of poses. In the proposed system, event monitoring is used to discover the most influential ordered pose pair for each specific human action. To this end, we make use of already annotated motion capture datasets and cast action segmentation as a weakly supervised temporal clustering problem for an unknown number of clusters. The annotations are used to learn a distance metric for skeleton motion from relative comparisons of the form "samples of the same action are more similar to each other than to samples of a different action". The learned distance metric is then used to cluster the test sequences. To this end, we employ a hierarchical Dirichlet process that also estimates the number of clusters.
Keywords- Action recognition, Hierarchical Dirichlet process
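The pose-cluster discovery mentioned in the abstract (nonnegative matrix factorization over the collected pose descriptors) can be sketched in miniature. This is an illustrative toy using multiplicative updates on a made-up descriptor matrix, not the authors' implementation; all names, the rank, and the iteration count are assumptions.

```python
# Minimal NMF via multiplicative updates: factor a non-negative matrix
# V (descriptors as columns) into W * H, then assign each column (one
# pose image) to its strongest component -- a crude "pose cluster".

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

def nmf(V, k, iters=200, eps=1e-9):
    import random
    random.seed(0)
    m, n = len(V), len(V[0])
    W = [[random.random() for _ in range(k)] for _ in range(m)]
    H = [[random.random() for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        WtV = matmul(transpose(W), V)
        WtWH = matmul(matmul(transpose(W), W), H)
        H = [[H[i][j] * WtV[i][j] / (WtWH[i][j] + eps) for j in range(n)]
             for i in range(k)]
        VHt = matmul(V, transpose(H))
        WHHt = matmul(matmul(W, H), transpose(H))
        W = [[W[i][j] * VHt[i][j] / (WHHt[i][j] + eps) for j in range(k)]
             for i in range(m)]
    return W, H

def pose_clusters(V, k):
    _, H = nmf(V, k)
    n = len(V[0])
    # each column goes to the component with the largest coefficient
    return [max(range(k), key=lambda i: H[i][j]) for j in range(n)]

# two obviously different toy "pose descriptors", three samples each
V = [[5, 4, 5, 0, 0, 1],
     [4, 5, 5, 1, 0, 0],
     [0, 1, 0, 5, 4, 5],
     [0, 0, 1, 4, 5, 5]]
labels = pose_clusters(V, 2)
```

On this block-structured toy matrix the two groups of columns land in different clusters, which is the behavior the pose-clustering step relies on.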
1. INTRODUCTION
Vision-based human action recognition is the process of tagging images with action labels. A robust solution to this problem has applications in domains like visual surveillance, video retrieval and human-computer interaction.
Manuscript received June, 2013.
Soumya R, PG Scholar, Computer Science and Engineering, Coimbatore Institute of Engineering and Technology, Narasipuram, Coimbatore, Tamil Nadu, India, 9895426268.
R. Gnanakumari, Assistant Professor, Computer Science and Engineering, Coimbatore Institute of Engineering and Technology, Narasipuram, Coimbatore, Tamil Nadu, India.
The task is difficult due to variations in motion performance.
The task of labeling videos containing human motion with
action classes is motivated by many applications both offline
and online. Automatic annotation [8] of video enables more
efficient searching.
Video annotation is the process of adding interactive commentary to videos, that is, adding background information about the video. Image annotation is the process by which a computer system automatically assigns metadata, in the form of captioning or keywords, to a digital image. In machine learning, unsupervised learning [8] refers to the problem of trying to find hidden structure in unlabeled data. Since the examples given to the learner are unlabeled, there is no error or reward signal to evaluate a potential solution; this distinguishes unsupervised learning from supervised learning. Unsupervised learning is closely related to the problem of density estimation in statistics. However, unsupervised learning also encompasses many other techniques that seek to summarize and explain key features of the data. Many methods employed in unsupervised learning are based on data mining methods used to preprocess data. Unsupervised learning studies how systems can learn to represent particular input patterns in a way that reflects the statistical structure of the overall collection of patterns.
Queries can be specified more naturally by the user in the case of automatic image annotation, but this is not possible in content-based image retrieval. In the case of CBIR, users are required to search by image concepts such as color and texture. Traditional methods of image retrieval, such as those used by libraries, have relied on manually annotated images, which is expensive and time-consuming, especially given the large and constantly growing image databases in existence.
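To make the "search by a color concept" idea concrete, a minimal color-histogram retrieval loop might look as follows. This is a toy sketch: images are flat pixel lists, the bin count and histogram-intersection similarity are illustrative choices, and real CBIR systems would add texture, edge and motion features as discussed above.

```python
# Toy content-based retrieval by color: images are lists of (r, g, b)
# pixels; each is indexed by a coarse color histogram and the database
# is ranked by histogram intersection with the query image.

def color_histogram(pixels, bins=4):
    step = 256 // bins
    hist = [0] * (bins ** 3)
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    total = float(len(pixels))
    return [h / total for h in hist]   # normalise so sizes don't matter

def intersection(h1, h2):
    # overlap of two normalised histograms, in [0, 1]
    return sum(min(a, b) for a, b in zip(h1, h2))

def rank(query, database):
    qh = color_histogram(query)
    scored = [(intersection(qh, color_histogram(img)), name)
              for name, img in database]
    return [name for _, name in sorted(scored, reverse=True)]

red_img = [(250, 10, 10)] * 100
blue_img = [(10, 10, 250)] * 100
reddish = [(240, 30, 20)] * 90 + [(10, 10, 250)] * 10
order = rank(red_img, [("blue", blue_img), ("reddish", reddish)])
```

A red query ranks the mostly-red image ahead of the blue one, which is exactly the kind of perceptual match manual keyword annotation struggles to provide.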
Action recognition can be improved by learning action pose representations from the Web, but this needs a large amount of training video, and it is a challenging process because it requires large labeled datasets that cover a diverse set of poses. Action recognition in uncontrolled videos is a difficult task, where it is very hard to find the large amount of training video necessary to model all the variations of the domain. This problem is addressed in this paper by proposing a generic method for action recognition. The idea is to use images collected from the Web to discover representations of actions and organize this knowledge to routinely annotate actions in videos. For this purpose, an incremental image retrieval procedure is first used to collect and clean up the required training set for
constructing the human pose classifiers. The approach is unsupervised because it requires no human intervention other than simply text querying the name of the action to an internet search engine. Its benefit is twofold: 1) it improves the retrieval of action images, and 2) it collects a large generic database of action poses, which can then be used in the categorization of
videos. It is also explored how the Web-based pose classifiers can be utilized in conjunction with limited labeled videos. Ordered pose pairs (OPP) are used to encode the temporal ordering of poses in the action model; temporal ordering of pose pairs can increase action recognition accuracy. By selecting key poses with the help of the Web-based classifiers, the categorization time can be kept low. Our experiments demonstrate that, with or without available video data, the pose models learned from the Web can improve the performance of action recognition systems.
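Why temporal ordering helps can be illustrated with a toy ordered-pose-pair count. The paper's actual OPP construction may differ; this sketch only shows that two actions sharing the same set of poses become separable once the order in which poses occur is encoded.

```python
from collections import Counter

# Toy OPP-style descriptor: given per-frame pose labels, count how often
# pose p appears before pose q in the sequence.

def opp_descriptor(pose_sequence):
    counts = Counter()
    for i, p in enumerate(pose_sequence):
        for q in pose_sequence[i + 1:]:
            counts[(p, q)] += 1
    return counts

sit_down = ["stand", "bend", "sit"]
stand_up = ["sit", "bend", "stand"]
d1 = opp_descriptor(sit_down)
d2 = opp_descriptor(stand_up)
# Both actions contain exactly the same poses, so a bag-of-poses view
# cannot separate them; the ordered pair ("stand", "sit") occurs only
# in sit_down, so the OPP descriptors differ.
```

Here d1[("stand", "sit")] is 1 while d2[("stand", "sit")] is 0, so a classifier over OPP descriptors can discriminate actions that share similar intermediate poses.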
First, a system is proposed which incrementally collects action images and videos from the Web by simple text querying. Second, action models are built from the noisy set of images in an unsupervised fashion: a method is presented for cleaning the results of keyword retrieval, and pose models are learned from this cleaned dataset. Third, PbHOG features are proposed for use in the presence of background clutter. The probability of boundary (Pb) operator (PbCanny) is used as an edge detector to delineate the object boundaries, and HOG features are then extracted from the Pb responses. The action models can be used to re-rank retrieved images and improve retrieval precision, and the action models learned from one set of videos are adapted for recognition in another set of videos using a transfer topic model. Fourth, the action pose models are used to annotate human actions in uncontrolled videos (e.g. YouTube videos). The action pose models learnt from the Web can be used to locate the distinctive poses inside the videos and, further, to improve action recognition; this key pose selection scheme also reduces the training time to a great extent. Fifth, image data collected from the Web is used jointly with video data to improve action recognition. Sixth, the OPP method is proposed for temporal reasoning about body poses within each action, together with the use of Web-based pose classifiers for selecting key poses from human tracks for efficient training. The OPP descriptor goes one step further and models the temporal relationships between poses; by this, actions that share similar intermediate poses can be discriminated more accurately.
The main contributions are:
• Proposing a system which incrementally collects action images from the Web by simple text querying,
• Building action models from the noisy set of images in an unsupervised fashion, and
• Using the models to annotate human actions in uncontrolled videos, such as YouTube videos.
2. RELATED WORK
Action recognition [3] can be achieved using local measurements in terms of spatio-temporal interest points. In spatial recognition, local features have recently been combined with SVMs in a robust classification approach. In a similar manner, the combination of local space-time features and SVMs is investigated here, and the resulting approach is applied to the recognition of human actions. Typical scenarios include scenes with cluttered, moving backgrounds, a non-stationary camera, scale variations, individual variations in the appearance and clothing of people, changes in lighting and viewpoint, and so forth. All of these conditions introduce challenging problems that have been addressed in computer vision in the past.
Recognizing human actions [2] is a key component in many computer vision applications, such as video surveillance, human-computer interfaces, video indexing and browsing, recognition of gestures, analysis of sports events and dance choreography. Some recent work in the area of action recognition has shown that it is useful to analyze actions by looking at the video sequence as a space-time intensity volume. Analyzing actions directly in the space-time volume avoids some limitations of traditional approaches that involve key frames.
Automatically categorizing or localizing different actions [8] in video sequences is very useful for a variety of tasks, such as video surveillance, object-level video summarization, video indexing and digital library organization. However, robust action recognition remains a challenging task for computers due to cluttered backgrounds, camera motion, occlusion, and geometric and photometric variance of objects. A lot of previous work has been presented to address these questions. One popular approach is to apply tracked motion trajectories of body parts to action recognition; this requires much human supervision, and the robustness of the algorithm is highly dependent on the tracking system. Ke et al. apply spatio-temporal volumetric features that efficiently scan video sequences in space and time. Another approach is to use local space-time patches of videos; Laptev et al. present a space-time interest point detector based on the idea of the Harris and Förstner interest point operators.
3. IMAGE REPRESENTATION
For training classifiers, a large amount of data is needed; collecting such data manually is very costly. Data collected from the Web are more diverse and less biased than home-made datasets, and may therefore be more suitable for real-world tasks. Collecting useful training images from the Web is difficult, however, due to various challenges. For a given query, the ratio of non-relevant images in the retrieved dataset is high, and the relevant image set comprises irregular subsets. To build a consistent training set, each of these subsets should be recognized and represented in the final set. Action images are a set of images in which there is at least one person engaged in a particular action. For a given query, the number of non-relevant images will be high; sometimes more than 50% of the images can be irrelevant. The results of keyword retrieval must therefore be cleaned, and pose models are then learned from this cleaned dataset. After collecting the relevant images, the first step is to extract the location of the human; if no human is detected in an image, that image is discarded. A human detector, which is effective in detecting people [10], can be used for this purpose.
Figure 1. System architecture of the human action recognition system.
Figure 1 shows the system architecture of the human action recognition system. First, action images are collected from web pages by simply text querying the name of the action to a web search engine. A person detector is applied to the images, and the video is converted into frames. The action classifier then classifies the actions based on the poses. Once the actions have been identified and classified, they can be annotated.
3.1 Image collection from webpages
Collecting useful training image datasets from the Web can be difficult due to various challenges. First, for a given keyword-based image search, the ratio of non-relevant images in the retrieved dataset tends to be very high. Second, the relevant image set mostly comprises discontinuous subsets, due to different poses, viewpoints and appearances. In order to build a reliable and effective training set, each of these subsets should be identified and represented in the final collected dataset. The number of objects, as well as the objects' pose and scale, vary quite a bit across retrieved images.
3.2 Person Detection
The detected humans are not always centralized within the
bounding box. This issue can be solved via an alignment step
based on the head-area response. Since there is high variance
in the limb areas, head detections are the most reliable output
of the detector. The head area should be positioned in the
upper center of the bounding box, so for each image the
detector's output for the head is taken and the bounding box is
updated accordingly.
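The alignment step can be sketched as a simple translation of the bounding box so that the detected head center lands at its expected position. The relative head position `head_rel` (upper center) is an assumed parameter of this sketch, not a value given in the paper.

```python
def align_box(box, head, head_rel=(0.5, 0.15)):
    """Shift a person bounding box so the detected head center lands at
    the expected upper-center position, per the alignment step in Sec. 3.2.

    box = (x, y, w, h); head = (hx, hy) is the detected head center in
    image coordinates; head_rel is the assumed relative head position
    inside the box (fraction of width/height from the top-left corner).
    """
    x, y, w, h = box
    target_x = x + head_rel[0] * w   # where the head *should* be
    target_y = y + head_rel[1] * h
    dx = head[0] - target_x          # observed horizontal offset
    dy = head[1] - target_y          # observed vertical offset
    return (x + dx, y + dy, w, h)    # translate the box to re-center the head

# example: head detected slightly right of and below its expected spot
print(align_box((10, 20, 64, 128), head=(48, 43.2)))
```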
3.3 Feature extraction
Once the humans are centralized within the
bounding box, an image descriptor is extracted for each
detected area. The descriptor is used to provide a good
representation of the poses. The Histogram of Oriented
Gradients (HOG) is successful for finding humans in images,
but the clutter in web images makes it difficult to obtain a
clean pose description: a simple gradient-filtering-based HOG
descriptor is affected by noisy responses. The probability-of-
boundary (pb) operator can instead be used as the edge
detector.
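The idea can be sketched as a minimal HOG-style descriptor: an orientation histogram whose votes are weighted by edge strength. In the paper's setting the strength map would come from the pb operator rather than raw gradients, which suppresses background clutter; the function below is an illustrative simplification, not the full pb or HOG pipeline.

```python
import numpy as np

def orientation_histogram(edge_strength, orientation, n_bins=9):
    """Minimal HOG-style pose descriptor: a histogram of edge orientations
    (in [0, pi)) weighted by per-pixel edge strength. Feeding in pb edge
    strengths instead of raw gradient magnitudes reduces noisy responses."""
    bins = np.linspace(0, np.pi, n_bins + 1)             # unsigned orientation bins
    idx = np.clip(np.digitize(orientation, bins) - 1, 0, n_bins - 1)
    hist = np.zeros(n_bins)
    np.add.at(hist, idx.ravel(), edge_strength.ravel())  # strength-weighted votes
    return hist / (hist.sum() + 1e-8)                    # L1-normalize
```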
3.4 Testing Input
Using the training videos, the input video is tested
with one-vs.-all SVM classifiers over the OPP descriptors. In
the SVM classifier, the Hellinger kernel can be used, whose
feature map can be computed explicitly by taking the square
root of the descriptor values. When video data is available, it
can be used to improve the action models learned from Web
image data.
3.5 Testing Feature Extraction
Web-based classifiers are effective in selecting the
reliable and informative parts of the sequences, and only
those detections are used for action inference. This selection
can lessen the size of the testing data and hence greatly
reduce the computation time. For this purpose, the already
trained Web-based pose classifiers can be used. The selected
poses and the associated local motion information can further
be utilized for efficient action classification.
Figure 2. Output of the person detector.
3.6 Action classification using classifier NMF
Using the training videos, one-vs.-all SVM
classifiers are learned over the OPP descriptors. In the SVM
classifier, the Hellinger kernel can be used, whose feature
map can be computed explicitly by taking the square root of
the descriptor values. When video data is available, the
collected image data is used together with any available
action videos to learn better classifiers over the combined
data. Another approach is to use the classifiers learned from
Web image data to select the useful parts of the human tracks
in videos, facilitating more effective and efficient recognition.
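The square-root trick mentioned above can be verified directly: the Hellinger kernel K(x, y) = Σᵢ √(xᵢyᵢ) equals the plain dot product of element-wise square-rooted descriptors, so a linear SVM on mapped features behaves like a Hellinger-kernel SVM. A minimal sketch, assuming non-negative (e.g. L1-normalized histogram) descriptors:

```python
import numpy as np

def hellinger_map(x):
    """Explicit feature map for the Hellinger kernel: taking element-wise
    square roots of non-negative descriptors makes the linear kernel on
    the mapped features equal the Hellinger kernel on the originals."""
    return np.sqrt(x)

# check: dot product of mapped vectors equals the Hellinger kernel value
x = np.array([0.5, 0.3, 0.2])
y = np.array([0.1, 0.6, 0.3])
k_explicit = hellinger_map(x) @ hellinger_map(y)
k_direct = np.sum(np.sqrt(x * y))
print(np.isclose(k_explicit, k_direct))   # True
```

This is why the feature map "can be explicitly computed": no kernel trick is needed at test time, and a fast linear SVM suffices.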
3.7 Metric Learning From poses for Temporal clustering
of Human Motions
Using the action labels, constraints can be
formulated in terms of similarity and dissimilarity between
triplets of feature vectors. Under such constraints, the matrix
A can be learned by employing Information-Theoretic Metric
Learning (ITML). ITML finds a suitable matrix A by
formulating the problem in terms of how similar A is to a
given distance parameterized by A0 (typically, the identity or
the sample covariance). Provided that the distance in question
is a Mahalanobis distance, the problem can be treated as the
similarity of two Gaussian distributions parameterized by A
and A0, respectively. That leads to an information-theoretic
objective in terms of the Kullback-Leibler divergence
between both Gaussians. This divergence can be expressed as
a LogDet divergence, thus yielding the following
optimization problem:
min_{A, ξ}  D_ld(A, A0) + λ · D_ld(diag(ξ), diag(c))    (1)

where D_ld is the LogDet divergence, c is the vector of
constraints, ξ is a vector of slack variables (initialized to c
and constrained to be component-wise non-negative) that
guarantees the existence of a solution, and λ is a parameter
controlling the tradeoff between satisfying the constraints
and keeping the learned distance close to the one
parameterized by A0.
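The LogDet divergence that ITML minimizes has the closed form D_ld(A, A0) = tr(A·A0⁻¹) − log det(A·A0⁻¹) − n for n×n positive-definite matrices, and it vanishes exactly when A equals the prior A0. A minimal sketch of evaluating it:

```python
import numpy as np

def logdet_div(A, A0):
    """LogDet divergence D_ld(A, A0) = tr(A A0^-1) - log det(A A0^-1) - n,
    the measure ITML minimizes to keep the learned Mahalanobis matrix A
    close to the prior A0 (identity or sample covariance)."""
    n = A.shape[0]
    M = A @ np.linalg.inv(A0)
    sign, logdet = np.linalg.slogdet(M)   # numerically stable log-determinant
    return np.trace(M) - logdet - n

# sanity check: the divergence vanishes when A equals the prior
print(logdet_div(np.eye(3), np.eye(3)))   # 0.0
```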
Figure 3. Annotated video frames.
4. PERFORMANCE COMPARISON
To verify the advantages of the proposed work, its
performance has to be evaluated. The objective of this section
is to compare the multiple-action recognition system with the
single-action recognition system.
The dataset for the experiment was a synthetic
dataset. For multiple-action recognition, actions such as
sitting, jumping, and walking were collected, and these were
annotated.
Table 1. Comparison of accuracy

Methods                       | Accuracy (%)
------------------------------|-------------
Single Action Recognition     | 85
Multiple Action Recognition   | 93
5. RESULTS AND DISCUSSIONS
Figure 4 shows the performance of the multiple-action
recognition system based on the accuracy parameter. It is
found that the accuracy of multiple-action recognition is
higher than that of the single-action recognition system.
Figure 4. Performance Comparison
6. CONCLUSION
In this paper, videos are collected from the web,
and the actions are identified and classified based on pose.
Performance evaluation shows that the multiple-action
recognition system has higher accuracy than the single-action
recognition system.
7. REFERENCES
1. A. López-Méndez, J. Gall, and J. R. Casas,
"Metric learning from poses for temporal
clustering of human actions."
2. M. Blank, L. Gorelick, E. Shechtman, M. Irani,
and R. Basri, "Actions as space-time shapes," in
Proc. ICCV, 2005, vol. 2, pp. 1395–1402.
3. C. Schuldt, I. Laptev, and B. Caputo, (2004)
"Recognizing human actions: A local SVM
approach," in Proc. ICPR, pp. 32–36.
4. T.-K. Kim, S.-F. Wong, and R. Cipolla, (2007)
"Tensor canonical correlation analysis for action
classification," presented at the CVPR,
Minneapolis, MN.
5. D. Lee and H. Seung, "Algorithms for non-negative
matrix factorization," in Proc. NIPS, 2001,
pp. 556–562.
6. D. Tran and A. Sorokin, "Human activity recognition
with metric learning," in Proc. ECCV, 2008,
pp. 548–561.
7. F. Schroff, A. Criminisi, and A. Zisserman, "Harvesting
image databases from the web," presented at the
ICCV, Rio de Janeiro, Brazil, 2007.
8. J. C. Niebles, H. Wang, and L. Fei-Fei, (2006)
"Unsupervised learning of human action categories
using spatial-temporal words," in Proc. BMVC,
pp. 1249–1258.
Soumya R is currently pursuing M.E
Computer Science and Engineering at
Coimbatore Institute of Engineering and
Technology, Coimbatore, Tamil Nadu,
(Anna University, Chennai). She completed
her B.Tech in Information Technology
from M.E.S College of Engineering,
Kuttipuram, Kerala (Calicut University,
Kerala) in 2010.
R.GnanaKumari is currently Assistant
Professor in the Department of Computer
Science, at Coimbatore Institute of
Engineering and Technology, Coimbatore,
Tamil Nadu, (Anna University, Chennai).
She completed her B.E in Computer
Science and Engineering from Sri
Ramakrishna Engineering College,
Coimbatore, in 2002 and her M.E in
Computer Science and Engineering from
Anna University of Technology in 2011.
She has about 3 years of experience in
industry and 7.6 years of experience in teaching.