Video summarization of segmented video is an essential process for video thumbnails, video surveillance, and video downloading. Summarization extracts a few frames from each scene and creates a summary video that conveys the whole course of action of the full video within a short duration. The proposed research work discusses the segmentation and summarization of frames. A genetic algorithm (GA) for segmentation and summarization is used so that the highlights of an event can be viewed from the few important frames selected. The GA is modified to select only key frames for summarization, and the modified GA is compared with the standard GA.
Video Key-Frame Extraction using Unsupervised Clustering and Mutual Comparison (CSCJournals)
The document presents a novel method for extracting key frames from videos using unsupervised clustering and mutual comparison. It assigns weights of 70% to color (HSV histogram) and 30% to texture (GLCM) when computing frame similarity for clustering. It then performs mutual comparison of extracted key frames to remove near duplicates, improving accuracy. The algorithm is computationally simple and able to detect unique key frames, improving concept detection performance as validated on open databases.
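The weighted colour/texture similarity can be sketched as follows (a minimal illustration: the 8-bin HSV histogram and the histogram-intersection measure are assumptions, and `texture_sim` stands in for the GLCM-based texture score):

```python
import numpy as np

def hsv_histogram(frame_hsv, bins=8):
    """Normalized HSV colour histogram of a frame (H x W x 3, values in [0, 1])."""
    hist, _ = np.histogramdd(frame_hsv.reshape(-1, 3), bins=bins, range=[(0, 1)] * 3)
    return hist.ravel() / hist.sum()

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms."""
    return float(np.minimum(h1, h2).sum())

def frame_similarity(colour_sim, texture_sim, w_colour=0.7, w_texture=0.3):
    """Weighted combination used for clustering: 70% colour, 30% texture."""
    return w_colour * colour_sim + w_texture * texture_sim
```

Near-duplicate key frames would then be the pairs whose combined similarity stays above a chosen threshold during the mutual-comparison pass.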
This document discusses using genetic algorithms for image enhancement and segmentation. It begins with an overview of genetic algorithms and how they can be applied to optimization problems like image processing. Specifically, it describes how genetic algorithms use operators like crossover and mutation to evolve solutions over generations. It then discusses how genetic algorithms can be used for two main image processing tasks: image enhancement to improve image quality, and image segmentation to partition an image into meaningful regions. The key steps of the genetic algorithm for these tasks are described, including initializing a population, defining a fitness function, and applying genetic operators to evolve better solutions across generations.
In this paper, several simple regression models are compared with simple deep learning regression models which use both videos as well as semantic features to predict memorability scores.
Vector quantization (VQ) is a powerful technique in the field of digital image compression. The generalized residual codebook is used to remove distortion in the reconstructed image and further enhance image quality. Generalized Residual Vector Quantization (GRVQ) has previously been optimized with Particle Swarm Optimization (PSO) and Honey Bee Mating Optimization (HBMO). The performance of GRVQ degraded because the PSO algorithm converges unstably when particle velocity is high, and because HBMO depends on many parameters that must be tuned to reduce codebook size. In this paper, therefore, the Artificial Plant Optimization Algorithm (APOA) is used to optimize the parameters of GRVQ. Extensive experiments demonstrate that the proposed APOA-GRVQ algorithm outperforms the existing algorithms in terms of quantization accuracy and computational accuracy.
IRJET- Comparison and Simulation based Analysis of an Optimized Block Mat... (IRJET Journal)
This document compares an optimized block matching algorithm to the four-step search algorithm. It first provides background on block matching algorithms and the motion estimation techniques used in video compression. It then describes the existing four-step search algorithm and its process of checking 17 to 27 points to find the best motion-vector match. The document proposes a simpler and more efficient four-step search algorithm that divides the search area into quadrants: it checks 3 points in the first phase to select a quadrant, then finds the lowest-cost point in the second phase to set as the new origin, reducing computational complexity compared to the standard four-step search.
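A toy sketch of the block matching cost search that such algorithms build on (plain exhaustive SAD over a small window, for illustration only; the optimized four-step search reaches a similar answer while evaluating only a handful of candidate points per step):

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(block_a.astype(np.int64) - block_b.astype(np.int64)).sum())

def best_motion_vector(ref, cur, top, left, size=8, radius=4):
    """Exhaustive SAD search around (top, left); returns ((dy, dx), cost)."""
    block = cur[top:top + size, left:left + size]
    best_mv, best_cost = (0, 0), None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            # skip candidates that fall outside the reference frame
            if 0 <= y and y + size <= ref.shape[0] and 0 <= x and x + size <= ref.shape[1]:
                cost = sad(ref[y:y + size, x:x + size], block)
                if best_cost is None or cost < best_cost:
                    best_cost, best_mv = cost, (dy, dx)
    return best_mv, best_cost
```

For a block copied from the reference at a (2, 1) offset, the search recovers the motion vector (2, 1) with zero cost.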
Shot Boundary Detection In Videos Sequences Using Motion Activities (CSCJournals)
Video segmentation is fundamental to a number of applications related to video retrieval and analysis. To realize content-based video retrieval, video information should be organized to elaborate the structure of the video, and segmenting the video into shots is an important first step. This paper presents a new method of shot boundary detection based on motion activities in a video sequence. The proposed algorithm is tested on various video types, and the experimental results show that it effectively and reliably detects shot boundaries.
1. The document proposes an efficient algorithm to retrieve videos from a database using a video clip as a query.
2. Key features like color, texture, edges and motion are extracted from video shots and clusters are created using these features to reduce search time complexity.
3. When a query video is given, its features are used to search the closest cluster. Then sequential matching of additional features and shot lengths is done to find the most similar matching videos from the database.
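The cluster-first lookup in step 3 can be sketched as follows (Euclidean distance over hypothetical fixed-length feature vectors; the actual features and similarity measure are the colour, texture, edge, and motion features described above):

```python
import numpy as np

def nearest_cluster(query, centroids):
    """Index of the cluster centroid closest to the query feature vector."""
    d = np.linalg.norm(np.asarray(centroids, float) - np.asarray(query, float), axis=1)
    return int(d.argmin())

def rank_cluster_members(query, members):
    """Sequentially match the query against shots in the chosen cluster,
    returning member indices ordered from most to least similar."""
    d = np.linalg.norm(np.asarray(members, float) - np.asarray(query, float), axis=1)
    return list(np.argsort(d))
```

Restricting the sequential matching to one cluster is what cuts the search time: only the members of the closest cluster are compared in detail.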
The document describes a method for image fusion using the stationary wavelet transform and particle swarm optimization. Image fusion combines information from multiple source images to extract the relevant content of each. The proposed method applies the stationary wavelet transform to the source images to decompose them into wavelet coefficients, uses particle swarm optimization to optimize the transformed coefficients, and applies the inverse stationary wavelet transform to the optimized coefficients to generate the fused image. The method is tested on various images, and performance is evaluated using metrics such as peak signal-to-noise ratio, entropy, mean square error, and standard deviation.
Review of Diverse Techniques Used for Effective Fractal Image Compression (IRJET Journal)
This document reviews different techniques for fractal image compression to enhance compression ratio while maintaining image quality. It discusses algorithms like quadtree partitioning with Huffman coding (QPHC), discrete cosine transform based fractal image compression (DCT-FIC), discrete wavelet transform based fractal image compression (DWTFIC), and Grover's quantum search algorithm based fractal image compression (QAFIC). The document also analyzes works applying these techniques and concludes that combining QAFIC with the tiny block size processing algorithm may further improve compression ratio with minimal quality loss.
Extended Fuzzy Hyperline Segment Neural Network for Fingerprint Recognition (CSCJournals)
In this paper we propose the Extended Fuzzy Hyperline Segment Neural Network (EFHLSNN) and its learning algorithm, an extension of the Fuzzy Hyperline Segment Neural Network (FHLSNN). The fuzzy hyperline segment is an n-dimensional hyperline segment defined by two end points with a corresponding extended membership function. The fingerprint feature extraction process is based on the FingerCode technique. The performance of EFHLSNN is verified on the PolyU HRF fingerprint database, where EFHLSNN proves superior to FHLSNN in generalization, training time, and recall time.
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews across the whole field of engineering, science, and technology, including new teaching methods, assessment, validation, and the impact of new technologies, and it continues to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected through double peer review to ensure originality, relevance, and readability. Articles published in the journal can be accessed online.
One-Sample Face Recognition Using HMM Model of Fiducial Areas (CSCJournals)
In most real-world applications, multiple image samples of individuals are not easy to collate for direct implementation of recognition or verification systems, so there is a need to perform these tasks even when only one training sample per person is available. This paper describes an effective algorithm for recognition and verification with one sample image per class. It uses the two-dimensional discrete wavelet transform (2D DWT) to extract features from images, and a hidden Markov model (HMM) for training, recognition, and classification. Tested on a subset of the AT&T database, it achieved up to 90% correct classification (hit rate) with a false acceptance rate (FAR) of 0.02%.
The document discusses a system for classifying human actions in videos. It tracks subjects using adaptive background subtraction and extracts their bounding boxes. It then recognizes poses within each frame using Histogram of Oriented Gradients (HOG) templates. It also maintains a queue of the last K frames to classify actions based on sequences of poses over multiple frames. The system was tested on videos of judo matches and exercises performed by the authors.
This document compares image enhancement and analysis techniques using image processing and wavelet techniques on thermal images. It discusses various image enhancement methods such as converting images to grayscale, histogram equalization, contrast enhancement, linear and adaptive filtering, morphology, FFT transforms, and wavelet-based techniques including image fusion, denoising, and compression. Results showing enhanced, denoised, and compressed images are presented and analyzed. The document concludes that wavelet techniques provide better enhancement of thermal images compared to traditional image processing methods.
VIDEO SEGMENTATION FOR MOVING OBJECT DETECTION USING LOCAL CHANGE & ENTROPY B... (csandit)
Motion detection and object segmentation are important research areas in image and video processing and computer vision. The techniques and mathematical models used to detect and segment region-of-interest (ROI) objects form the algorithmic modules of various high-level techniques in video analysis, object extraction, classification, and recognition. Detecting moving objects is significant in many tasks, such as video surveillance and moving-object tracking. The design of a video surveillance system is directed at automatic identification of events of interest, especially tracking and classification of moving objects. This research proposes an entropy-based, real-time, adaptive, non-parametric window thresholding algorithm for change detection. Based on an estimate of the scatter of changed regions in a difference image, a threshold for every image block is calculated discriminatively using an entropy measure, and the global threshold is then obtained by averaging the thresholds of all image blocks in the frame. The block threshold is calculated differently for regions of change and for background. Experimental results show the proposed thresholding algorithm performs well for change detection with high efficiency.
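A sketch of per-block entropy thresholding averaged into a global threshold, using Kapur's entropy criterion as a stand-in for the entropy structure described above (the criterion, the 16-pixel block size, and 256 grey levels are all assumptions, not the paper's exact choices):

```python
import numpy as np

def kapur_threshold(block, levels=256):
    """Entropy-maximizing (Kapur) threshold for one block of a difference image.
    Expects non-negative integer pixel values below `levels`."""
    hist = np.bincount(block.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()
    cdf = np.cumsum(p)
    eps = 1e-12
    best_t, best_h = 0, -np.inf
    for t in range(1, levels - 1):
        w0, w1 = cdf[t], 1.0 - cdf[t]
        if w0 < eps or w1 < eps:
            continue  # all mass on one side: no valid split at this level
        p0, p1 = p[:t + 1] / w0, p[t + 1:] / w1
        # sum of entropies of the background and change distributions
        h = -(p0 * np.log(p0 + eps)).sum() - (p1 * np.log(p1 + eps)).sum()
        if h > best_h:
            best_h, best_t = h, t
    return best_t

def global_threshold(diff_image, block=16):
    """Average the per-block thresholds into one global threshold for the frame."""
    rows, cols = diff_image.shape
    ts = [kapur_threshold(diff_image[y:y + block, x:x + block])
          for y in range(0, rows - block + 1, block)
          for x in range(0, cols - block + 1, block)]
    return sum(ts) / len(ts)
```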
An Efficient Method For Gradual Transition Detection In Presence Of Camera Mo... (ijafrc)
This document summarizes an algorithm for detecting gradual transitions in videos that is robust to camera motion. It uses local key points detected in video frames and matches them between adjacent frames. A contrast context histogram descriptor is used to represent each key point. Transitions are detected by a drop in the number of matched key points between frames. The start and end frames of transitions are refined by looking for stable regions with consistent numbers of matches. A "twin comparison" approach uses two thresholds to distinguish between abrupt cuts and more gradual fades/dissolves based on the number of matched features between frames. The algorithm aims to correctly identify transitions while being robust to changes from camera or object motion within shots.
From Unsupervised to Semi-Supervised Event Detection (Vincent Chu)
This document outlines research on unsupervised temporal commonality discovery and semi-supervised facial action unit detection. It first discusses previous work on unsupervised commonality discovery in images and videos. It then proposes a method called Temporal Commonality Discovery (TCD) to discover common events in unlabeled videos in an unsupervised manner using integer programming and a branch-and-bound search algorithm. The document also discusses how selective transfer machines can be used to perform personalized facial action unit detection by minimizing person-specific and occurrence biases in the training data. It evaluates TCD on synthesized and real-world video datasets and evaluates selective transfer machines on several facial expression datasets.
IMAGE AUTHENTICATION THROUGH Z-TRANSFORM WITH LOW ENERGY AND BANDWIDTH (IAZT) (IJNSA Journal)
In this paper a Z-transform based image authentication technique, termed IAZT, is proposed to authenticate grey-scale images. The technique uses energy-efficient, low-bandwidth invisible data embedding with minimal computational complexity. Roughly half the bandwidth of the traditional Z-transform is required when transmitting multimedia content such as images with an authenticating message over a network. The technique may be used for copyright protection or ownership verification. Experimental results are computed and compared with existing authentication techniques such as Li's method [11], SCDFT [13], the Region-Based method [14], and many more, based on Mean Square Error (MSE), Peak Signal to Noise Ratio (PSNR), Image Fidelity (IF), Universal Quality Image (UQI), and the Structural Similarity Index Measure (SSIM); the comparison shows better performance for IAZT.
1. The document proposes a Modified Fuzzy C-Means (MFCM) algorithm to segment brain tumors in noisy MRI images.
2. The conventional Fuzzy C-Means algorithm is sensitive to noise, so the MFCM adds an adaptive filtering step during segmentation.
3. The MFCM incorporates neighboring pixel membership values to reduce each pixel's resistance to being clustered, improving segmentation in noisy images.
TARGET DETECTION AND CLASSIFICATION IMPROVEMENTS USING CONTRAST ENHANCED 16-B... (sipij)
This document describes research on improving target detection and classification in infrared videos using 16-bit data and contrast enhancement techniques. Two contrast enhancement methods are explored: 1) histogram matching 16-bit videos to an 8-bit reference frame, and 2) a second order histogram matching algorithm that preserves the 16-bit nature of videos while enhancing contrast. Experimental results showed that the second method improved target detection performance using YOLO and classification performance using ResNet compared to previous 8-bit results and the first histogram matching method.
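Plain first-order histogram matching, the baseline the first method builds on, can be sketched as follows (this is the textbook CDF-mapping formulation for 8-bit frames, not the paper's second-order, 16-bit-preserving algorithm):

```python
import numpy as np

def match_histogram(source, reference, levels=256):
    """Map the source frame's grey levels so its CDF follows the reference CDF."""
    s_cdf = np.cumsum(np.bincount(source.ravel(), minlength=levels)) / source.size
    r_cdf = np.cumsum(np.bincount(reference.ravel(), minlength=levels)) / reference.size
    # for each source level, pick the first reference level whose CDF reaches it
    lut = np.searchsorted(r_cdf, s_cdf).clip(0, levels - 1)
    return lut[source]
```

Matching a frame against itself leaves it unchanged; matching against a constant reference collapses all levels to that constant, which illustrates why contrast enhancement depends so heavily on the choice of reference.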
This document summarizes a research paper on tracking multiple targets using the mean shift algorithm. It begins by stating that multi-target tracking is challenging due to factors like noise, clutter, occlusions, and sudden changes in velocity. The mean shift algorithm is then introduced as a kernel-based tracking method that works by iteratively shifting target locations to their mean shifts. Targets are represented using histograms within elliptical regions, and the Bhattacharyya coefficient is used to measure similarity between target models and candidates. Experimental results on a video sequence show the algorithm can accurately track targets under small displacements, but performance degrades for large displacements, fast motion, or occlusions. In conclusion, the mean shift algorithm provides a simple method for multi-target tracking.
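The Bhattacharyya coefficient used to compare target and candidate histograms, and the distance derived from it, are short enough to transcribe directly (standard formulas; the histograms are assumed already normalized):

```python
import numpy as np

def bhattacharyya_coefficient(p, q):
    """Similarity between two normalized histograms; 1.0 means identical."""
    return float(np.sqrt(np.asarray(p, float) * np.asarray(q, float)).sum())

def bhattacharyya_distance(p, q):
    """Distance the mean shift tracker minimizes: d = sqrt(1 - rho(p, q))."""
    return float(np.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q))))
```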
Comparative Study of Various Algorithms for Detection of Fades in Video Seque... (theijes)
In the multimedia environment, digital data has gained importance in daily routine. Large volumes of video, such as entertainment, news, cartoon, and sports video, are accessed by the masses for their different needs. Within video processing, shot boundary detection is a current research area with a large impact on effective browsing, retrieval, and searching of video; it serves as the starting point for constructing the content structure of videos. Video processing technology has the crucial job of providing valid information from videos without loss of information. This paper surveys various novel algorithms for detecting fade-in and fade-out proposed by renowned researchers using different methods, and it also emphasizes the core concepts underlying the different detection schemes for the most used video transition effect: fades.
This document presents a method for extracting key frames from videos using discrete wavelet transform (DWT) statistics. It begins with background on video frames, scene changes, and DWT wavelet frequency components. It then describes the proposed key frame extraction algorithm, which involves: 1) applying DWT to consecutive video frames, 2) calculating differences between detail coefficients, 3) computing mean and standard deviation of differences, 4) setting thresholds based on mean and standard deviation, and 5) identifying frames where two difference values exceed thresholds as key frames. The method is applied to sample videos and parameters are tuned to effectively detect key frames for video summarization.
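The five steps can be sketched with a one-level Haar transform standing in for the DWT (the subband choice and the `mean + k*std` threshold form are assumptions used to keep the sketch short):

```python
import numpy as np

def haar_details(frame):
    """One-level 2D Haar transform; returns the three detail subbands (LH, HL, HH)."""
    a = np.asarray(frame, dtype=float)
    rl = (a[:, 0::2] + a[:, 1::2]) / 2.0  # row low-pass
    rh = (a[:, 0::2] - a[:, 1::2]) / 2.0  # row high-pass
    lh = (rl[0::2, :] - rl[1::2, :]) / 2.0
    hl = (rh[0::2, :] + rh[1::2, :]) / 2.0
    hh = (rh[0::2, :] - rh[1::2, :]) / 2.0
    return lh, hl, hh

def key_frame_indices(frames, k=1.0):
    """Steps 1-5: difference the detail coefficients of consecutive frames, then
    flag frame i+1 as a key frame when its difference exceeds mean + k*std."""
    diffs = []
    for f0, f1 in zip(frames, frames[1:]):
        d = sum(np.abs(b1 - b0).sum()
                for b0, b1 in zip(haar_details(f0), haar_details(f1)))
        diffs.append(d)
    diffs = np.asarray(diffs)
    threshold = diffs.mean() + k * diffs.std()
    return [i + 1 for i, d in enumerate(diffs) if d > threshold]
```

A run of flat frames followed by a textured frame yields one large detail-coefficient difference, and only the frame at that scene change is flagged.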
VISUAL ATTENTION BASED KEYFRAMES EXTRACTION AND VIDEO SUMMARIZATION (cscpconf)
Recent developments in digital video and the drastic increase in internet use have increased the number of people searching for and watching videos online. To make searching easy, a summary may be provided along with each video. The summary should be effective enough that users learn the content of the video without having to watch it fully, so it should consist of key frames that express the content and context of the video. This work suggests a method to extract key frames that express most of the information in the video by quantifying the visual attention each frame commands. Visual attention is quantified using a descriptor called the attention quantifier, based on the human attention mechanism in which colour conspicuousness and motion attract more attention. Each frame is therefore assigned an attention parameter based on its colour conspicuousness and motion, and key frames are extracted and summarized adaptively according to this value. The framework produces a meaningful video summary.
Video indexing using shot boundary detection approach and search tracks (IAEME Publication)
This document summarizes a research paper that proposes a video indexing and retrieval method using shot boundary detection and audio track detection. It first extracts keypoints from divided frames to create a new frame sequence. Support vector machines are then used to match keypoints between frames to detect different types of shot transitions. Audio energy is also analyzed to detect sound tracks. The method aims to reduce computational costs by removing non-boundary frames and representing transition frames as thumbnails. It was tested on CCTV and film videos.
This document summarizes a research paper that proposes using a technique called "tiny video representation" to classify and retrieve video frames and videos. The proposed method involves preprocessing videos by splitting them into frames, removing black bars, resizing frames to 32x32 pixels, and using affinity propagation to cluster unique frames. This creates a "tiny video database" that can be used for content-based copy detection, video categorization through classification of frames, and retrieval of related videos through nearest neighbor searches. Experimental results showed the tiny video database approach improved classification precision and recall compared to using individual frames or videos.
A Novel Approach for Tracking with Implicit Video Shot Detection (IOSR Journals)
1) The document presents a novel approach that combines video shot detection and object tracking using a particle filter to create an efficient tracking algorithm with implicit shot detection.
2) It uses a robust pixel difference method for shot detection that is resistant to sudden illumination changes. It then applies a particle filter for tracking that uses color histograms and Bhattacharyya distance to track objects across frames.
3) The key innovation is that the tracking algorithm is only initiated after a shot change is detected, reducing computational costs by discarding unneeded frames and triggering tracking only when needed. This provides a more efficient solution for tracking large video datasets with minimal preprocessing.
This document summarizes an algorithm for detecting gradual transitions in videos that is robust to camera motion. It uses local key points detected in video frames and matches them between adjacent frames. A contrast context histogram descriptor is used to represent each key point. Transitions are detected by a drop in the number of matched key points between frames. The start and end frames of transitions are refined by looking for stable regions with consistent numbers of matches. A "twin comparison" approach uses two thresholds to distinguish between abrupt cuts and more gradual fades/dissolves based on the number of matched features between frames. The algorithm aims to correctly identify transitions while being robust to changes from camera or object motion within shots.
From Unsupervised to Semi-Supervised Event DetectionVincent Chu
This document outlines research on unsupervised temporal commonality discovery and semi-supervised facial action unit detection. It first discusses previous work on unsupervised commonality discovery in images and videos. It then proposes a method called Temporal Commonality Discovery (TCD) to discover common events in unlabeled videos in an unsupervised manner using integer programming and a branch-and-bound search algorithm. The document also discusses how selective transfer machines can be used to perform personalized facial action unit detection by minimizing person-specific and occurrence biases in the training data. It evaluates TCD on synthesized and real-world video datasets and evaluates selective transfer machines on several facial expression datasets.
IMAGE AUTHENTICATION THROUGH ZTRANSFORM WITH LOW ENERGY AND BANDWIDTH (IAZT)IJNSA Journal
In this paper a Z-transform based image authentication technique termed as IAZT has been proposed to
authenticate gray scale images. The technique uses energy efficient and low bandwidth based invisible data
embedding with a minimal computational complexity. Near about half of the bandwidth is required
compared to the traditional Z–transform while transmitting the multimedia contents such as images with
authenticating message through network. This authenticating technique may be used for copyright
protection or ownership verification. Experimental results are computed and compared with the existing
authentication techniques like Li’s method [11], SCDFT [13], Region-Based method [14] and many more
based on Mean Square Error (MSE), Peak Signal to Noise Ratio (PSNR), Image Fidelity (IF), Universal
Quality Image (UQI) and Structural Similarity Index Measurement (SSIM) which shows better performance
in IAZT.
1. The document proposes a Modified Fuzzy C-Means (MFCM) algorithm to segment brain tumors in noisy MRI images.
2. The conventional Fuzzy C-Means algorithm is sensitive to noise, so the MFCM adds an adaptive filtering step during segmentation.
3. The MFCM incorporates neighboring pixel membership values to reduce each pixel's resistance to being clustered, improving segmentation in noisy images.
TARGET DETECTION AND CLASSIFICATION IMPROVEMENTS USING CONTRAST ENHANCED 16-B...sipij
This document describes research on improving target detection and classification in infrared videos using 16-bit data and contrast enhancement techniques. Two contrast enhancement methods are explored: 1) histogram matching 16-bit videos to an 8-bit reference frame, and 2) a second order histogram matching algorithm that preserves the 16-bit nature of videos while enhancing contrast. Experimental results showed that the second method improved target detection performance using YOLO and classification performance using ResNet compared to previous 8-bit results and the first histogram matching method.
This document summarizes a research paper on tracking multiple targets using the mean shift algorithm. It begins by stating that multi-target tracking is challenging due to factors like noise, clutter, occlusions, and sudden changes in velocity. The mean shift algorithm is then introduced as a kernel-based tracking method that works by iteratively shifting target locations to their mean shifts. Targets are represented using histograms within elliptical regions. The Bhattacharyya coefficient is used to measure similarity between target models and candidates. Experimental results on a video sequence show the algorithm can accurately track targets under small displacements but performance degrades for large displacements, fast motion, or occlusions. In conclusion, the mean shift algorithm provides a simple method for multi
Comparative Study of Various Algorithms for Detection of Fades in Video Seque...theijes
In the multimedia environment, digital data has gained more importance in daily routine. Large volume of videos such as entertainment video, news video, cartoon video, sports video is accessed by masses to accomplish their different needs. In the field of video processing Shot boundary detection is current research area. Shot boundary detection has vast impact on effective browsing and retrieving, searching of video. It serves as the beginning to construct the content of videos. Video processing technology has crucial job to provide valid information from videos without loss of any information. This paper is a survey of various novel algorithm for detecting fade-in and fade-out used by renowned personals with different methods. This survey also emphasizes on different core concepts underlying the different detection schemes for the most used video transition effect: fades
This document presents a method for extracting key frames from videos using discrete wavelet transform (DWT) statistics. It begins with background on video frames, scene changes, and DWT wavelet frequency components. It then describes the proposed key frame extraction algorithm, which involves: 1) applying DWT to consecutive video frames, 2) calculating differences between detail coefficients, 3) computing mean and standard deviation of differences, 4) setting thresholds based on mean and standard deviation, and 5) identifying frames where two difference values exceed thresholds as key frames. The method is applied to sample videos and parameters are tuned to effectively detect key frames for video summarization.
This document presents a method for extracting key frames from videos using discrete wavelet transform (DWT) statistics. It begins with background on video frames, scene changes, and DWT wavelet frequency components. The proposed method extracts key frames in 4 steps: 1) applying DWT to consecutive frames and calculating differences between detail coefficients, 2) computing mean and standard deviation of differences, 3) estimating thresholds using mean and standard deviation, 4) comparing differences to thresholds to identify key frames where differences exceed thresholds. Experimental results on test videos demonstrate the method can detect key frames to represent scene changes for video summarization.
VISUAL ATTENTION BASED KEYFRAMES EXTRACTION AND VIDEO SUMMARIZATIONcscpconf
Recent developments in digital video and drastic increase of internet use have increased the
amount of people searching and watching videos online. In order to make the search of the
videos easy, Summary of the video may be provided along with each video. The video summary
provided thus should be effective so that the user would come to know the content of the video
without having to watch it fully. The summary produced should consists of the key frames that
effectively express the content and context of the video. This work suggests a method to extract
key frames which express most of the information in the video. This is achieved by quantifying
Visual attention each frame commands. Visual attention of each frame is quantified using a
descriptor called Attention quantifier. This quantification of visual attention is based on the
human attention mechanism that indicates color conspicuousness and the motion involved seek
more attention. So based on the color conspicuousness and the motion involved each frame is
given a Attention parameter. Based on the attention quantifier value the key frames are extracted and are summarized adaptively. This framework suggests a method to produces meaningful video summary.
Video indexing using shot boundary detection approach and search tracksIAEME Publication
This document summarizes a research paper that proposes a video indexing and retrieval method using shot boundary detection and audio track detection. It first extracts keypoints from divided frames to create a new frame sequence. Support vector machines are then used to match keypoints between frames to detect different types of shot transitions. Audio energy is also analyzed to detect sound tracks. The method aims to reduce computational costs by removing non-boundary frames and representing transition frames as thumbnails. It was tested on CCTV and film videos.
This document summarizes a research paper that proposes using a technique called "tiny video representation" to classify and retrieve video frames and videos. The proposed method involves preprocessing videos by splitting them into frames, removing black bars, resizing frames to 32x32 pixels, and using affinity propagation to cluster unique frames. This creates a "tiny video database" that can be used for content-based copy detection, video categorization through classification of frames, and retrieval of related videos through nearest neighbor searches. Experimental results showed the tiny video database approach improved classification precision and recall compared to using individual frames or videos.
A Novel Approach for Tracking with Implicit Video Shot DetectionIOSR Journals
1) The document presents a novel approach that combines video shot detection and object tracking using a particle filter to create an efficient tracking algorithm with implicit shot detection.
2) It uses a robust pixel difference method for shot detection that is resistant to sudden illumination changes. It then applies a particle filter for tracking that uses color histograms and Bhattacharyya distance to track objects across frames.
3) The key innovation is that the tracking algorithm is only initiated after a shot change is detected, reducing computational costs by discarding unneeded frames and triggering tracking only when needed. This provides a more efficient solution for tracking large video datasets with minimal preprocessing.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
This document describes a system for Tamil video retrieval based on categorization in the cloud. The system first categorizes Tamil videos into subcategories based on camera motion parameters. It then segments the videos into shots and extracts representative key frames from each shot based on edge and color features. These features are stored in a feature library in the cloud. When a Tamil query is submitted, the system retrieves similar videos from the cloud based on matching the query features to the stored features. The system is implemented using the Eucalyptus cloud computing platform for its flexibility and ability to handle large computational loads.
Video Compression Algorithm Based on Frame Difference Approaches ijsc
The huge usage of digital multimedia via communications, wireless communications, Internet, Intranet and cellular mobile leads to incurable growth of data flow through these Media. The researchers go deep in developing efficient techniques in these fields such as compression of data, image and video. Recently, video compression techniques and their applications in many areas (educational, agriculture, medical …) cause this field to be one of the most interested fields. Wavelet transform is an efficient method that can be used to perform an efficient compression technique. This work deals with the developing of an efficient video compression approach based on frames difference approaches that concentrated on the calculation of frame near distance (difference between frames). The
selection of the meaningful frame depends on many factors such as compression performance, frame details, frame size and near distance between frames. Three different approaches are applied for removing the lowest frame difference. In this paper, many videos are tested to insure the efficiency of this technique, in addition a good performance results has been obtained.
Event recognition image & video segmentationeSAT Journals
Abstract This paper gives a clear look at the segmentation process at the basic level. Segmentation is done at multiple levels so that we will get different results. Segmentation of relative motion descriptors gives a clear picture about the segmentation done for a given input video. Relative motion computation and histograms incrementation are used to evaluate this approach. Also here we will give complete information about the related research which is done about how segmentation can be done for the both images and videos. Keywords: Image Segmentation, Video Segmentation.
Key frame extraction for video summarization using motion activity descriptorseSAT Journals
This document presents a method for video summarization using motion activity descriptors. It extracts key frames from videos by comparing motion between consecutive frames using block matching algorithms like diamond search and three step search. These algorithms determine which blocks to compare from consecutive frames to find the closest block match and derive a motion activity descriptor. Frames with high motion descriptors, indicating more difference between frames, are selected as key frames for the video summary. The method was tested on various video categories and showed high precision and summarization for some videos but lower values for others, depending on factors like scene changes, motion detectability, and object/area properties. An effective summary balances high precision with a high summarization factor by selecting frames that best represent the video's
Key frame extraction for video summarization using motion activity descriptorseSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Recognition and tracking moving objects using moving camera in complex scenesIJCSEA Journal
1) The document proposes a method for tracking moving objects in videos captured using a moving camera in complex scenes. It involves video stabilization, key frame extraction, object detection/tracking using Gaussian mixture models and Kalman filters, and object recognition using bag of features.
2) Key frame extraction identifies important frames for processing by computing edge differences between frames and selecting frames above a threshold.
3) Moving objects are detected using background subtraction and Gaussian mixture models, and then tracked across frames using Kalman filters.
4) Object recognition is performed using bag of features, which represents objects as histograms of visual word frequencies to classify objects based on characteristic visual parts.
Key frame extraction methodology for video annotationIAEME Publication
This document summarizes a research paper that proposes a key frame extraction methodology to facilitate video annotation. The methodology uses edge difference between consecutive video frames to determine if the content has significantly changed. Frames where the edge difference exceeds a threshold are selected as key frames. The algorithm calculates edge differences for all frame pairs in a video. It then computes statistics like mean and standard deviation to determine a threshold. Frames with differences above this threshold are extracted as key frames. The key frames extracted represent important content changes in the video. Extracting key frames reduces processing requirements for video annotation compared to analyzing all frames. The methodology was tested on videos from domains like transportation and performed well at selecting representative frames.
Optimal Repeated Frame Compensation Using Efficient Video CodingIOSR Journals
1) The document proposes a new video coding standard called Optimal Repeated Frame Compensation (ORFC) which aims to improve compression efficiency. ORFC works by combining repeated frames in a video sequence into a single frame to reduce the total number of frames.
2) The method involves segmenting videos into shots and then analyzing frames within each shot to identify repeated frames. Repeated frames are combined using ORFC to extract key frames, minimizing the number of frames needed to represent the video.
3) Experimental results on test video sequences show the method achieves high compression ratios on average of 99.5% while maintaining good fidelity between 0.75 to 0.78 in extracted key frames. The results indicate OR
In this paper, we propose a novel fast video search algorithm for large video database.
Histogram of Oriented Gradients (HOG) has been reported which can be reliably applied to
object detection, especially pedestrian detection. We use HOG based features as a feature
vector of a frame image in this study. Combined with active search, a temporal pruning
algorithm, fast and robust video search can be achieved. The proposed search algorithm has
been evaluated by 6 hours of video to search for given 200 video clips which each length is 15
seconds. Experimental results show the proposed algorithm can detect the similar video clip
more accurately and robust against Gaussian noise than conventional fast video search
algorithm.
A FAST SEARCH ALGORITHM FOR LARGE VIDEO DATABASE USING HOG BASED FEATUREScscpconf
The document describes a fast video search algorithm that uses Histogram of Oriented Gradients (HOG) features. HOG features are extracted from frames to create feature vectors. These feature vectors are then quantized and histograms are generated for query and database videos. The algorithm uses an active search approach where histogram similarity is calculated and videos are skipped if dissimilar, improving search speed. The algorithm was tested on a 6 hour video database and achieved a search time of 70ms, over 6 times faster than a conventional approach, and was more robust to noise.
A FAST SEARCH ALGORITHM FOR LARGE VIDEO DATABASE USING HOG BASED FEATUREScscpconf
In this paper, we propose a novel fast video search algorithm for large video database. Histogram of Oriented Gradients (HOG) has been reported which can be reliably applied to object detection, especially pedestrian detection. We use HOG based features as a feature vector of a frame image in this study. Combined with active search, a temporal pruning algorithm, fast and robust video search can be achieved. The proposed search algorithm has been evaluated by 6 hours of video to search for given 200 video clips which each length is 15 seconds. Experimental results show the proposed algorithm can detect the similar video clip more accurately and robust against Gaussian noise than conventional fast video search algorithm.
Video Content Identification using Video Signature: SurveyIRJET Journal
This document summarizes previous research on video content identification using video signatures. It discusses three types of video signatures (spatial, temporal, and spatio-temporal) that have been used to generate unique descriptors to identify identical video scenes. The document then reviews several existing methods for video signature extraction and matching, including techniques based on ordinal signatures, motion signatures, color histograms, local descriptors using interest points, and compressed video shot matching using dominant color profiles. It concludes by proposing a new temporal signature-based method that aims to accurately detect a video segment embedded in a longer unrelated video by extracting frame-level features, generating fine and coarse signatures, and performing frame-by-frame signature matching.
VIDEO SEGMENTATION & SUMMARIZATION USING MODIFIED GENETIC ALGORITHM
International Journal on Computational Science & Applications (IJCSA) Vol.8, No.4/5, October 2018
DOI: 10.5121/ijcsa.2018.8501
VIDEO SEGMENTATION & SUMMARIZATION USING MODIFIED GENETIC ALGORITHM
H S Prashantha
Professor, Department of Electronics & Communication Engineering, Nitte Meenakshi
Institute of Technology, Bangalore, India
ABSTRACT
Video summarization of segmented video is an essential process for video thumbnails, video surveillance and video downloading. Summarization deals with extracting a few frames from each scene and creating a summary video which conveys the whole course of action of the full video within a short duration of time. The proposed research work discusses the segmentation and summarization of the frames. A genetic algorithm (GA) for segmentation and summarization is required to view the highlights of an event by selecting the few important frames required. The GA is modified to select only key frames for summarization, and the modified GA is compared with the GA.
KEYWORDS
Video segmentation, video summarization, Genetic Algorithm, video streams
1.INTRODUCTION
Segmenting multimedia data streams is a necessary requirement for several applications. Segmentation is the process of partitioning a piece of information into elementary parts termed segments. Video segmentation describes a range of different processes for partitioning a video into meaningful parts of different granularities. Properly summarized video streams can be better organized and reused after segmentation for some applications. The purpose of segmentation is to partition the video sequence into shots based on the contents, from which key frames are selected. Since adjacent frames in a video contain similar information, distinct frames need to be considered to summarize the mass of material.
There are two types of scene changes: abrupt scene change (a sudden change in the scene, i.e. the next scene starts immediately after completion of the previous scene) and gradual scene change (a change in the scene with a delay). A gradual scene change can appear in three ways: fade in, fade out and dissolve. A fade in is the change of scene from bright or dark to normal color. A fade out is the change of scene from normal color to bright or dark. A dissolve is a change in the scene which is the weighted average of both the previous and the next scene.
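A dissolve can be illustrated directly as a weighted average of the two scenes. The sketch below uses numpy arrays as frames; the function name and the toy frames are illustrative, not from the paper:

```python
import numpy as np

def dissolve(prev_frame, next_frame, alpha):
    """Blend two frames: a dissolve is a weighted average of the
    outgoing (previous) and incoming (next) scenes.

    alpha ramps from 0.0 (pure previous scene) to 1.0 (pure next scene).
    """
    return ((1.0 - alpha) * prev_frame + alpha * next_frame).astype(prev_frame.dtype)

# Toy 2x2 grayscale frames: a dark outgoing scene and a bright incoming one.
prev_frame = np.full((2, 2), 40, dtype=np.uint8)
next_frame = np.full((2, 2), 200, dtype=np.uint8)

mid = dissolve(prev_frame, next_frame, 0.5)  # halfway through the transition
print(mid[0, 0])  # 120: the midpoint between 40 and 200
```

A fade in or fade out is the special case where one of the two scenes is a constant bright or dark frame.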
The classification of the video segmentation algorithms includes pixel comparison, block based
comparison, histogram comparison, feature based comparison, clustering based temporal
segmentation and model driven video segmentation. The different algorithms for segmentation of
the video are studied. Temporal video segmentation for real-time key frame extraction is
discussed (Calic, J 1993). A temporal video segmentation method proposed is based on the
detection of shot abrupt transition and gradual transition, and then takes into account the
conditions of user terminals, which could generate different video summarization for each user
(Chen Yinzi 2010). Implementation of image segmentation using GA involves defining a fitness
evaluation function, designing a ‘Population’ (set of chromosomes), defining genetically inspired
operators such as crossover and mutation to evolve new population and deciding the termination
of the evolutionary search for the optimal solution (Aravind.I 2002). The genetic segmentation
algorithm for video is defined and the evolutionary nature of genetic algorithms offers an
advantage by enabling incremental segmentation (Patrick Chiu 2000). In the computational
process, the improved GA adjusts crossover probability and mutation probability automatically
according to the variance between the target and background, thus overcoming the problems of
Simple Genetic Algorithm ( Lei Hui 2008) (Bir Bhanu 1995). A new approach based on GA is
proposed for selection of threshold from the histogram of images (P. Kanungo 2006) (Chi-Chun
Lo 2003). The literature documents the ability of the GA for video segmentation. The GA imitates the principles of biological evolution, and it is a more suitable method for segmentation and summarization compared to other traditional methods. The algorithm makes use of a global search and optimization method for video segmentation and summarization. The GA is a search algorithm based on the mechanics of natural selection and natural genetics. The algorithm is probabilistic and iterative. The GA transforms a set (called a population) of mathematical objects (typically fixed-length binary character strings), each with an associated fitness value, into a new population of offspring objects using the Darwinian principle of natural selection and operations that are patterned after naturally occurring genetic operations, such as crossover and mutation. The genetic algorithm works by randomly selecting pairs of individual chromosomes to reproduce for the next generation. The selection of a chromosome is mainly based on its fitness function value relative to the other chromosomes in the same generation. Usually chromosomes are randomly split and merged, with the consequence that some genes of a child come from one parent while others come from the other parent. This mechanism is called crossover.
A simple GA consists of the following five steps.
1. Start with a randomly generated population of N chromosomes, where N is the size of the population and l is the length of each chromosome x.
2. Calculate the fitness value φ(x) of each chromosome x in the population.
3. Repeat until N offspring are created:
3.1 Probabilistically select a pair of chromosomes from the current population using the value of the fitness function.
3.2 Produce an offspring using the crossover and mutation operators.
4. Replace the current population with the newly created one.
5. Go to step 2.
In the case of a simple GA, the whole population is formed of strings having the same length. These strings contain encoded information.
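The five steps can be sketched in code. The following is a minimal illustrative implementation, not the paper's actual program: it maximizes a toy fitness (the number of 1 bits), uses roulette-wheel selection, one-point crossover and bit-flip mutation, and all names and parameter values are assumptions:

```python
import random

def simple_ga(fitness, l=12, N=20, generations=40, p_mut=0.01, seed=1):
    """Minimal sketch of the five-step simple GA described above.

    fitness: maps a chromosome (list of 0/1 bits) to a non-negative number.
    l: chromosome length, N: population size.
    """
    rng = random.Random(seed)
    # Step 1: start with a randomly generated population of N chromosomes.
    pop = [[rng.randint(0, 1) for _ in range(l)] for _ in range(N)]
    for _ in range(generations):
        # Step 2: calculate the fitness value of each chromosome.
        scores = [fitness(x) for x in pop]
        weights = [s + 1e-9 for s in scores]  # guard against all-zero weights
        # Step 3: repeat until N offspring are created.
        offspring = []
        while len(offspring) < N:
            # 3.1 probabilistically select a pair using the fitness values.
            a, b = rng.choices(pop, weights=weights, k=2)
            # 3.2 produce an offspring via one-point crossover and mutation.
            cut = rng.randrange(1, l)
            child = [bit ^ (rng.random() < p_mut) for bit in a[:cut] + b[cut:]]
            offspring.append(child)
        # Step 4: replace the current population; Step 5: go back to step 2.
        pop = offspring
    return max(pop, key=fitness)

# Toy fitness: number of 1 bits; the GA should evolve mostly-ones strings.
best = simple_ga(fitness=sum)
print(len(best), sum(best))
```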
2.METHODOLOGY USED FOR VIDEO SEGMENTATION
The reason for using GA is its ability to deal with large complex search spaces in situations where
only minimum knowledge is available about the objective functions. The corresponding search
space in many situations is quite large and there are complex interactions among parameters.
Most video segmentation algorithms have many parameters that need to be adjusted for effective segmentation. One parameter that can be considered is the fitness function, which indicates the quality of an individual. The GA involves defining a fitness evaluation function, designing a population (a set of chromosomes), defining genetically inspired operators such as crossover and mutation to evolve a new population, and deciding the termination of the evolutionary search for the optimal solution.
Figure 1: Block diagram of Video Segmentation and Summarization using proposed approach
For the segmentation of video, an efficient and optimal idea is to reduce the number of frames. A video can have thousands of frames, and many adjacent frames are generally similar, to provide the illusion of continuity. Since adjacent frames are similar, sub-sampling is done at a lower rate to obtain a few frames out of many thousands. The selection of the sampling factor depends on the type of scene changes: for gradual scene changes, choose a higher value of the sampling factor, and for abrupt scene changes choose a lower value. Obtain the least similar images by measuring their difference with standard color-histogram techniques. The number of frames depends on the input video to be segmented. The video to be segmented can have either abrupt scene changes or gradual scene changes such as dissolve, fade-in and fade-out. The user can choose the number of segments required for segmentation of the video. Summarization is carried out for the length specified by the user. The selection function selects the best individuals so that the crossover and mutation operations become more effective. The
research work proposes to modify the given genetic algorithm shown in figure 1 to make it
suitable for the given application of video segmentation and summarization.
Let F be the input video to be segmented for the given application. The input video is down-sampled to obtain a few frames, and F1 represents the down-sampled video. By the histogram-difference procedure, obtain the video frames which are dissimilar and neglect the frames which are similar; F11 represents the video obtained after discarding frames. Apply the GA by defining the encoding function and choosing initial candidates to calculate the fitness function. The fitness function of the GA is given by

φ(x) = Σ α(i, j) h(i, j)

where α(i, j) is a function for weighting the histogram differences h(i, j).

The GA is modified by changing the fitness function, which is given by

φ'(x) = Σ dh(i), summed over positions where dh(i) > k1

where h(i, i−1), or dh(i), is the histogram difference between the i-th and (i−1)-th frames, and k1 is a threshold whose value the user can choose. The experiments are conducted for different values of k1, and the result is discussed by selecting k1 = 5. The proposed modified genetic algorithm works as follows.
1. If there is a scene change, the segmentation is done by encoding it as '1'.
2. If there is a scene change at the i-th frame, obviously the (i−1)-th frame will be different.
3. Hence, the histogram difference between them may be high.
4. Perform the crossover function to complete the GA.
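Since the paper's fitness equations did not survive extraction intact, the following sketch only illustrates the idea described in the text: the modified fitness rewards boundary positions whose adjacent-frame histogram difference dh(i) exceeds a user-chosen threshold k1. The distance measure (L1), the function names and the exact combination are assumptions:

```python
def adjacent_hist_diffs(histograms):
    """dh(i): histogram difference between the i-th and (i-1)-th frames.

    histograms: list of per-frame histograms (lists of bin counts).
    Uses a simple L1 (sum of absolute bin differences) distance.
    """
    return [
        sum(abs(a - b) for a, b in zip(histograms[i - 1], histograms[i]))
        for i in range(1, len(histograms))
    ]

def modified_fitness(chromosome, dh, k1=5):
    """Illustrative modified fitness: sum dh(i) over the positions the
    chromosome marks as boundaries ('1'), counting only differences
    that exceed the user-chosen threshold k1.
    """
    return sum(
        dh[i] for i, bit in enumerate(chromosome) if bit == 1 and dh[i] > k1
    )

# Toy example: 5 frames with 2-bin histograms; a clear change at frame 3.
hists = [[10, 0], [10, 0], [9, 1], [0, 10], [0, 10]]
dh = adjacent_hist_diffs(hists)            # [0, 2, 18, 0]
print(modified_fitness([0, 0, 1, 0], dh))  # 18: only the real boundary counts
```

A chromosome that places a '1' on a genuine scene change scores highly, while '1' bits on similar adjacent frames contribute nothing, which is what drives the modified GA toward non-repetitive key frames.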
3.IMPLEMENTATION DETAILS
Table 1: Input video considered for video segmentation & summarization (YUV video)
Video name No of frames Video name No of frames
Akiyo 300 Frames Carphone 382 Frames
Flower 250 Frames Claire 494 Frames
Bridge (Close) 2000 Frames Stefan 90 Frames
Bridge (Far) 210 Frames Suzie 150 Frames
Coastguard 300 Frames Tempete 260 Frames
Hall Monitor 300 Frames Grandma 870 Frames
Bus 150 Frames Mobile 300 Frames
Container 300 Frames Silent 300 Frames
Foreman 300 Frames Waterfall 260 Frames
Highway 2000 Frames News 300 Frames
Miss America 150 Frames Paris 1065 Frames
Mother and Daughter 300 Frames Salesman 449 Frames
The experiments are conducted for different video clips with different types of scene changes, namely abrupt and gradual scene changes. The different videos considered for the experimentation are shown in table 1. Different sampling factors and numbers of segments are considered for the experimentation, and a sample result is displayed for the discussion.
4.EXPERIMENTAL RESULTS AND DISCUSSIONS
Experiments are conducted for the different videos shown in table 1 for different values of the sampling factor, namely 2, 3, 4, 5, 6, 7, 8, 9 and 10, for sub-sampling. The events considered for the experimentation consist of different scene changes, such as movie scenes, television news, sports etc., where the scene changes may be abrupt or gradual. The experiments are also conducted for different numbers of segments, namely 2, 3, 4, 5, 6, 7, 8, 9 and 10. The summarization is carried out by considering different numbers of frames for summarization. Experiments are conducted for various values of summarization such as 2, 3, 4, 5, 6, 7, 8, 9 and 10.
Input video considered: Hall Monitor
Number of frames: 300
Frame data set F = {f1, f2, ..., f300}
Sampling factor = 10
Number of segments = 8
The pre-processing step for segmentation is done using sub-sampling to obtain the frame data set F1, given by
F1 = {f1, f11, f21, f31, f41, f51, f61, f71, f81, f91, f101, f111, f121, f131, f141, f151, f161, f171, f181, f191, f201, f211, f231, f241, f251, f261, f271, f281, f291}
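The sub-sampling step amounts to taking every tenth frame of the 300-frame sequence, which can be sketched as straightforward slicing (the function name is illustrative):

```python
def subsample(num_frames, factor):
    """Down-sample a video of `num_frames` frames: keep every
    `factor`-th frame, using 1-based frame numbers starting at frame 1.
    """
    return list(range(1, num_frames + 1, factor))

f1 = subsample(300, 10)
print(f1[:5])   # [1, 11, 21, 31, 41]
print(len(f1))  # 30
```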
The least similar images are considered by measuring their difference with the standard technique of color histograms. On the reduced data set F1, define the length of an element, which is the number of frames in F from that element to the next element in F1. On F1, also define the histogram difference between the (i−1)-th and i-th elements of F1. Obtain the data set of frames by considering only those frames which are not similar, as determined by the histogram difference. The resulting data set is
F11 = {f1, f11, f31, f71, f81, f101, f141, f151, f211, f221, f231, f251, f261, f271}
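The histogram-difference filtering that discards similar frames can be sketched as follows. This is an illustrative reconstruction, not the paper's code: the histogram layout (16 bins per color channel), the L1 distance, the keep-against-last-kept strategy and the threshold value are all assumptions:

```python
import numpy as np

def color_histogram(frame, bins=16):
    """Per-channel color histogram of an HxWx3 uint8 frame, concatenated."""
    return np.concatenate(
        [np.histogram(frame[..., c], bins=bins, range=(0, 256))[0]
         for c in range(3)]
    )

def discard_similar(frames, threshold):
    """Keep a frame only if its histogram differs enough (L1 distance)
    from the last frame that was kept; the first frame is always kept.
    Returns the indices of the kept frames within `frames`.
    """
    kept = [0]
    last_hist = color_histogram(frames[0])
    for i in range(1, len(frames)):
        hist = color_histogram(frames[i])
        if np.abs(hist - last_hist).sum() > threshold:
            kept.append(i)
            last_hist = hist
    return kept

# Toy example: three dark frames followed by three bright frames; only the
# first frame and the dark-to-bright jump should survive the filter.
rng = np.random.default_rng(0)
dark = [rng.integers(0, 60, (8, 8, 3), dtype=np.uint8) for _ in range(3)]
bright = [rng.integers(200, 256, (8, 8, 3), dtype=np.uint8) for _ in range(3)]
print(discard_similar(dark + bright, threshold=150))
```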
The histogram gives an estimate of the difference between two frames. Two unlike frames with similar histograms give rise to a false estimate of the frames. The data set F11 may not segment the output data set properly if all combinations of histogram differences between pairs of frames of F11 are considered, since the most salient images may be similar to each other and too much repetition may occur in the new data set obtained from the histogram differences. In order to avoid the computational complexity involved in formulating the histogram database by considering each frame distinctly, the GA may be applied to achieve the reduction in the data set. The GA is a more methodical evaluation for the segmentation and summarization of the frames.
The GA starts by using the initialization procedure to generate the first population. The members of the population are usually strings of symbols (chromosomes) that represent possible solutions to the problem to be solved. To take into consideration the relative differences among all the selected images, we define a similarity adjacency function. Genetic
segmentation algorithm is applied to obtain key frames. The genetic algorithm can be described by specifying the encoding, fitness function and crossover operations. For the encoding of the genetic algorithm, we take a string of 0s and 1s called a chromosome. The bit position of a chromosome string is an index for an element of the image data stream. Using 1 to denote the boundaries, the desired segments can be obtained. The genetic mechanism works by randomly selecting pairs of individual chromosomes to reproduce for the next generation. The probabilities of a chromosome being selected are proportional to its fitness function value relative to the others
in the same generation. The fitness function depends on the histogram difference, and the chromosome structure depends on the maximum histogram difference. A maximum histogram difference is considered for summarization and denoted as '1', whereas '0' indicates a minimum histogram difference. Since the number of segments is 8, the chromosomes which indicate the segment boundaries contain 8 ones and the remaining positions are filled with zeros. The structure of the chromosome follows the order of F11 and is given by
{1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0}
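The chromosome encoding can be made concrete: each bit position indexes one frame of the 14-frame data set obtained after the histogram-difference step, and a '1' marks a frame retained as a segment boundary. A minimal decoding sketch (the function name is illustrative):

```python
# The 14 frames kept after the histogram-difference step for the
# Hall Monitor example (frame numbers within the original video).
F11 = [1, 11, 31, 71, 81, 101, 141, 151, 211, 221, 231, 251, 261, 271]

def decode(chromosome, frames):
    """Each bit position indexes one element of the frame data stream;
    a '1' marks a segment boundary / key frame to retain."""
    return [f for f, bit in zip(frames, chromosome) if bit == 1]

ga_chromosome = [1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0]
print(decode(ga_chromosome, F11))
# [1, 11, 31, 81, 151, 211, 251, 261] -- the 8 summary frames of figure 2
```

Applying the same decoding to the modified GA's chromosome {1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0} yields frames 1, 31, 81, 101, 211, 221, 251 and 261, the summary shown in figure 3.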
A similarity adjacency function is defined to obtain a better segmentation result, and the genetic algorithm is applied to obtain a new data set of frames. To reproduce, a crossover procedure is defined. The proposed application of segmentation does not use the mutation operation, since it makes the segments unstable. The data set F111 is obtained by using the chromosome patterns and F11. Eliminate all 0's and retain the 1's, as 0's correspond to similar frame information. Grouping is done to obtain the data set F111, and the result obtained for the input video Hall Monitor is given by
F111 = {f1, f11, f31, f81, f151, f211, f251, f261}
The modification of the genetic algorithm is done by modifying the fitness function which gives
better results than the genetic algorithm. The chromosome of the modified GA is given by
{1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0}
Figure 2: Summary frames of segmented video with number of segments = 8 using GA (frame numbers 1, 11, 31, 81, 151, 211, 251 and 261)
Figure 3: Summary frames of segmented video with number of segments = 8 using the proposed approach (modified GA) (frame numbers 1, 31, 81, 101, 211, 221, 251 and 261)
Table 2: Modified GA comparison with GA for different sampling factors
The modified algorithm removes the frames f11 and f151, which contain almost the same information as that present in frames f1 and f81. But frames f101
and f221, which are more significant for the given application, are summarized, as shown in figure 3. The results show that the modification in the genetic algorithm has helped to improve the efficiency of the GA for the input video clip. Some frames with almost similar information content are repeated by the GA, and the modified algorithm removes those frames to a great extent.
A similar procedure is conducted for different values of the sampling factor, and table 2 shows the results obtained for various sampling factors. The experimental results in table 2 show that the result obtained using the modified GA is better than that of the genetic algorithm in summarizing some of the key frames. The experiments are conducted for different target segments for the video Hall Monitor using the GA and the modified GA with a sampling factor of 10. Table 3 shows the result obtained for the given input video Hall Monitor.
Table 3: Modified GA comparison with GA for different target segments
It is seen that if the number of target segments is large, the modified GA result is almost the same as that of the GA. Since most applications need to segment only a few frames, the modified GA performs better than the GA.
Table 4: Profiling result for scaling factor of 10
The experiments are conducted for the different video clips, and the profiling results obtained are shown in table 4. The profiling result gives an estimate of the clock cycles consumed and the number of times each function is called. Clock cycle 2 is the number of clock cycles spent in a function excluding the clock cycles spent in its child functions; it includes overhead resulting from the process of profiling. The profiling results obtained indicate that the number of clock cycles required to segment and summarize the frames using the modified GA is almost the same as for the GA. Hence, the profiling result in table 4 indicates that the modified GA does not increase the computational complexity. The modified GA performs better than the GA in segmenting and summarizing the frames without increasing the computational complexity.
5.SUMMARY & CONCLUSIONS
The number of segments can be defined by the user, and whenever the user wants to change the number of segments, there is no need to change the program; only the number of segments needs to be changed. For summarization as well, the number of frames can be defined by the user and changed whenever needed. The genetic algorithm used for video segmentation and summarization is useful because it is a random search and optimization method which estimates the solution, and the algorithm approaches the global optimum rather than a local optimum. From the results obtained, it is noted that the modified GA works better than the GA in summarizing the important frames, which are non-repetitive.
6.REFERENCES
[1] Calic, J., et al. (1993), "Temporal video segmentation for real-time key frame extraction", IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP-93, Volume 4.
[2] Chen Yinzi, et al. (2010), "A Temporal Video Segmentation and Summary Generation Method Based on Shots' Abrupt and Gradual Transition Boundary Detecting", Second International Conference on Communication Software and Networks.
[3] Aravind, I., et al. (2002), "Implementation of Image Segmentation and Reconstruction Using Genetic Algorithms", IEEE ICIT'02, Bangkok, Thailand.
[4] Patrick Chiu, et al. (2000), "A genetic algorithm for video segmentation and summarization", IEEE International Conference on Multimedia and Expo, ICME 2000, Proceedings, Vol. 3, pp. 1329-133.
[5] Lei Hui, et al. (2008), "Application of an Improved Genetic Algorithm in Image Segmentation", International Conference on Computer Science and Software Engineering.
[6] Bir Bhanu, et al. (1995), "Adaptive Image Segmentation Using a Genetic Algorithm", IEEE Transactions on Systems, Man, and Cybernetics, Vol. 25, No. 12, pp. 1543-1567.
[7] Chi-Chun Lo, et al. (2001), "Video segmentation using a histogram-based fuzzy c-means clustering algorithm", Computer Standards & Interfaces 23, pp. 429-438.
[8] P. Kanungo, et al. (2006), "Parallel Genetic Algorithm Based Thresholding for Image Segmentation", Proceedings of the National Seminar on IT and Soft Computing, ITSC06, India.
[9] Chi-Chun Lo, et al. (2003), "A histogram-based moment-preserving clustering algorithm for video segmentation", Pattern Recognition Letters 24, pp. 2209-2218.
Author
Dr. Prasantha H. S. received his Bachelor degree from Bangalore University, Master degree from V.T.U, Belgaum, and Ph.D. from Anna University, Chennai, in the area of Multimedia and Image Processing. He has 19+ years of teaching and research experience. His research interests include Multimedia and Signal Processing, and he has published over 30 papers in conferences and journals. He is currently guiding students for their research programs under VTU and other universities. Currently, he is working as a Professor in the Department of Electronics and Communication Engineering, Nitte Meenakshi Institute of Technology, Bangalore.