With today’s digital revolution, many people communicate and collaborate in cyberspace. Users rely on social media platforms such as Facebook, YouTube, and Twitter, all of which exert a considerable impact on human lives. In particular, watching videos has become preferable to simply browsing the internet for many reasons. However, difficulties arise when searching accurately for specific videos within the same domains, such as entertainment, politics, education, and TV shows. This problem can be addressed through web video categorization (WVC) approaches that utilize video textual information, visual features, or audio features. However, retrieving videos with similar content at high accuracy remains challenging. Therefore, this paper proposes a novel model for enhancing WVC based on user comments and weighted features from video descriptions. Specifically, the model uses supervised learning with machine learning classifiers (MLCs) and deep learning (DL) models. Two experiments are conducted on the proposed balanced dataset using the two proposed algorithms over four classes, namely education, politics, health, and sports. The model achieves high accuracy rates of 97% and 99% using MLCs and DL models based on an artificial neural network (ANN) and long short-term memory (LSTM), respectively.
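The abstract above categorizes videos from their user comments. As a minimal illustration of comment-based categorization over the same four classes, the sketch below scores each class by keyword overlap; the keyword lists and the scoring rule are invented for illustration, whereas the paper itself learns weighted features with MLCs and DL models.

```python
from collections import Counter

# Hypothetical keyword lists per class; the paper's actual features are
# learned from comments and weighted video-description terms, not hand-picked.
CLASS_KEYWORDS = {
    "education": {"lecture", "course", "exam", "teacher"},
    "politics":  {"election", "policy", "senate", "vote"},
    "health":    {"doctor", "diet", "vaccine", "symptom"},
    "sports":    {"match", "goal", "league", "coach"},
}

def categorize(comments):
    """Score each class by keyword overlap with the user comments."""
    tokens = Counter(w.lower().strip(".,!?") for c in comments for w in c.split())
    scores = {cls: sum(tokens[w] for w in kws) for cls, kws in CLASS_KEYWORDS.items()}
    return max(scores, key=scores.get)

print(categorize(["Great lecture, the teacher explains the exam topics well"]))
# education
```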
Video content analysis and retrieval system using video storytelling and inde... (IJECEIAES)
Videos are often used for communicating ideas, concepts, experiences, and situations because of the significant advances made in video communication technology, and social media platforms have expanded video usage rapidly. At present, a video is recognized using metadata such as its title, description, and thumbnail. In some situations, a searcher requires only a video clip on a specific topic from a long video. This paper proposes a novel methodology for analyzing video content and using video storytelling and indexing techniques to retrieve the intended clip from a long-duration video. The video storytelling technique is used to analyze the video content and produce a description of the video. The description thus created is used to prepare an index using the wormhole algorithm, guaranteeing the search of a keyword of definite length L within the minimum worst-case time. The video index can then be used by a video searching algorithm to retrieve the relevant part of the video based on the frequency of the word in the keyword search of the index. Instead of downloading and transferring a whole video, the user can download or transfer only the necessary clip, considerably easing the network constraints associated with video transfer.
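The clip-retrieval idea above rests on indexing the generated description by keyword and mapping hits back to timestamps. A toy stand-in for that index is sketched below; the timed-description format and function names are assumptions, and the paper's actual wormhole algorithm gives worst-case search guarantees this plain dictionary does not.

```python
from collections import defaultdict

# Toy stand-in for the paper's index: map each word of the generated video
# description to the clip start times (in seconds) where it occurs.
def build_index(timed_description):
    """timed_description: list of (start_sec, sentence) pairs."""
    index = defaultdict(list)
    for start_sec, sentence in timed_description:
        for word in sentence.lower().split():
            index[word].append(start_sec)
    return index

desc = [(0, "a lecturer introduces sorting"), (95, "quicksort partition example")]
index = build_index(desc)
print(index["quicksort"])  # [95] -> jump straight to that clip
```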
Content based video retrieval using discrete cosine transform (nooriasukmaningtyas)
A content-based video retrieval (CBVR) framework is built in this paper. One of the essential features of the video retrieval process and CBVR is color value. The discrete cosine transform (DCT) is used to extract the features of a query video and compare them with the video features stored in our database. An average result of 0.6475 was obtained using the DCT after applying it to the database we created and collected, across all categories. The technique was evaluated on our video database of 100 videos, with 5 videos in each category.
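To make the DCT-based matching concrete, the sketch below computes a naive 1-D DCT-II of a row of pixel values, keeps a few low-frequency coefficients as a compact signature, and compares query and database signatures by Euclidean distance. This is a generic illustration of DCT feature matching, not the paper's exact pipeline, and the sample values are made up.

```python
import math

def dct2(signal):
    """Naive DCT-II; production systems use optimized transforms instead."""
    N = len(signal)
    return [sum(x * math.cos(math.pi * k * (2 * n + 1) / (2 * N))
                for n, x in enumerate(signal)) for k in range(N)]

def distance(a, b):
    """Euclidean distance between two coefficient signatures."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

frame_a = [10, 12, 11, 13, 50, 52, 51, 53]   # hypothetical row of pixel values
frame_b = [10, 12, 11, 14, 50, 52, 51, 52]   # nearly identical frame
sig_a, sig_b = dct2(frame_a)[:4], dct2(frame_b)[:4]  # low-frequency signature
print(distance(sig_a, sig_b) < 5.0)  # similar frames -> small distance -> True
```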
On-Demand Video Tagging, Annotation, and Segmentation in Lecture Recordings t... (IJITE)
The COVID-19 pandemic has forced much of the academic world to transition to online operations and online learning. Interactions between teachers and students are carried out via online video conferencing software where possible. All video conferencing software available today is designed for general usage rather than for classroom teaching and learning. In this study, we analyzed the features and effectiveness of more than a dozen major video conferencing applications that are being used to replace physical face-to-face learning experiences. While some of them offer a pause feature, none allow annotation and segmentation of the recording. We propose tagging and annotation during the live stream to improve direct access to any portion of the recorded video, as well as automatic segmentation of the video based on the tags so that each clip is short, targeted, and easily identified.
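The tag-based segmentation proposed above can be sketched as a simple cut at each live tag's timestamp; the tag format and function name below are assumptions for illustration, not the study's implementation.

```python
# Sketch: cut a lecture recording into labeled clips at the instructor's
# live tags. Each tag is an assumed (timestamp_sec, label) pair.
def segment_by_tags(duration_sec, tags):
    """Return (start_sec, end_sec, label) clips, one per tag."""
    tags = sorted(tags)
    clips = []
    for i, (start, label) in enumerate(tags):
        # Each clip runs until the next tag, or the end of the recording.
        end = tags[i + 1][0] if i + 1 < len(tags) else duration_sec
        clips.append((start, end, label))
    return clips

print(segment_by_tags(3600, [(0, "intro"), (600, "recursion"), (2400, "q&a")]))
# [(0, 600, 'intro'), (600, 2400, 'recursion'), (2400, 3600, 'q&a')]
```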
consumers use a smartphone device to display the media contents for work and entertainment purposes, as well as watching online video. Online video streaming is the main cause that consume smartphone’s energy quickly. To overcome this problem, smartphone’s energy management is crucial. Thus, a hybrid energy-aware profiler is proposed. Basically, a profiler will monitor and manage the energy consumption in the smartphone devices. The hybrid energy-aware profiler will set up a protocol preference of both the user and the device. Then, it will estimates the energy consumption in smartphone. However, saving energy alone can contribute to the Quality of Experience (QoE) neglection, thus the proposed solution takes into account the client QoE. Even though there are several existing energy-aware profilers that have been developed to manage energy use in smartphones however, most energy-aware profilers does not consider QoE at the same time. The proposed solution consider both, the performance of the hybrid energy-aware profiler is compared with the baseline energy models against a variation of content adaptation according to the pre-defined variables. Three types of variables were determined; resolution, frame rate and energy consumption in smartphone devices. In this area, QoE subjective methods based on MOS (Mean Opinion Score) are the most commonly used approaches for defining and quantifying real video quality. Nevertheless, although these approaches have been established to consistently quantify users’ amounts of approval, they do not adequately realize which the criteria of video attribute that important are. In this paper, we conducted an experiment with a certain devices to measures user’s QoE and energy usage of video attribute in smartphone devices. Our results demonstrate that the list of possible solution is a relevant and useful video attribute that satify the users.
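The profiler above estimates energy consumption as a function of the studied variables (resolution and frame rate). A purely illustrative linear model of that estimate is sketched below; the coefficients and function name are invented, not measured values from the paper.

```python
# Illustrative-only linear energy model: all coefficients are assumed,
# not taken from the paper's measurements.
def estimated_energy_mw(width, height, fps, base_mw=300.0,
                        per_megapixel_mw=120.0, per_fps_mw=4.0):
    """Rough playback-power estimate growing with resolution and frame rate."""
    megapixels = width * height / 1e6
    return base_mw + per_megapixel_mw * megapixels + per_fps_mw * fps

for w, h, fps in [(640, 360, 24), (1280, 720, 30), (1920, 1080, 60)]:
    print(f"{w}x{h}@{fps}: {estimated_energy_mw(w, h, fps):.0f} mW")
```

A content-adaptation step would then pick the highest resolution/frame-rate pair whose estimate fits the energy budget while keeping the MOS-measured QoE acceptable.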
Video Data Visualization System: Semantic Classification and Personalization (ijcga)
We present in this paper an intelligent video data visualization tool, based on semantic classification, for retrieving and exploring a large-scale corpus of videos. Our work is based on semantic classification resulting from semantic analysis of the videos, and the obtained classes are projected into the visualization space. The result is represented as a graph of nodes and edges: the nodes are the keyframes of the video documents, and the edges are the relations between documents and the classes of documents. Finally, we construct the user's profile, based on interaction with the system, to adapt the system to the user's preferences.
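The keyframe/class graph described above can be sketched with plain data structures: one node per video keyframe, one node per semantic class, and an edge linking each document to its class. All names below are made up for illustration.

```python
# Minimal sketch of the visualization graph: keyframe nodes, class nodes,
# and document-to-class edges (video names and classes are hypothetical).
videos = {"doc1.mp4": "sports", "doc2.mp4": "news", "doc3.mp4": "sports"}

nodes = [{"id": doc, "type": "keyframe"} for doc in videos]
nodes += [{"id": cls, "type": "class"} for cls in set(videos.values())]
edges = [(doc, cls) for doc, cls in videos.items()]

print(len(nodes), len(edges))  # 5 3 -> 3 keyframes + 2 classes, 3 edges
```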
The ubiquitous and connected nature of camera-equipped mobile devices has greatly increased the value and importance of the visual information they capture. Today, videos uploaded from camera phones by unknown users appear on news networks, and banking customers expect to be able to deposit checks using mobile devices. In this paper we present Movee, a system that addresses the fundamental question of whether the visual stream uploaded by a user has been captured live on a mobile device and has not been tampered with by an adversary. Movee leverages the mobile device's motion sensors and the user's inherent movements during the shooting of the video, exploiting the observation that the movement of the scene recorded in the video stream should be related to the movement of the device simultaneously captured by the accelerometer. We model the distribution of the correlation of temporal noise residue in a forged video as a Gaussian mixture model (GMM) and propose a two-step scheme to estimate the model parameters. A Bayesian classifier is then used to find the optimal threshold value based on the estimated parameters. Cyrus Deboo, Shubham Kshatriya, Rajat Bhat, "Video Liveness Verification", International Journal of Trend in Scientific Research and Development (IJTSRD), ISSN 2456-6470, Volume 2, Issue 3, April 2018. URL: http://www.ijtsrd.com/papers/ijtsrd12772.pdf http://www.ijtsrd.com/computer-science/other/12772/video-liveness-verification/cyrus-deboo
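The Bayesian thresholding step above can be illustrated with two 1-D Gaussian components, one for authentic and one for forged correlation values: classify by whichever posterior is larger. The component parameters below are assumed for illustration; the paper estimates them from the video itself via its two-step GMM scheme.

```python
import math

def gauss_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Assumed (mean, std) of the noise-residue correlation for each class.
AUTHENTIC = (0.8, 0.1)
FORGED    = (0.2, 0.1)

def is_forged(correlation, prior_forged=0.5):
    """Bayes decision: pick the class with the larger posterior."""
    p_auth = gauss_pdf(correlation, *AUTHENTIC) * (1 - prior_forged)
    p_forg = gauss_pdf(correlation, *FORGED) * prior_forged
    return p_forg > p_auth

print(is_forged(0.25), is_forged(0.75))  # True False
```

With equal priors and equal variances the decision boundary sits at the midpoint of the two means, which is the "optimal threshold" the classifier recovers.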
Automatic video censoring system using deep learning (IJECEIAES)
Due to the extensive use of video-sharing platforms and services, the amount of all kinds of content on the web has become massive. This abundance of information makes it a problem to control the kind of content that may be present in a video. Beyond telling whether the content is suitable for children and sensitive viewers, it is also important to figure out which parts of it contain such content, so as to preserve parts that would be discarded by a simple broad analysis. To tackle this problem, popular image deep learning models (MobileNetV2, Xception, InceptionV3, VGG16, VGG19, ResNet101, and ResNet50) were compared to find the one most suitable for the required application. A system was then developed that automatically censors inappropriate content, such as violent scenes, with the help of deep learning, using transfer learning with the VGG16 model. The experiments suggested that the model shows excellent performance for the automatic censoring application and could also be used in other similar applications.
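Locating which parts of a video to censor, rather than judging the whole video, amounts to post-processing per-frame classifier scores into time segments. The sketch below assumes per-frame "inappropriate" probabilities already produced by a classifier (such as the fine-tuned VGG16 above) and marks contiguous runs above a threshold; the function name and threshold are illustrative.

```python
# Post-processing sketch: given per-frame probabilities from an (assumed)
# classifier, mark contiguous runs above a threshold as (start_s, end_s) to censor.
def censor_segments(frame_probs, fps=30, threshold=0.5):
    segments, start = [], None
    for i, p in enumerate(frame_probs):
        if p >= threshold and start is None:
            start = i                      # segment opens at this frame
        elif p < threshold and start is not None:
            segments.append((start / fps, i / fps))
            start = None                   # segment closes before this frame
    if start is not None:                  # segment still open at end of video
        segments.append((start / fps, len(frame_probs) / fps))
    return segments

probs = [0.1, 0.2, 0.9, 0.8, 0.7, 0.1, 0.1]
print(censor_segments(probs, fps=1))  # [(2.0, 5.0)]
```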
A Survey on Multimedia Content Protection Mechanisms (IJECEIAES)
Cloud computing has emerged to let multimedia content providers like Disney render their multimedia services. When content providers use the public cloud, pirated copies may appear, leading to a loss in revenue. At the same time, technological advancements in content recording and hosting have made it easy to duplicate genuine multimedia objects. This problem has grown with the increased use of cloud platforms for rendering multimedia content to users across the globe. It is therefore essential to have mechanisms to detect video copies, discover copyright infringement of multimedia content, and protect the interests of genuine content providers. This is a challenging and computationally expensive problem, considering the exponential growth of multimedia content over the internet. In this paper, we survey multimedia content protection mechanisms, covering different kinds of multimedia, multimedia content modification methods, and techniques to protect intellectual property from abuse and copyright infringement. The survey also focuses on the challenges involved in protecting multimedia content and the research gaps in the area of cloud-based multimedia content protection.
System analysis and design for multimedia retrieval systems (ijma)
Due to the extensive use of information technology and recent developments in multimedia systems, the amount of multimedia data available to users has increased exponentially. Video is an example of multimedia data, as it contains several kinds of data such as text, images, metadata, and visual and audio streams. Content-based video retrieval is an approach for facilitating the searching and browsing of large multimedia collections over the WWW. To create an effective video retrieval system, visual perception must be taken into account. We conjectured that a technique employing multiple features for indexing and retrieval would be more effective in video discrimination and search tasks. To validate this, content-based indexing and retrieval systems were implemented using color histograms, a texture feature (GLCM), edge density, and motion.
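Of the features listed above, the color histogram is the simplest to sketch: quantize each RGB channel into a few bins and count pixels per bin, giving a fixed-length signature that can be compared across frames. The bin count and sample pixels below are illustrative choices, not the paper's configuration.

```python
# Toy color-histogram signature: quantize each 8-bit channel into 4 bins.
def color_histogram(pixels, bins=4):
    """pixels: list of (r, g, b) tuples; returns a normalized bins^3 histogram."""
    hist = [0] * (bins ** 3)
    for r, g, b in pixels:
        idx = (r * bins // 256) * bins * bins + (g * bins // 256) * bins + (b * bins // 256)
        hist[idx] += 1
    total = len(pixels)
    return [c / total for c in hist]  # normalize so frames of any size compare

frame = [(255, 0, 0), (250, 5, 5), (0, 0, 255), (255, 0, 0)]
h = color_histogram(frame)
print(max(h))  # 0.75 -> three of the four pixels fall in the "red" bin
```

A multi-feature index would concatenate this signature with texture (GLCM), edge-density, and motion descriptors before comparing videos.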
Subjective Quality Evaluation of H.264 and H.265 Encoded Video Sequences Stre... (ijma)
Streaming high-quality video content as part of multimedia communication has become essential nowadays. Video delivered over a network suffers from different kinds of impairments that degrade its quality, and the effect of these impairments differs among the codecs used. This paper presents the effects of network degradation factors such as packet loss and jitter on H.264 and H.265 encoded video sequences. In addition to the codec used, we also focused on different levels of temporal and spatial aspects within the videos. Among the basic test methods, the double stimulus impairment scale was used as the subjective assessment metric from the user's perspective. The results illustrate that differently encoded video sequences react differently to network impairments and are very sensitive to transmission errors; they also show that the user's experience is affected according to the motion level of the video.
The advances of this technological era have resulted in an enormous pool of information, stored at multiple places globally and in multiple formats. This article presents a methodology for extracting video lectures delivered by experts in the domain of Computer Science using a Generalized Gamma Mixture Model, with feature extraction based on DCT transformations. To build the model, the dataset was pooled from YouTube video lectures in the domain of Computer Science, and the generated outputs are evaluated using precision and recall.
Multimodal video abstraction into a static document using deep learning (IJECEIAES)
Abstraction is a strategy that gives the essential points of a document in a short time. The video abstraction approach proposed in this research is based on multi-modal video data, comprising both audio and visual data. Segmenting the input video into scenes and obtaining a textual and visual summary for each scene are the major video abstraction procedures used to summarize the video events into a static document. To recognize shot and scene boundaries in a video sequence, a hybrid features method was employed, which improves shot detection performance by selecting strong and flexible features. The most informative keyframes from each scene are then incorporated into the visual summary, and a hybrid deep learning model was used for abstractive text summarization. The testing videos came from the BBC archive, comprising BBC Learning English and BBC News, and a news summary dataset was used to train the deep model. The performance of the proposed approaches was assessed using metrics such as ROUGE for the textual summary, which achieved a 40.49% accuracy rate, while the precision, recall, and F-score used for the visual summary achieved 94.9% accuracy, performing better than other methods according to the experimental findings.
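The textual-summary score reported above uses ROUGE, whose simplest variant (ROUGE-1 recall) is just the fraction of reference unigrams that appear in the candidate summary. A minimal sketch, with made-up example sentences:

```python
# Minimal ROUGE-1 recall: fraction of reference unigrams covered by the candidate.
def rouge1_recall(candidate, reference):
    cand = set(candidate.lower().split())
    ref = reference.lower().split()
    overlap = sum(1 for w in ref if w in cand)
    return overlap / len(ref)

ref = "the storm closed schools across the city"
cand = "schools across the city closed early"
print(round(rouge1_recall(cand, ref), 2))  # 0.86 -> 6 of 7 reference words covered
```

Full evaluations typically also report ROUGE-2 (bigrams) and ROUGE-L (longest common subsequence), which penalize word-order differences this unigram version ignores.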
Tutorial on "Video Summarization and Re-use Technologies and Tools", delivered at IEEE ICME 2020. These slides correspond to the first part of the tutorial, presented by Vasileios Mezaris and Evlampios Apostolidis. This part deals with automatic video summarization, and includes a presentation of the video summarization problem definition and a literature overview; an in-depth discussion on a few unsupervised GAN-based methods; and a discussion on video summarization datasets, evaluation protocols and results, and future directions.
TikTok is a video-sharing social networking service that is rapidly growing in popularity; it was the second most downloaded app in the world in 2020. While the platform is known for users posting videos of themselves dancing, lip-syncing, or showing off other talents, videos of users sharing specific knowledge have increased because of initiatives such as #learnontiktok. This study aims to assess the types of knowledge and learning shared on TikTok and the profile of its users. We collected a set of videos and, through an innovative framework implemented with computer vision, natural language processing, and machine learning techniques, we show the main teaching topics published in the #learnontiktok campaign and the disciplines with the highest audience engagement.
Bibliometric analysis highlighting the role of women in addressing climate ch... (IJECEIAES)
Fossil fuel consumption has increased quickly, contributing to climate change that is evident in unusual flooding, droughts, and global warming. Over the past ten years, women's involvement in society has grown dramatically, and they have played a noticeable role in reducing climate change. A bibliometric analysis of data from the last ten years was carried out to examine the role of women in addressing climate change. The findings are discussed in relation to the sustainable development goals (SDGs), particularly SDG 7 and SDG 13. The results consider contributions made by women in various sectors while taking geographic dispersion into account. The bibliometric analysis delves into topics including women's leadership in environmental groups, their involvement in policymaking, their contributions to sustainable development projects, and the influence of gender diversity on attempts to mitigate climate change. The study's results highlight how women have influenced policies and actions related to climate change, point out areas of research deficiency, and offer recommendations on how to increase the role of women in addressing climate change and achieving sustainability. To achieve more successful results, this work aims to highlight the significance of gender equality and encourage inclusivity in climate-change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter... (IJECEIAES)
Active and reactive load changes have a significant impact on voltage and frequency. In this paper, in order to stabilize the microgrid (MG) against load variations in islanding mode, the active and reactive power of all distributed generators (DGs), including energy storage (battery), a diesel generator, and a micro-turbine, are controlled. The micro-turbine generator is connected to the MG through a three-phase to three-phase matrix converter, and the droop control method is applied to control the voltage and frequency of the MG. In addition, a method is introduced for voltage and frequency control of micro-turbines in the transition from grid-connected mode to islanding mode. A novel switching strategy for the matrix converter is used to convert the high-frequency output voltage of the micro-turbine to the grid-side frequency of the utility system. Moreover, with this switching strategy, low-order harmonics in the output current and voltage are not produced, and consequently the size of the output filter can be reduced. The suggested control strategy is load-independent and has no frequency conversion restrictions. The proposed approach for voltage and frequency regulation demonstrates exceptional performance and a favorable response across various load alteration scenarios. The suggested strategy is examined in several scenarios on the MG test systems, and the simulation results are discussed.
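The droop control mentioned above follows the conventional textbook relations: frequency is drooped against active power and voltage against reactive power. A minimal sketch with assumed coefficients (not the paper's tuned values):

```python
# Conventional droop relations (textbook form; all coefficients are assumed):
#   f = f0 - kp * (P - P0),   V = V0 - kq * (Q - Q0)
def droop_setpoints(P, Q, f0=50.0, V0=1.0, P0=0.0, Q0=0.0, kp=1e-5, kq=0.05):
    """Return the (frequency Hz, voltage pu) setpoints for a DG sharing load."""
    return f0 - kp * (P - P0), V0 - kq * (Q - Q0)

f, V = droop_setpoints(P=20_000, Q=0.2)   # 20 kW active, 0.2 pu reactive load
print(round(f, 2), round(V, 2))  # 49.8 0.99
```

Because every DG droops the same way, load increases are shared automatically among the battery, diesel generator, and micro-turbine without inter-unit communication.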
Similar to Enhancing multi-class web video categorization model using machine and deep learning approaches
The ubiquitous and connected nature of camera loaded mobile devices has greatly estimated the value and importance of visual information they capture. Today, sending videos from camera phones uploaded by unknown users is relevant on news networks, and banking customers expect to be able to deposit checks using mobile devices. In this paper we represent Movee, a system that addresses the fundamental question of whether the visual stream exchange by a user has been captured live on a mobile device, and has not been tampered with by an adversary. Movee leverages the mobile device motion sensors and the inherent user movements during the shooting of the video. Movee exploits the observation that the movement of the scene recorded on the video stream should be related to the movement of the device simultaneously captured by the accelerometer. the last decade e-lecturing has become more and more popular. We model the distribution of correlation of temporal noise residue in a forged video as a Gaussian mixture model (GMM). We propose a twostep scheme to estimate the model parameters. Consequently, a Bayesian classifier is used to find the optimal threshold value based on the estimated parameters. Cyrus Deboo | Shubham Kshatriya | Rajat Bhat"Video Liveness Verification" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-2 | Issue-3 , April 2018, URL: http://www.ijtsrd.com/papers/ijtsrd12772.pdf http://www.ijtsrd.com/computer-science/other/12772/video-liveness-verification/cyrus-deboo
Automatic video censoring system using deep learningIJECEIAES
Due to the extensive use of video-sharing platforms and services, the amount of such all kinds of content on the web has become massive. This abundance of information is a problem controlling the kind of content that may be present in such a video. More than telling if the content is suitable for children and sensitive people or not, figuring it out is also important what parts of it contains such content, for preserving parts that would be discarded in a simple broad analysis. To tackle this problem, a comparison was done for popular image deep learning models: MobileNetV2, Xception model, InceptionV3, VGG16, VGG19, ResNet101 and ResNet50 to seek the one that is most suitable for the required application. Also, a system is developed that would automatically censor inappropriate content such as violent scenes with the help of deep learning. The system uses a transfer learning mechanism using the VGG16 model. The experiments suggested that the model showed excellent performance for the automatic censoring application that could also be used in other similar applications.
A Survey on Multimedia Content Protection Mechanisms IJECEIAES
Cloud computing has emerged to influence multimedia content providers like Disney to render their multimedia services. When content providers use the public cloud, there are chances to have pirated copies further leading to a loss in revenues. At the same time, technological advancements regarding content recording and hosting made it easy to duplicate genuine multimedia objects. This problem has increased with increased usage of a cloud platform for rendering multimedia content to users across the globe. Therefore it is essential to have mechanisms to detect video copy, discover copyright infringement of multimedia content and protect the interests of genuine content providers. It is a challenging and computationally expensive problem to be addressed considering the exponential growth of multimedia content over the internet. In this paper, we surveyed multimedia-content protection mechanisms which throw light on different kinds of multimedia, multimedia content modification methods, and techniques to protect intellectual property from abuse and copyright infringement. It also focuses on challenges involved in protecting multimedia content and the research gaps in the area of cloud-based multimedia content protection.
System analysis and design for multimedia retrieval systemsijma
Due to the extensive use of information technology and the recent developments in multimedia systems, the amount of multimedia data available to users has increased exponentially. Video is an example of multimedia data, as it contains several kinds of data such as text, images, metadata, and visual and audio streams. Content-based video retrieval is an approach for facilitating the searching and browsing of large multimedia collections over the WWW. In order to create an effective video retrieval system, visual perception must be taken into account. We conjectured that a technique which employs multiple features for indexing and retrieval would be more effective in video discrimination and search tasks. In order to validate this, content-based indexing and retrieval systems were implemented using a color histogram, a texture feature (GLCM), edge density, and motion.
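Of the features listed above, the color histogram is the simplest to illustrate. The sketch below (an illustration, not the paper's implementation) quantizes grayscale pixel values into a normalized histogram and compares two frames by histogram intersection; the pixel values are invented:

```python
# Illustrative content-based matching: normalized color histogram plus
# histogram intersection, one feature of a multi-feature index.

def color_histogram(pixels, bins=4):
    """Quantize 8-bit grayscale pixel values into a normalized histogram."""
    hist = [0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [h / total for h in hist]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))

frame_a = [10, 20, 200, 210, 120, 130, 60, 70]    # invented pixel samples
frame_b = [12, 18, 205, 215, 118, 133, 58, 72]    # a visually similar frame
print(intersection(color_histogram(frame_a), color_histogram(frame_b)))  # → 1.0
```

In a multi-feature system, this score would be combined with GLCM texture, edge density, and motion scores before ranking.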
SUBJECTIVE QUALITY EVALUATION OF H.264 AND H.265 ENCODED VIDEO SEQUENCES STRE...ijma
Streaming of high-quality video content, as part of multimedia communication, has become essential nowadays. Video delivered over a network suffers from different kinds of impairments, which degrade its quality. The effect of such network impairments differs among the codecs used. This paper presents the effects of network degradation factors, such as packet loss and jitter, on H.264 and H.265 encoded video sequences. In addition to the codec used, we also focus on different levels of temporal and spatial activity within the videos. Among the basic test methods, the double stimulus impairment scale was used to complete the experiment as the subjective assessment metric from the user's perspective. The results illustrate that differently encoded video sequences react differently to network impairments and are very sensitive to transmission errors. Similarly, the results show that the user's experience is affected by the motion level of the video.
The advances of this technological era have resulted in an enormous pool of information. This information is stored in multiple places globally, in multiple formats. This article highlights a methodology for extracting video lectures delivered by experts in the domain of computer science using a Generalized Gamma Mixture Model. Feature extraction is based on DCT transformations. To build the model, the dataset is pooled from YouTube video lectures in the domain of computer science. The generated outputs are evaluated using precision and recall.
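The DCT-based feature extraction mentioned above can be sketched with a naive (O(N²)) orthonormal DCT-II over a 1-D signal block, keeping the leading coefficients as compact features; the signal values are illustrative and the paper's exact pipeline is not reproduced:

```python
# Minimal DCT-II feature sketch: low-order coefficients summarize a block.
import math

def dct2(signal):
    """Naive orthonormal DCT-II of a 1-D sequence."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i, x in enumerate(signal))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        coeffs.append(scale * s)
    return coeffs

# A constant block concentrates all energy in the DC coefficient:
# dct2([5, 5, 5, 5])[0] == 10.0, remaining coefficients ≈ 0.
print(dct2([5, 5, 5, 5]))
```

Real feature extractors apply this (usually the fast 2-D variant) to image blocks and keep only the first few coefficients per block.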
Multimodal video abstraction into a static document using deep learning IJECEIAES
Abstraction is a strategy that gives the essential points of a document in a short period of time. The video abstraction approach proposed in this research is based on multimodal video data, which comprises both audio and visual data. The major video abstraction procedures for summarizing the video events into a static document are segmenting the input video into scenes and obtaining a textual and visual summary for each scene. To recognize shot and scene boundaries in a video sequence, a hybrid features method was employed, which improves shot detection performance by selecting strong and flexible features. The most informative keyframes from each scene are then incorporated into the visual summary. A hybrid deep learning model was used for abstractive text summarization. The BBC archive provided the testing videos, which comprised BBC Learning English and BBC News; in addition, a news summary dataset was used to train the deep model. The performance of the proposed approaches was assessed using metrics such as ROUGE for the textual summary, which achieved a 40.49% accuracy rate, while the precision, recall, and F-score used for the visual summary achieved 94.9% accuracy, performing better than the other methods according to the findings of the experiments.
Tutorial on "Video Summarization and Re-use Technologies and Tools", delivered at IEEE ICME 2020. These slides correspond to the first part of the tutorial, presented by Vasileios Mezaris and Evlampios Apostolidis. This part deals with automatic video summarization, and includes a presentation of the video summarization problem definition and a literature overview; an in-depth discussion on a few unsupervised GAN-based methods; and a discussion on video summarization datasets, evaluation protocols and results, and future directions.
TikTok is a video-sharing social networking service that is rapidly growing in popularity; it was the second most downloaded app in the app world in 2020. While the platform is known for users posting videos of themselves dancing, lip-syncing, or showing off other talents, videos of users sharing specific knowledge have increased because of initiatives such as #learnontiktok. This study aims to assess the types of knowledge and learning shared on TikTok and the profile of its users. We collected a set of videos. Through an innovative framework implemented with computer vision, natural language processing, and machine learning techniques, we show the main teaching topics published in the #learnontiktok campaign and the disciplines with the highest audience engagement.
Similar to Enhancing multi-class web video categorization model using machine and deep learning approaches
Bibliometric analysis highlighting the role of women in addressing climate ch...IJECEIAES
Fossil fuel consumption has increased quickly, contributing to climate change that is evident in unusual flooding, droughts, and global warming. Over the past ten years, women's involvement in society has grown dramatically, and they have succeeded in playing a noticeable role in reducing climate change. A bibliometric analysis of data from the last ten years has been carried out to examine the role of women in addressing climate change. The analysis's findings are discussed in relation to the sustainable development goals (SDGs), particularly SDG 7 and SDG 13. The results consider contributions made by women in various sectors while taking geographic dispersion into account. The bibliometric analysis delves into topics including women's leadership in environmental groups, their involvement in policymaking, their contributions to sustainable development projects, and the influence of gender diversity on attempts to mitigate climate change. The study's results highlight how women have influenced policies and actions related to climate change, point out areas of research deficiency, and offer recommendations on how to increase the role of women in addressing climate change and achieving sustainability. To achieve more successful results, this initiative aims to highlight the significance of gender equality and encourage inclusivity in climate-change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter...IJECEIAES
Active and reactive load changes have a significant impact on voltage and frequency. In this paper, in order to stabilize the microgrid (MG) against load variations in islanding mode, the active and reactive power of all distributed generators (DGs), including energy storage (battery), a diesel generator, and a micro-turbine, are controlled. The micro-turbine generator is connected to the MG through a three-phase to three-phase matrix converter, and the droop control method is applied for controlling the voltage and frequency of the MG. In addition, a method is introduced for voltage and frequency control of micro-turbines in the transition from grid-connected mode to islanding mode. A novel switching strategy for the matrix converter is used to convert the high-frequency output voltage of the micro-turbine to the grid-side frequency of the utility system. Moreover, with this switching strategy, low-order harmonics in the output current and voltage are not produced, and consequently the size of the output filter is reduced. In fact, the suggested control strategy is load-independent and has no frequency conversion restrictions. The proposed approach for voltage and frequency regulation demonstrates exceptional performance and a favorable response across various load alteration scenarios. The suggested strategy is examined in several scenarios on the MG test systems, and the simulation results are discussed.
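The droop control method mentioned above has a compact textbook form: the frequency and voltage references fall linearly as measured active and reactive power rise. A sketch with invented droop constants and setpoints (not the paper's tuned controller):

```python
# Textbook P-f / Q-V droop sketch; all constants are illustrative.

def droop(p_meas, q_meas, f0=50.0, v0=230.0,
          kp=0.0005, kq=0.01, p0=0.0, q0=0.0):
    """Return (frequency, voltage) setpoints under P-f and Q-V droop."""
    f = f0 - kp * (p_meas - p0)   # more active load -> lower frequency
    v = v0 - kq * (q_meas - q0)   # more reactive load -> lower voltage
    return f, v

f, v = droop(p_meas=1000.0, q_meas=500.0)
print(f, v)  # → 49.5 225.0
```

Each DG running this rule shares load changes in proportion to its droop slopes, which is what lets the islanded MG hold voltage and frequency without a central controller.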
Enhancing battery system identification: nonlinear autoregressive modeling fo...IJECEIAES
Precisely characterizing Li-ion batteries is essential for optimizing their
performance, enhancing safety, and prolonging their lifespan across various
applications, such as electric vehicles and renewable energy systems. This
article introduces an innovative nonlinear methodology for system
identification of a Li-ion battery, employing a nonlinear autoregressive with
exogenous inputs (NARX) model. The proposed approach integrates the
benefits of nonlinear modeling with the adaptability of the NARX structure,
facilitating a more comprehensive representation of the intricate
electrochemical processes within the battery. Experimental data collected
from a Li-ion battery operating under diverse scenarios are employed to
validate the effectiveness of the proposed methodology. The identified
NARX model exhibits superior accuracy in predicting the battery's behavior
compared to traditional linear models. This study underscores the
importance of accounting for nonlinearities in battery modeling, providing
insights into the intricate relationships between state-of-charge, voltage, and
current under dynamic conditions.
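The structural idea behind a NARX model is that the next output depends nonlinearly on past outputs and past exogenous inputs. The sketch below shows that regressor structure only; the quadratic nonlinearity and its coefficients are invented for illustration and are not the paper's identified model:

```python
# NARX structure sketch: y[k] = f(y[k-1], y[k-2], u[k-1]).

def narx_step(y_hist, u_hist, theta):
    """One-step prediction with a simple nonlinear (quadratic) regressor."""
    a1, a2, b1, c = theta
    return a1 * y_hist[-1] + a2 * y_hist[-2] + b1 * u_hist[-1] + c * y_hist[-1] ** 2

def simulate(u, y_init, theta, steps):
    """Roll the one-step predictor forward from initial conditions."""
    y = list(y_init)
    for _ in range(steps):
        y.append(narx_step(y, u[:len(y)], theta))
    return y

theta = (0.7, 0.2, 0.5, -0.01)   # illustrative coefficients
u = [1.0] * 10                   # constant exogenous input (e.g. charge current)
print(simulate(u, [0.0, 0.0], theta, 4))
```

System identification then amounts to fitting theta (or, in the paper, a richer nonlinear map) so that simulated outputs match measured battery voltage.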
Smart grid deployment: from a bibliometric analysis to a surveyIJECEIAES
Smart grids are one of the last decades' innovations in electrical energy. They bring relevant advantages compared to the traditional grid and attract significant interest from the research community. Assessing the field's
evolution is essential to propose guidelines for facing new and future smart
grid challenges. In addition, knowing the main technologies involved in the
deployment of smart grids (SGs) is important to highlight possible
shortcomings that can be mitigated by developing new tools. This paper
contributes to the research trends mentioned above by focusing on two
objectives. First, a bibliometric analysis is presented to give an overview of
the current research level about smart grid deployment. Second, a survey of
the main technological approaches used for smart grid implementation and
their contributions are highlighted. To that effect, we searched the Web of
Science (WoS), and the Scopus databases. We obtained 5,663 documents
from WoS and 7,215 from Scopus on smart grid implementation or
deployment. With the extraction limitation in the Scopus database, 5,872 of
the 7,215 documents were extracted using a multi-step process. These two
datasets have been analyzed using a bibliometric tool called bibliometrix.
The main outputs are presented with some recommendations for future
research.
Use of analytical hierarchy process for selecting and prioritizing islanding ...IJECEIAES
One of the problems associated with power systems is the islanding condition, which must be rapidly and properly detected to prevent any negative consequences for the system's protection, stability, and security. This paper offers a thorough overview of several islanding detection strategies, which are divided into two categories: classic approaches, including local and remote approaches, and modern techniques, including techniques based on signal processing and computational intelligence. Additionally, each approach is compared and assessed based on several factors, including implementation costs, non-detected zones, declining power quality, and response times, using the analytical hierarchy process (AHP). The multi-criteria decision-making analysis yields overall weights of 24.7% for passive methods, 7.8% for active methods, 5.6% for hybrid methods, 14.5% for remote methods, 26.6% for signal processing-based methods, and 20.8% for computational intelligence-based methods, based on a comparison of all criteria together. Thus, it can be seen from the total weights that hybrid approaches are the least suitable choice, while signal processing-based methods are the most appropriate islanding detection method to select and implement in a power system with respect to the aforementioned factors. The proposed hierarchy model is studied and examined using Expert Choice software.
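Overall weights like those reported above are derived in AHP from a pairwise comparison matrix. A minimal sketch using the row geometric mean approximation of the priority vector; the 3×3 matrix is a made-up example, not the paper's comparison data:

```python
# AHP priority-vector sketch: normalized row geometric means of a
# pairwise comparison matrix approximate the criterion weights.
import math

def ahp_weights(pairwise):
    """Approximate the AHP priority vector from a reciprocal matrix."""
    n = len(pairwise)
    gmeans = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(gmeans)
    return [g / total for g in gmeans]

# Criterion A judged 3x as important as B and 5x as important as C, etc.
matrix = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
w = ahp_weights(matrix)
print(w)   # weights sum to 1, with the largest weight on criterion A
```

Tools such as Expert Choice additionally compute a consistency ratio to check that the pairwise judgments do not contradict each other.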
Enhancing of single-stage grid-connected photovoltaic system using fuzzy logi...IJECEIAES
The power generated by photovoltaic (PV) systems is influenced by
environmental factors. This variability hampers the control and utilization of
solar cells' peak output. In this study, a single-stage grid-connected PV
system is designed to enhance power quality. Our approach employs fuzzy
logic in the direct power control (DPC) of a three-phase voltage source
inverter (VSI), enabling seamless integration of the PV connected to the
grid. Additionally, a fuzzy logic-based maximum power point tracking
(MPPT) controller is adopted, which outperforms traditional methods like
incremental conductance (INC) in enhancing solar cell efficiency and
minimizing the response time. Moreover, the inverter's real-time active and
reactive power is directly managed to achieve a unity power factor (UPF).
The system's performance is assessed through MATLAB/Simulink
implementation, showing marked improvement over conventional methods,
particularly in steady-state and varying weather conditions. For solar irradiances of 500 and 1,000 W/m², the results show that the proposed method reduces the total harmonic distortion (THD) of the injected current into the grid by approximately 46% and 38%, respectively, compared to conventional methods. Furthermore, we compare the simulation results with IEEE standards to evaluate the system's grid compatibility.
Enhancing photovoltaic system maximum power point tracking with fuzzy logic-b...IJECEIAES
Photovoltaic systems have emerged as a promising energy resource that
caters to the future needs of society, owing to their renewable, inexhaustible,
and cost-free nature. The power output of these systems relies on solar cell
radiation and temperature. In order to mitigate the dependence on
atmospheric conditions and enhance power tracking, a conventional
approach has been improved by integrating various methods. To optimize
the generation of electricity from solar systems, the maximum power point
tracking (MPPT) technique is employed. To overcome limitations such as
steady-state voltage oscillations and improve transient response, two
traditional MPPT methods, namely fuzzy logic controller (FLC) and perturb
and observe (P&O), have been modified. This research paper aims to
simulate and validate the step size of the proposed modified P&O and FLC
techniques within the MPPT algorithm using MATLAB/Simulink for
efficient power tracking in photovoltaic systems.
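The classic perturb-and-observe (P&O) rule that this paper modifies fits in a few lines: keep perturbing the operating voltage in the same direction while power rises, and reverse when it falls. A sketch with an invented step size and readings:

```python
# Classic P&O MPPT step; fixed step size and sample readings are
# illustrative (the paper's modified variant adapts the step).

def po_step(v_prev, p_prev, v_now, p_now, step=0.5):
    """Return the next voltage reference for a P&O MPPT controller."""
    if p_now == p_prev:
        return v_now                       # at (or near) the MPP: hold
    if (p_now > p_prev) == (v_now > v_prev):
        return v_now + step                # power rose with voltage: keep climbing
    return v_now - step                    # power fell: reverse direction

# Power increased after raising the voltage, so keep increasing it.
print(po_step(v_prev=30.0, p_prev=95.0, v_now=30.5, p_now=97.0))  # → 31.0
```

The fixed step is exactly what causes the steady-state oscillation the abstract mentions; modified P&O and FLC variants shrink the step near the maximum power point.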
Adaptive synchronous sliding control for a robot manipulator based on neural ...IJECEIAES
Robot manipulators have become important equipment in production lines, medical fields, and transportation. Improving the quality of trajectory tracking for
robot hands is always an attractive topic in the research community. This is a
challenging problem because robot manipulators are complex nonlinear systems
and are often subject to fluctuations in loads and external disturbances. This
article proposes an adaptive synchronous sliding control scheme to improve trajectory tracking performance for a robot manipulator. The proposed controller ensures that the joint positions track the desired trajectory, that the errors are synchronized, and that chattering is significantly reduced. First, the synchronous tracking errors and synchronous sliding surfaces are presented. Second, the synchronous tracking error dynamics are determined. Third, a robust adaptive control law is designed; the unknown components of the model are estimated online by the neural network, and the parameters of the switching elements are selected by fuzzy logic. The built algorithm ensures that the tracking and approximation errors are ultimately uniformly bounded (UUB). Finally, the effectiveness of the constructed algorithm is demonstrated through simulation and experimental results, which show that the proposed controller is effective, with small synchronous tracking errors and a significantly reduced chattering phenomenon.
Remote field-programmable gate array laboratory for signal acquisition and de...IJECEIAES
A remote laboratory utilizing field-programmable gate array (FPGA) technologies enhances students' learning experience anywhere and anytime in embedded system design. Existing remote laboratories prioritize hardware access and visual feedback for observing board behavior after programming, neglecting comprehensive debugging tools to resolve errors that require internal signal acquisition. This paper proposes a novel remote embedded-system design approach targeting FPGA technologies that is fully interactive via a web-based platform. Our solution provides FPGA board access and debugging capabilities beyond the visual feedback provided by existing remote laboratories. We implemented a lab module that users can seamlessly incorporate into their FPGA designs. The module minimizes hardware resource utilization while enabling the acquisition of a large number of data samples from the signal during experiments by adaptively compressing the signal prior to data transmission. The results demonstrate an average compression ratio of 2.90 across three benchmark signals, indicating efficient signal acquisition and effective debugging and analysis. This method allows users to acquire more data samples than conventional methods. The proposed lab allows students to remotely test and debug their designs, bridging the gap between theory and practice in embedded system design.
Detecting and resolving feature envy through automated machine learning and m...IJECEIAES
Efficiently identifying and resolving code smells enhances software project quality. This paper presents a novel solution, utilizing automated machine learning (AutoML) techniques, to detect code smells and apply move method refactoring. By evaluating code metrics before and after refactoring, we assessed its impact on coupling, complexity, and cohesion. Key contributions of this research include a unique dataset for code smell classification and the development of models using AutoGluon for optimal performance. Furthermore, the study identifies the top 20 influential features in classifying feature envy, a well-known code smell, stemming from excessive reliance on external classes. We also explored how move method refactoring addresses feature envy, revealing reduced coupling and complexity, and improved cohesion, ultimately enhancing code quality. In summary, this research offers an empirical, data-driven approach, integrating AutoML and move method refactoring to optimize software project quality. Insights gained shed light on the benefits of refactoring on code quality and the significance of specific features in detecting feature envy. Future research can expand to explore additional refactoring techniques and a broader range of code metrics, advancing software engineering practices and standards.
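At its core, feature envy is a method that uses another class's members more than its own. The paper detects it with AutoML over code metrics; the sketch below is only a simple hand-rolled heuristic in the same spirit, with invented access counts:

```python
# Hedged feature-envy heuristic (not the paper's AutoML detector):
# flag a method whose foreign-class accesses dominate its own-class
# accesses as a move-method refactoring candidate.

def feature_envy_candidate(own_accesses, foreign_accesses, ratio=1.0):
    """True when foreign accesses exceed ratio * own accesses."""
    return foreign_accesses > ratio * own_accesses

# A method touching its own class twice but a collaborator five times.
print(feature_envy_candidate(own_accesses=2, foreign_accesses=5))  # → True
```

Moving such a method into the envied class is what reduces coupling and raises cohesion in the metrics the paper evaluates before and after refactoring.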
Smart monitoring technique for solar cell systems using internet of things ba...IJECEIAES
Rapidly and remotely monitoring and receiving the status parameters of solar cell systems (solar irradiance, temperature, and humidity) are critical issues in enhancing their efficiency. Hence, in the present article, an improved smart prototype of an internet of things (IoT) technique based on an embedded system with the NodeMCU ESP8266 (ESP-12E) was carried out experimentally. Three different regions in Egypt (the cities of Luxor, Cairo, and El-Beheira) were chosen to study their solar irradiance profile, temperature, and humidity with the proposed IoT system. The monitoring data for solar irradiance, temperature, and humidity were visualized live directly in Ubidots through the hypertext transfer protocol (HTTP). The measured solar power radiation in Luxor, Cairo, and El-Beheira ranged between 216-1000, 245-958, and 187-692 W/m², respectively, during the solar day. The accuracy and rapidity of obtaining monitoring results using the proposed IoT system make it a strong candidate for application in monitoring solar cell systems. On the other hand, the obtained solar power radiation results for the three considered regions strongly recommend Luxor and Cairo, rather than El-Beheira, as suitable places to build a solar cell system station.
An efficient security framework for intrusion detection and prevention in int...IJECEIAES
Over the past few years, the internet of things (IoT) has advanced to connect billions of smart devices to improve quality of life. However, anomalies or malicious intrusions pose several security loopholes, leading to performance degradation and threats to data security in IoT operations. Thereby, IoT security systems must keep an eye on and restrict unwanted events from occurring in the IoT network. Recently, various technical solutions based on machine learning (ML) models have been derived for identifying and restricting unwanted events in IoT. However, most ML-based approaches are prone to misclassification due to inappropriate feature selection. Additionally, most ML approaches applied to intrusion detection and prevention use supervised learning, which requires a large amount of labeled training data; such complex datasets are difficult to source in a large network like the IoT. To address this problem, the proposed study introduces an efficient learning mechanism to strengthen IoT security. The proposed algorithm incorporates supervised and unsupervised approaches to improve the learning models for intrusion detection and mitigation. Compared with related works, the experimental outcome shows that the model performs well on a benchmark dataset, accomplishing an improved detection accuracy of approximately 99.21%.
Developing a smart system for infant incubators using the internet of things ...IJECEIAES
This research develops an incubator system that integrates the internet of things (IoT) and artificial intelligence to improve care for premature babies. The system workflow starts with sensors that collect data from the incubator. The data is then sent in real time to the IoT broker Eclipse Mosquitto using the message queue telemetry transport (MQTT) protocol version 5.0. After that, the data is stored in a database for analysis using the long short-term memory (LSTM) network method and displayed in a web application using an application programming interface (API) service. The experiments produced 2,880 rows of data stored in the database. The correlation coefficient between the target attribute and the other attributes ranges from 0.23 to 0.48. Several experiments were then conducted to evaluate the model's predicted values on the test data. The best results are obtained using a two-layer LSTM configuration, each layer with 60 neurons and a lookback setting of 6. This model produces an R² value of 0.934, with a root mean square error (RMSE) of 0.015 and a mean absolute error (MAE) of 0.008. In addition, the R² value was also evaluated for each attribute used as input, with values between 0.590 and 0.845.
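The "lookback setting of 6" mentioned above implies a sliding-window preparation step: each training sample pairs the previous six readings with the next value. A minimal sketch with an invented sensor series (the real system reads incubator sensors):

```python
# Sliding "lookback" windowing sketch for sequence models such as LSTMs.

def make_windows(series, lookback=6):
    """Split a series into (input window, next-value target) pairs."""
    samples = []
    for i in range(len(series) - lookback):
        samples.append((series[i:i + lookback], series[i + lookback]))
    return samples

temps = [36.1, 36.2, 36.2, 36.3, 36.4, 36.4, 36.5, 36.6]   # invented readings
windows = make_windows(temps)
print(len(windows))    # → 2
print(windows[0][1])   # → 36.5
```

Each window becomes one LSTM input sequence, and the paired value is the regression target.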
A review on internet of things-based stingless bee's honey production with im...IJECEIAES
Honey is produced exclusively by honeybees and stingless bees, both of which are well adapted to tropical and subtropical regions such as Malaysia. Stingless bees are known for producing small amounts of honey with a unique flavor profile. A problem identified is that many stingless bee colonies collapse due to weather, temperature, and environment. It is critical to understand the relationship between stingless bee honey production and environmental conditions in order to improve honey production. Thus, this paper presents a review of stingless bee honey production and prediction modeling. About 54 previous research works have been analyzed and compared to identify the research gaps, and a framework for modeling the prediction of stingless bee honey is derived. The result presents a comparison and analysis of internet of things (IoT) monitoring systems, honey production estimation, convolutional neural networks (CNNs), and automatic identification methods for bee species. Based on image detection methods, the top three in efficiency are CNN at 98.67%, densely connected convolutional networks with YOLO v3 at 97.7%, and DenseNet201 convolutional networks at 99.81%. This study is significant in assisting researchers in developing a model for predicting stingless bee honey output, which is important for a stable economy and food security.
A trust based secure access control using authentication mechanism for intero...IJECEIAES
The internet of things (IoT) is a revolutionary innovation in many aspects of our society, including interactions, financial activity, and global security domains such as the military and the battlefield internet. Due to the limited energy and processing capacity of network devices, security, energy consumption, compatibility, and device heterogeneity are long-term IoT problems. As a result, energy and security are critical for data transmission across edge and IoT networks. Existing IoT interoperability techniques need more computation time, have unreliable authentication mechanisms that break easily, lose data easily, and have low confidentiality. In this paper, a key agreement protocol-based authentication mechanism for IoT devices is offered as a solution to this issue. The system makes use of information exchange, which must be secured to prevent access by unauthorized users. Using the compact Contiki/Cooja simulator, the performance and design of the suggested framework are validated. The simulation findings are evaluated based on the detection of malicious nodes after 60 minutes of simulation. The suggested trust method, which is based on privacy access control, reduced the packet loss ratio to 0.32%, consumed 0.39% power, and had the greatest average residual energy of 0.99 mJ at 10 nodes.
Fuzzy linear programming with the intuitionistic polygonal fuzzy numbersIJECEIAES
In real-world applications, data are subject to ambiguity due to several factors; fuzzy sets and fuzzy numbers provide a great tool to model such ambiguity. In cases of hesitation, the complement of a membership value in fuzzy numbers can be different from the non-membership value, in which case we can model the data using intuitionistic fuzzy numbers, as they provide flexibility by defining both a membership and a non-membership function. In this article, we consider the intuitionistic fuzzy linear programming problem with intuitionistic polygonal fuzzy numbers, which is a generalization of the polygonal fuzzy numbers previously found in the literature. We present a modification of the simplex method that can be used to solve any general intuitionistic fuzzy linear programming problem after approximating the problem by intuitionistic polygonal fuzzy numbers with n edges. The method is given in a simple tableau formulation and then applied to numerical examples for clarity.
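A polygonal fuzzy number is defined by a list of vertices, and its membership function is evaluated by piecewise-linear interpolation between them. The sketch below illustrates only that evaluation step, on a made-up four-vertex (trapezoidal) example; it is not the paper's intuitionistic formulation, which would pair this with a second, non-membership polygon:

```python
# Piecewise-linear membership evaluation for a polygonal fuzzy number.

def membership(x, vertices):
    """vertices: (x, mu) points sorted by x, defining the polygon."""
    if x <= vertices[0][0] or x >= vertices[-1][0]:
        return 0.0
    for (x0, m0), (x1, m1) in zip(vertices, vertices[1:]):
        if x0 <= x <= x1:
            return m0 + (m1 - m0) * (x - x0) / (x1 - x0)
    return 0.0

trap = [(1.0, 0.0), (2.0, 1.0), (4.0, 1.0), (5.0, 0.0)]   # trapezoidal example
print(membership(1.5, trap))  # → 0.5
print(membership(3.0, trap))  # → 1.0
```

Adding more vertices generalizes the trapezoid to an n-edge polygon, which is what allows the simplex-based method to approximate arbitrary fuzzy shapes.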
The performance of artificial intelligence in prostate magnetic resonance im...IJECEIAES
Prostate cancer is the predominant form of cancer observed in men worldwide. The application of magnetic resonance imaging (MRI) as a guidance tool for conducting biopsies has been established as a reliable and well-established approach in the diagnosis of prostate cancer. The diagnostic performance of MRI-guided prostate cancer diagnosis exhibits significant heterogeneity due to the intricate and multi-step nature of the diagnostic pathway. The development of artificial intelligence (AI) models, specifically through the utilization of machine learning techniques such as deep learning, is assuming an increasingly significant role in the field of radiology. In the realm of prostate MRI, a considerable body of literature has been dedicated to the development of various AI algorithms. These algorithms have been specifically designed for tasks such as prostate segmentation, lesion identification, and classification. The overarching objective of these endeavors is to enhance diagnostic performance and foster greater agreement among different observers within MRI scans for the prostate. This review article aims to provide a concise overview of the application of AI in the field of radiology, with a specific focus on its utilization in prostate MRI.
Seizure stage detection of epileptic seizure using convolutional neural networksIJECEIAES
According to the World Health Organization (WHO), seventy million individuals worldwide suffer from epilepsy, a neurological disorder. While electroencephalography (EEG) is crucial for diagnosing epilepsy and monitoring the brain activity of epilepsy patients, it requires a specialist to examine all EEG recordings to find epileptic behavior. This procedure needs an experienced doctor, and a precise epilepsy diagnosis is crucial for appropriate treatment. To identify epileptic seizures, this study employed a convolutional neural network (CNN) based on raw scalp EEG signals to discriminate between preictal, ictal, postictal, and interictal segments. The potential of these characteristics is explored by examining how well time-domain signals work in the detection of epileptic signals using the intracranial Freiburg Hospital (FH), scalp Children's Hospital Boston-Massachusetts Institute of Technology (CHB-MIT), and Temple University Hospital (TUH) EEG databases. To test the viability of this approach, two types of experiments were carried out: binary classification (preictal, ictal, and postictal, each versus interictal) and four-class classification (interictal versus preictal versus ictal versus postictal). The average accuracy for stage detection using the CHB-MIT database was 84.4%, while the Freiburg database's time-domain signals had an accuracy of 79.7%, and the highest accuracy of 94.02% was achieved for classification in the TUH EEG database when comparing the interictal stage to the preictal stage.
Analysis of driving style using self-organizing maps to analyze driver behaviorIJECEIAES
Modern life is strongly associated with the use of cars, but the increase in acceleration and maneuverability leads to a dangerous driving style for some drivers. Under these conditions, the development of a method for tracking driver behavior is relevant. The article provides an overview of existing methods and models for assessing the functioning of motor vehicles and driver behavior. Based on this, a combined algorithm for recognizing driving style is proposed. To do this, a set of input data was formed, including 20 descriptive features about the environment, the driver's behavior, and the characteristics of the car's operation, collected using OBD-II. The generated dataset is fed to a Kohonen network, where clustering is performed according to driving style and degree of danger. Assigning the driving characteristics to a particular cluster makes it possible to move to the individual indicators of a specific driver and to consider individual driving characteristics. The application of the method makes it possible to identify potentially dangerous driving styles, which can help prevent accidents.
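The Kohonen-network clustering step can be illustrated with a tiny self-organizing map: find the best-matching unit for a feature vector, then pull it (and its neighbors) toward the input. The unit weights, the learning rate, and the two-feature "driving" vector below are invented values, not the paper's 20-feature setup:

```python
# Minimal 1-D self-organizing map (Kohonen) training-step sketch.

def bmu_index(weights, x):
    """Index of the unit whose weight vector is closest to x (squared distance)."""
    dists = [sum((w - xi) ** 2 for w, xi in zip(unit, x)) for unit in weights]
    return dists.index(min(dists))

def train_step(weights, x, lr=0.5):
    """Move the winning unit (and immediate neighbors at half rate) toward x."""
    b = bmu_index(weights, x)
    for j, unit in enumerate(weights):
        rate = lr if j == b else (lr / 2 if abs(j - b) == 1 else 0.0)
        weights[j] = [w + rate * (xi - w) for w, xi in zip(unit, x)]
    return b

units = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]   # a tiny 1-D map of 3 units
winner = train_step(units, [0.9, 1.1])          # one driving-feature sample
print(winner)  # → 1
```

After training on many samples, each unit (or group of units) corresponds to a cluster such as a calm or aggressive driving style, and new trips are classified by their best-matching unit.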
Hyperspectral object classification using hybrid spectral-spatial fusion and ...IJECEIAES
Because of its spectral-spatial and temporal resolution of greater areas, hyperspectral imaging (HSI) has found widespread application in the field of object classification. The HSI is typically used to accurately determine an object's physical characteristics as well as to locate related objects with appropriate spectral fingerprints. As a result, the HSI has been extensively applied to object identification in several fields, including surveillance, agricultural monitoring, environmental research, and precision agriculture. However, because of their enormous size, objects require a lot of time to classify; for this reason, both spectral and spatial feature fusion have been completed. The existing classification strategy leads to increased misclassification, and the feature fusion method is unable to preserve semantic object inherent features; This study addresses the research difficulties by introducing a hybrid spectral-spatial fusion (HSSF) technique to minimize feature size while maintaining object intrinsic qualities; Lastly, a soft-margins kernel is proposed for multi-layer deep support vector machine (MLDSVM) to reduce misclassification. The standard Indian pines dataset is used for the experiment, and the outcome demonstrates that the HSSF-MLDSVM model performs substantially better in terms of accuracy and Kappa coefficient.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Copatiable with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Final project report on grocery store management system..pdfKamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner. If your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing various products available enables registered users to purchase desired products instantly using Paytm, UPI payment processor (Instant Pay) and also can place order by using Cash on Delivery (Pay Later) option. This project provides an easy access to Administrators and Managers to view orders placed using Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of Technologies must be studied and understood. These include multi-tiered architecture, server and client-side scripting techniques, implementation technologies, programming language (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. This is a project with the objective to develop a basic website where a consumer is provided with a shopping cart website and also to know about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...Amil Baba Dawood bangali
Contact with Dawood Bhai Just call on +92322-6382012 and we'll help you. We'll solve all your problems within 12 to 24 hours and with 101% guarantee and with astrology systematic. If you want to take any personal or professional advice then also you can call us on +92322-6382012 , ONLINE LOVE PROBLEM & Other all types of Daily Life Problem's.Then CALL or WHATSAPP us on +92322-6382012 and Get all these problems solutions here by Amil Baba DAWOOD BANGALI
#vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore#blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #blackmagicforlove #blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #Amilbabainuk #amilbabainspain #amilbabaindubai #Amilbabainnorway #amilbabainkrachi #amilbabainlahore #amilbabaingujranwalan #amilbabainislamabad
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
Planning Of Procurement o different goods and services
Enhancing multi-class web video categorization model using machine and deep learning approaches
International Journal of Electrical and Computer Engineering (IJECE)
Vol. 12, No. 3, June 2022, pp. 3176~3191
ISSN: 2088-8708, DOI: 10.11591/ijece.v12i3.pp3176-3191
Journal homepage: http://ijece.iaescore.com
Enhancing multi-class web video categorization model using machine and deep learning approaches

Wael M. S. Yafooz¹, Abdullah Alsaeedi¹, Reyadh Alluhaibi¹, Abdel-Hamid Mohamed Emara¹,²

¹ Department of Computer Science, College of Computer Science and Engineering, Taibah University, Madinah, Saudi Arabia
² Computers and Systems Engineering Department, Faculty of Engineering, Al-Azhar University, Cairo, Egypt
Article Info

Article history: Received Jul 19, 2021; Revised Jan 19, 2022; Accepted Feb 8, 2022

ABSTRACT
With today’s digital revolution, many people communicate and collaborate in cyberspace. Users rely on social media platforms, such as Facebook, YouTube and Twitter, all of which exert a considerable impact on human lives. In particular, watching videos has become preferable to simply browsing the internet for many reasons. However, difficulties arise when accurately searching for specific videos within the same domains, such as entertainment, politics, education, video and TV shows. This problem can be addressed through web video categorization (WVC) approaches that utilize video textual information, visual features, or audio approaches. However, retrieving or obtaining videos with similar content at high accuracy is challenging. Therefore, this paper proposes a novel model for enhancing WVC that is based on user comments and weighted features from video descriptions. Specifically, this model uses supervised learning, along with machine learning classifiers (MLCs) and deep learning (DL) models. Two experiments are conducted on the proposed balanced dataset on the basis of two proposed algorithms over multiple classes, namely, education, politics, health and sports. The model achieves high accuracy rates of 97% and 99% by using MLCs and DL models that are based on artificial neural network (ANN) and long short-term memory (LSTM), respectively.
Keywords: Classification; Deep learning; Machine learning; Web video categorization; YouTube

This is an open access article under the CC BY-SA license.
Corresponding Author:
Wael M. S. Yafooz
Department of Computer Science, College of Computer Science and Engineering, Taibah University
Madinah, Saudi Arabia
Email: waelmohammed@hotmail.com
1. INTRODUCTION
The convenient accessibility and speed of the internet have made it a staple tool for many people. The most noticeable and rapidly growing spheres in the context of videos are Dailymotion and YouTube. YouTube is known as the largest repository of videos and is widely used for video sharing by billions of users [1]–[4]. However, given the massive number of videos on the web, users face difficulties in accurately retrieving and obtaining the videos they need [5], [6]. The best method to examine, extract and classify web videos on the basis of content similarity is web video categorization (WVC) [7]–[10]. As the number of videos on the web has increased exponentially, the traditional way of manually processing video categorization has become time consuming and thus requires much effort [11]. Along with software applications for categorization purposes, human intervention is sometimes necessary for refining categorization. Therefore, extensive effort is spent on areas of WVC with automatic concepts that can help improve video retrieval accuracy and retrieve videos with high content similarity. The similarity of user queries is also used to increase user satisfaction with viewing relevant and required videos.
Existing research has focused on WVC using classification [7], [12]–[17] or clustering techniques [18]–[21], as well as surveys [9], [10], [14], [22], [23]. Categorizing web videos is generally based on visual, audio or textual information. In visual categorization, the main focus is to extract video frames while treating them as images. The extracted features, such as faces, objects, colors and shapes, are used in the comparison and classification processes. Audio-based features are extracted from the video soundtrack; these are signal properties of sounds, such as music, loudness and pitch, that provide the values used in the classification process. For example, the sound of music is different from the sound of speech, and a male voice differs from a female voice. Perceptual features, such as music and violent words, also differ from each other. Finally, in textual categorization, authors use the textual information of video titles, video descriptions or their metadata. Combining visual-based and audio-based categorization can yield satisfactory improvement. However, WVC remains a massive challenge in computer vision and machine learning [20], [21].
People share their thoughts, ideas, beliefs, daily activities, experiences, entertainment, feelings and academic knowledge in the form of comments [24], [25]. Users comment on, like and dislike videos to express their ideologies. These comments are considered unstructured data that can be relevant or irrelevant to the video content [26]. Such relevant data can be useful for further processing in WVC, particularly on platforms such as YouTube. Therefore, the current work explores the existing methods and techniques for WVC. In addition, this study proposes a novel model called the enhanced multiclass web video categorization model (EMVC). The proposed EMVC enhances the way in which WVC is conducted by utilizing and extracting user comments and weighted features from video descriptions using machine and deep learning (DL) approaches as a form of supervised learning. In addition, this work examines machine learning classifiers (MLCs) and DL models for the proposed algorithms by using the proposed dataset. The dataset was collected from four types of YouTube videos, namely, sports, health, education and politics, as predefined classes. A total of 86 videos with 42,668 user comments and video descriptions were used. The dataset, called the Arabic multi-classification dataset (AMCD), is publicly available [27]. AMCD was subjected to several steps, including annotation, noise removal, data cleaning and pre-processing, model building and model evaluation. After the completion of the pre-processing steps, the dataset was reduced to 8,046 user comments and was thus considered balanced. Two distinct experiments were conducted using MLCs and DL models on the basis of two proposed algorithms. These algorithms utilized the textual information extracted from user comments and video descriptions to extract informative features, which are weighted using the average and maximum term frequency-inverse document frequency (TF-IDF) weights of user comments relative to the video description. The model showed good accuracies of 97% and 99% using MLCs and DL models that were based on artificial neural network (ANN) and long short-term memory (LSTM), respectively.
The main contributions of this work are as follows: i) it explores the existing techniques for WVC and highlights the importance of using user comments and video metadata to enhance WVC; ii) it proposes the EMVC, which is based on video descriptions and user comments to enhance WVC through MLCs and DL models; iii) it introduces a new dataset (AMCD) based on the Arabic dialect, collected from 86 YouTube videos with 8,046 user comments; iv) it proposes a novel mathematical equation for improving WVC through two scenarios and two proposed algorithms that utilize the average and maximum TF-IDF weights of user comments relative to the video descriptions; and v) it examines the importance of using user comments and video descriptions in video classification.
The rest of the paper is organized as follows: section 2 explains the proposed method, model architecture and design. Section 3 presents the proposed mathematical equations, while the experiments, results and discussion are presented in section 4. Finally, the conclusion is given in section 5.
2. PROPOSED METHOD
This section describes the methods and system architecture of the proposed model for WVC, as shown in Figure 1. The system architecture consists of six main interrelated phases: data acquisition, pre-processing, term extraction and word representation, term weighting, classification methods, and model evaluation.
2.1. Data acquisition phase
The first phase in the model is known as the input phase. In this phase, data are collected from YouTube videos for use in the video categorization process. In line with the core objective of this research, the enhanced video categorization is based upon four predefined classes: health, sport, politics, and education. The information to extract comprises the video description and video comments. Arabic videos and their associated Arabic comments are used in the experiments on the condition that the video publication date ranges from 2015 to 2020 and the video has obtained more than 2,000 comments. Python 3.6 is used to collect the data from YouTube. During the extraction process of the video description and user
comments, the required information is extracted into a single file for each video. In addition, some of the
attributes in the file are removed. The output of this phase is used as the input in the pre-processing phase.
Figure 1. EMVC architecture
2.2. Pre-processing phase
The pre-processing phase comprises three steps: data cleaning, annotation, and data pre-processing. In data cleaning, the initial process is to remove duplicate records and any English comments, numbers, or tags. The data in each file, each belonging to one video, are cleaned. The annotation process is then carried out with the help of three native Arabic-speaking annotators in computer science, all of whom are scholars with PhDs. In this process, if two annotators agree that a video belongs to one of the four classes, the video is assigned to that predefined class. Otherwise, comments are removed if they are unclear or ambiguous. During the annotation process, the class labels given are “1” for health, “2” for education, “3” for politics and “4” for sport. After the annotation process, the files are merged into one single file. In the data pre-processing step, Python 3.6 is utilized to perform automatic pre-processing of the dataset. Several steps, such as the removal of HTML tags, numbers, English characters, character extensions, and repeated characters, are performed using regular expressions. Porter stemming is used to obtain the roots of the words.
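The automated cleaning step can be sketched as a small Python function. The exact regular expressions used by the authors are not published, so the patterns below are assumptions that mirror the listed steps (HTML tags, numbers, English characters, repeated characters):

```python
import re

def clean_comment(text: str) -> str:
    """Illustrative cleaning pass over a raw YouTube comment."""
    text = re.sub(r"<[^>]+>", " ", text)      # remove HTML tags
    text = re.sub(r"[0-9]+", " ", text)       # remove numbers
    text = re.sub(r"[A-Za-z]+", " ", text)    # remove English characters
    text = re.sub(r"(.)\1{2,}", r"\1", text)  # collapse repeated characters
    return re.sub(r"\s+", " ", text).strip()  # normalize whitespace

print(clean_comment("<b>راااائع</b> video 123 جدا"))  # -> رائع جدا
```

The order of the substitutions matters: repeated-character collapsing runs after English text is removed, so character extensions in Arabic words are reduced without touching the already-deleted Latin text.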
2.3. Term extraction and word representation phase
In this phase, terms are extracted from both the video description (VD) and user comments (UC), as described in definitions 1 and 2. For each comment, the set of extracted words combined with the video description is called the word representation (WR). These comments are transformed into vector representations using the “TfidfTransformer” in “sklearn”. For each video, the set of comments is called the vector representation, denoted by VR. The set of vector representations is called a data collection, denoted by DC, which contains the sets of comments for all videos.
- Definition 1. Given a video $v \in V$, the set of user comments $UC_v$ and video description $VD_v$ for $v$ is defined as: $\{UC_v + VD_v\}$
- Definition 2. Given $n$ videos, the set of word representations (WR) is defined as: $WR = \{\{UC_1 + VD_1\}, \{UC_2 + VD_2\}, \{UC_3 + VD_3\}, \ldots, \{UC_n + VD_n\}\}$
- Definition 3. Given $n$ videos, the set of word representations is called the vector representation (VR), defined as: $VR = \{WR_1, WR_2, WR_3, \ldots, WR_n\}$
- Definition 4. Given $n$ videos, the set of vector representations is called the data collection (DC), defined as: $DC = \{VR_1, VR_2, VR_3, \ldots, VR_n\}$
2.4. Term weighting phase
In the term weighting phase, the proposed mathematical formula is based on TF-IDF, extracted from the user comments and from the YouTube metadata (specifically, the video description only), as shown in (1):

$W(w, C) = TF(w)_C \cdot \log \frac{N}{CF(w)}$ (1)

where $TF(w)_C$ denotes the number of occurrences of word $w$ in comment $C$, $CF(w)$ denotes the number of comments containing word $w$, and $N$ is the total number of comments in the dataset.
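Equation (1) can be transcribed almost directly into Python; the toy comments below are illustrative stand-ins, not AMCD data:

```python
import math
from collections import Counter

def tf_idf(word, comment, all_comments):
    """W(w, C) from (1): term frequency of `word` in `comment` times
    log(N / CF(w)), where CF(w) counts comments containing `word`."""
    tf = Counter(comment.split())[word]
    cf = sum(1 for c in all_comments if word in c.split())
    n = len(all_comments)
    return tf * math.log(n / cf) if cf else 0.0

comments = ["goal match team", "vaccine health doctor", "goal goal striker"]
print(tf_idf("goal", comments[2], comments))  # 2 * log(3/2)
```

In the implemented model this weighting is handled by sklearn's TfidfTransformer, whose default smoothed IDF variant differs slightly from the bare formula in (1).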
2.5. Classification methods phase
In this phase, two types of classification methods are used: classical machine learning classifiers (MLCs) and deep learning models. The classical classifiers are k-nearest neighbours (KNN), naive Bayes (NB), decision trees (DT), random forest (RF), support vector machine (SVM) and logistic regression.
2.5.1. Naive Bayes (NB)
The naive Bayes (NB) classifier is known as a parametric classifier, being based on a set of parameters. It is a simple probabilistic classifier built on the Bayes theorem from statistics, and it solves the classification problem by assigning data points to class labels under an independence assumption. The following formula is used in the proposed model:

$P(B \mid A) = \frac{P(A \mid B) \cdot P(B)}{P(A)}$ (2)

where $B$ is a class from the set {Education, Health, Sport, Politics}, $A$ is a word or comment drawn from the user comments and video descriptions, $P(B \mid A)$ is the probability that the word or comment $A$ belongs to class $B$, and $P(A \mid B)$ is the probability of observing the word or comment $A$ within class $B$.
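A minimal sketch of an NB classifier over TF-IDF features using sklearn, which the paper already relies on; the English toy comments and the label scheme (1=health, 2=education, 3=politics, 4=sport) stand in for the Arabic dataset:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

X = ["doctor vaccine clinic", "school exam teacher",
     "election parliament vote", "goal match striker"]
y = [1, 2, 3, 4]  # class labels as assigned during annotation

# MultinomialNB applies the Bayes rule in (2) with an independence
# assumption over the TF-IDF features.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(X, y)
print(model.predict(["vote in the election"]))  # -> [3] (politics)
```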
2.5.2. K-nearest neighbours (KNN)
KNN is a non-parametric classification algorithm. It classifies data points based on the distance between them, using a distance function such as Euclidean distance, Manhattan distance, cosine similarity, chi-square or correlation. KNN assigns a data point to the category of its closest neighbours, so the closer the distance, the more likely the point is assigned to the same category. The parameter k represents the number of nearest neighbours considered; if k is odd, majority voting decides the group to which the data point is assigned.
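A small illustration of this voting behaviour with an odd k, using sklearn's KNeighborsClassifier and its default Euclidean distance (the data points are made up for the example):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Two well-separated clusters in a 2-D feature space.
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y = [0, 0, 0, 1, 1, 1]

# k=3 (odd), so the majority vote among the three closest points decides.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)
print(knn.predict([[1, 1], [5, 4]]))  # -> [0 1]
```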
2.5.3. Decision tree (DT)
The decision tree is a non-parametric machine learning classifier. It uses a tree structure consisting of a root, children/internal nodes and leaf nodes. The dataset is split based on thresholds and conditions, starting from the tree root and proceeding until the leaves are reached. An internal node represents a test on a feature, while a leaf represents a decision or class label, as shown in Figure 2. In the decision tree classifier, the root must be the extracted feature in DC with the highest information gain. This gain is calculated using (4), which is based on the entropy computed in (3), where $P_i$ is the probability that a feature in the data collection ($C_{all}$) belongs to class $i$:

$Info(C_{all}) = -\sum_{i=1}^{m} P_i \log_2(P_i)$ (3)

$Information\ Gain(feature) = Info(C_{all}) - Info_{feature}(C_{all})$ (4)
Figure 2. Decision tree (DT)
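Equations (3) and (4) can be checked with a few lines of Python; the split in the example is hypothetical:

```python
import math

def entropy(class_counts):
    """Info(Call) from (3): -sum of P_i * log2(P_i) over the classes."""
    total = sum(class_counts)
    return -sum(c / total * math.log2(c / total) for c in class_counts if c)

def information_gain(parent_counts, branches):
    """(4): parent entropy minus the weighted entropy of the branches
    produced by splitting on a feature."""
    total = sum(parent_counts)
    after = sum(sum(b) / total * entropy(b) for b in branches)
    return entropy(parent_counts) - after

# A balanced four-class collection has close to maximal entropy (2 bits) ...
print(entropy([2001, 2021, 2017, 2007]))
# ... and a feature that splits two classes perfectly gains 1 bit:
print(information_gain([10, 10], [[10, 0], [0, 10]]))  # -> 1.0
```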
2.5.4. Random forest (RF)
A random forest (RF) contains several DTs, as shown in Figure 3. The dataset is divided randomly among the DTs, and samples may be duplicated across DTs. The final result of the model is based on the majority vote over the outcomes of the DT models. In addition, a larger number of DTs increases the model's accuracy.
Figure 3. Random forest (RF)
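The majority-vote behaviour can be sketched with sklearn; the synthetic four-class data below are a stand-in for the TF-IDF matrix:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=400, n_features=20, n_informative=8,
                           n_classes=4, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# The forest aggregates its 100 trees; inspecting the individual trees
# shows the vote for a single sample directly.
votes = [int(tree.predict(X[:1])[0]) for tree in forest.estimators_]
majority = max(set(votes), key=votes.count)
print(forest.predict(X[:1])[0], majority)
```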
2.5.5. Support vector machine (SVM)
SVM is mainly used for binary classification problems. It divides the data points in a multidimensional space into two classes based on the support vectors, which are the points closest to the hyperplane. For multi-class classification, as in this case, SVM breaks the problem down into binary classification problems using one of two main approaches: one-versus-one or one-versus-rest.
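Both decompositions are available in sklearn; the clusters below are made up to show that either strategy recovers the multi-class decision:

```python
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X = [[0, 0], [0, 1], [2, 2], [2, 3], [5, 0], [5, 1]]
y = [0, 0, 1, 1, 2, 2]

# Explicit one-vs-one and one-vs-rest wrappers around a linear SVM.
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)
ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)
print(ovo.predict([[2, 2.5]]), ovr.predict([[2, 2.5]]))  # both -> [1]
```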
2.5.6. Deep learning model
For deep learning, this study used an ANN model known as the multilayer perceptron (MLP) [28] to classify the proposed data and examine model performance. Generally, the model consists of three layers: input, hidden and output layers. In this study, the input layer receives the maximum or average TF-IDF weights. The output layer consists of four neurons for the four classes (education, health, politics and sport). The ANN architecture is shown in Figure 4.
Figure 4. Artificial neural network architecture
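A sketch of such an MLP using sklearn's MLPClassifier; the hidden-layer size is an assumption (not stated here by the paper), and the synthetic features stand in for the TF-IDF input:

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Four output classes, matching education/health/politics/sport.
X, y = make_classification(n_samples=300, n_features=50, n_informative=10,
                           n_classes=4, random_state=1)
mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=1)
mlp.fit(X, y)
print(mlp.score(X, y))  # training accuracy
```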
2.6. Model evaluation
To evaluate model performance, the most popular methods are used: the confusion matrix, shown in Figure 5, and cross-validation. The confusion matrix is used to evaluate the model's accuracy; from it, the precision, recall, F-score and accuracy are computed using (5)-(8), where TP is true positive, TN is true negative, FP is false positive and FN is false negative:

$Precision = \frac{TP}{TP + FP}$ (5)

$Recall = \frac{TP}{TP + FN}$ (6)

$F1\text{-}score = \frac{2 \times (Precision \times Recall)}{Precision + Recall}$ (7)

$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$ (8)
Figure 5. Confusion matrix
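Formulas (5)-(8) correspond to sklearn's standard metrics; the toy predictions below are hypothetical:

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

y_true = [1, 1, 2, 2, 3, 3, 4, 4]  # 1=health, 2=education, 3=politics, 4=sport
y_pred = [1, 2, 2, 2, 3, 3, 4, 1]  # hypothetical model output

print(confusion_matrix(y_true, y_pred))
print(accuracy_score(y_true, y_pred))                    # (8) -> 0.75
print(precision_score(y_true, y_pred, average="macro"))  # (5), macro-averaged
print(recall_score(y_true, y_pred, average="macro"))     # (6)
print(f1_score(y_true, y_pred, average="macro"))         # (7)
```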
In addition, cross-validation was performed on the proposed dataset to examine the model on the introduced data. The process was carried out with 3, 5 and 10 folds; in all experiments, the results show that the difference between the validation and training scores is only 1-3%, which indicates no overfitting or underfitting issues.
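The 3-, 5- and 10-fold check can be reproduced with cross_validate; logistic regression and the synthetic data are stand-ins for the actual classifiers and TF-IDF matrix:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

X, y = make_classification(n_samples=400, n_features=30, n_informative=10,
                           n_classes=4, random_state=0)
for k in (3, 5, 10):
    scores = cross_validate(LogisticRegression(max_iter=1000), X, y,
                            cv=k, return_train_score=True)
    gap = scores["train_score"].mean() - scores["test_score"].mean()
    print(f"{k}-fold train-validation gap: {gap:.3f}")  # small gap -> no overfitting
```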
3. MATHEMATICAL FORMULA AND PROPOSED ALGORITHMS
This section explains the mathematical equations and the proposed algorithms used in WVC based on the introduced dataset. The dataset is a collection of unstructured data consisting of video descriptions and user comments. The comment collection is denoted by $C_{all}$, and $A_{all}$ represents the collection of video descriptions to be classified into one of the four classes. Both $C_{all}$ and $A_{all}$ are used to extract the feature representations, which can be expressed in the following terms.
- Definition 1. There exists a user comment ($UC$) in the collection of user comments ($C_{all}$), where $C_{all}$ represents all comments, $C_{all} = UC_1, UC_2, UC_3, \ldots, UC_n$, and $n$ is the total number of comments: $\exists\, UC \in C_{all}$
- Definition 2. There exists a video description ($VD$) in the collection $VD_{all}$, which consists of one or several video descriptions, $VD_{all} = VD_1, VD_2, VD_3, \ldots, VD_m$, where $m$ is the total number of videos: $\exists\, VD \in VD_{all}$
- Definition 3. There exists a set of features ($FC_{all}$) extracted from $UC$ in $C_{all}$, represented as $FC_{all} = FC_1, FC_2, FC_3, \ldots, FC_r$, where $FC$ is a word/term and $r$ is the number of features: $FC_{all} = \{\exists\, FC \in FC_{all} \wedge FC_{all} \in C \wedge C \in C_{all}\}$
- Definition 4. There exists a set of features ($FA_{all}$) extracted from $VD$ in $VD_{all}$, represented as $FA_{all} = FA_1, FA_2, FA_3, \ldots, FA_s$, where $FA$ is a word/term and $s$ is the number of features: $FA_{all} = \{\exists\, FA \in FA_{all} \mid FA_{all} \in VD \wedge VD \in VD_{all}\}$
- Definition 5. The maximum TF-IDF value of the features extracted from the collection of comments $C_{all}$ is denoted by $MaxFC$, as defined in (9): $MaxFC = \{\forall\, FC \in FC_{all} \mid FC_{all}(\max(TF\text{-}IDF))\}$ (9)
- Definition 6. The maximum TF-IDF value of the extracted features ($FC_{all}$) is assigned to $FA_{all}$ if it is greater than $TF\text{-}IDF(FA_{all})$, denoted by $FA_{all\_TF\text{-}IDF}$, as defined in (10): $FA_{all\_TF\text{-}IDF} = \max(TF\text{-}IDF(FA_{all}) \vee MaxFC)$ (10)
- Definition 7. The average TF-IDF value of the extracted features ($FC_{all}$) is assigned to $FA_{all}$ if it is greater than $TF\text{-}IDF(FA_{all})$, denoted by $FA_{all\_TF\text{-}IDF}$, as defined in (11): $FA_{all\_TF\text{-}IDF} = \mathrm{Average}(TF\text{-}IDF(FA_{all}) \vee MaxFC)$ (11)
- Definition 8. The matrix representation combines the comment and description weights obtained above, as defined in (12): $Matrix\ Representation = FC_{all\_TF\text{-}IDF} \cup FA_{all\_TF\text{-}IDF}$ (12)
Based on the aforementioned equations, there are two scenarios: the first applies algorithm 1, which considers the average TF-IDF of the user comments relative to the extracted features (terms) of the video description, while the second applies algorithm 2, which considers the maximum TF-IDF of the user comments relative to the extracted features (terms) of the video description.
4. EXPERIMENTS, RESULTS AND DISCUSSION
This section presents the experimental results of the proposed models and algorithms using classical MLCs and DL models. The dataset description is included in the subsequent section.
4.1. Dataset
The dataset used in the experiments is textual data collected from YouTube videos, consisting of metadata and user comments. It comprises four classes, namely health, education, politics and sport; the total number of comments after the pre-processing steps is 8,046. The maximum comment length is 1,235, found in the politics class, while the minimum length is one in three of the four classes. A detailed description of the dataset is given in Table 1.
Algorithm 1: Matrix representation for TF-IDF for user comments and average weighted feature of video description

INPUT:
  User comments (UC)
  Video descriptions (VD)
  Call: the collection of comments
  Aall: the collection of video descriptions
OUTPUT: Matrix representation (TF-IDF(Call and Aall))
BEGIN
  FOR each video DO
    Call = Call + UC;
    Aall = Aall + VD;
  END FOR
  FCall_TF-IDF = TF-IDF(Call);
  FAall_TF-IDF = TF-IDF(Aall);
  FOR each feature DO
    IF Average(FCall_TF-IDF) > FAall_TF-IDF THEN
      FAall_TF-IDF = Average(FCall_TF-IDF);
    END IF
  END FOR
  Matrix_Representation = FCall_TF-IDF + FAall_TF-IDF;
END
Algorithm 2: Matrix representation for TF-IDF of user comments and maximum weighted features of video description
INPUT:
  User Comments (UC)
  Video Description (VD)
  Call denotes the collection of comments
  Aall denotes the collection of video-description terms
OUTPUT: Matrix Representation (TF-IDF(Call and Aall))
BEGIN
  FOR each comment in UC DO
    add the comment to Call;
  END FOR
  FOR each description in VD DO
    add the description to Aall;
  END FOR
  FCall_TF-IDF = TF-IDF(Call);
  FAall_TF-IDF = TF-IDF(Aall);
  FOR each feature weight w in FAall_TF-IDF DO
    IF Max(FCall_TF-IDF) > w THEN
      w = Max(FCall_TF-IDF);
    END IF
  END FOR
  Matrix_Representation = FCall_TF-IDF ∪ FAall_TF-IDF;
END
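The two algorithms differ only in the statistic of the comment TF-IDF scores, average versus maximum, that replaces a description feature's weight when it exceeds that weight. A minimal Python sketch of this weighting step (the function name and the illustrative scores are ours, not from the paper):

```python
def weight_description_features(fc_tfidf, fa_tfidf, mode="average"):
    """Algorithm 1 (mode='average') or algorithm 2 (mode='max'): if the chosen
    statistic of the user-comment TF-IDF scores exceeds a video-description
    feature's weight, the statistic replaces that weight; the result is the
    union of the comment and weighted description features, as in (12)."""
    scores = list(fc_tfidf.values())
    stat = sum(scores) / len(scores) if mode == "average" else max(scores)
    fa_weighted = {t: max(stat, w) for t, w in fa_tfidf.items()}
    return {**fa_weighted, **fc_tfidf}

# Illustrative TF-IDF scores for a sports video
fc = {"goal": 0.6, "match": 0.2}   # user-comment features
fa = {"league": 0.3, "cup": 0.5}   # video-description features
avg_repr = weight_description_features(fc, fa, "average")  # average = 0.4
max_repr = weight_description_features(fc, fa, "max")      # maximum = 0.6
```

Under algorithm 1, only "league" (0.3) is lifted to the average 0.4; under algorithm 2, both description weights are lifted to the maximum 0.6.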
Table 1. Dataset description
Item   Class      Class Val.  Number of comments  Percentage  Max. length  Min. length
1      Education  1           2,001               24.8%       99           1
2      Health     2           2,021               25.2%       970          2
3      Politics   3           2,017               25.1%       1,235        1
4      Sport      4           2,007               24.9%       119          1
Total                         8,046               100%
4.2. MLCs experiments
The experiments based on classical machine learning classifiers were carried out using the most common classifiers: KNN, SVM, NB, LR, DT, SGD, and RF. They were performed using Python 3.6 with the aforementioned pre-processing steps. The model performance with and without the proposed algorithms was measured using the confusion matrix in terms of precision, recall, F-score, and accuracy.
Four types of experiments were carried out: an experiment based on the MLCs alone; MLCs with algorithms 1 and 2 using 30 features; MLCs with algorithms 1 and 2 using 40 features; and MLCs with algorithms 1 and 2 using 50 features. The first experiment was performed using N-grams in the form of bigrams and trigrams, without the proposed algorithms, on the proposed dataset; only the user comments were utilized. This aimed to examine the model performance before applying the proposed algorithms; the results are shown in Table 2, which gives a comparative analysis of model performance across the MLCs. The model accuracy using SGD and LR reached 87% and 88%, respectively, with bigrams. However, no improvement was recorded with trigrams for any MLC, even though the experiments were repeated several times. Consequently, all subsequent experiments were carried out using bigrams only.
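The reported figures are derived from the confusion matrix. As a sketch of how per-class precision, recall and F-score and the overall accuracy are obtained for a four-class problem (the confusion matrix below is invented for illustration):

```python
def per_class_metrics(cm):
    """Per-class precision/recall/F-score and overall accuracy from a square
    confusion matrix cm, where cm[true][predicted] counts test samples."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    metrics = {}
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, but wrong
        fn = sum(cm[c]) - tp                        # true c, but missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_score = (2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
        metrics[c + 1] = (precision, recall, f_score)  # classes numbered 1..4
    return metrics, accuracy

# Invented confusion matrix for four classes (education, health, politics, sport)
cm = [[8, 1, 1, 0],
      [0, 9, 1, 0],
      [1, 0, 9, 0],
      [0, 1, 0, 9]]
metrics, accuracy = per_class_metrics(cm)
```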
Table 2. Results of MLCs without algorithms 1 and 2

Bigram
MLC   Class  Precision  Recall  F-score  Accuracy
KNN   1      74%        61%     66%      62%
      2      55%        87%     67%
      3      47%        71%     56%
      4      73%        49%     58%
SVM   1      85%        88%     86%      87%
      2      90%        93%     91%
      3      87%        90%     89%
      4      86%        78%     82%
NB    1      75%        93%     83%      83%
      2      87%        87%     87%
      3      80%        89%     84%
      4      90%        67%     77%
LR    1      87%        88%     87%      88%
      2      91%        92%     92%
      3      88%        91%     89%
      4      85%        81%     83%
DT    1      38%        97%     54%      56%
      2      54%        96%     70%
      3      36%        94%     52%
      4      99%        36%     53%
SGD   1      87%        87%     87%      87%
      2      91%        93%     92%
      3      88%        89%     89%
      4      86%        82%     84%
RF    1      84%        87%     85%      86%
      2      89%        91%     90%
      3      86%        88%     87%
      4      85%        78%     81%

Trigram
MLC   Class  Precision  Recall  F-score  Accuracy
KNN   1      73%        63%     67%      63%
      2      55%        88%     68%
      3      44%        85%     58%
      4      80%        47%     59%
SVM   1      85%        87%     86%      87%
      2      90%        93%     91%
      3      87%        91%     89%
      4      86%        78%     82%
NB    1      73%        94%     82%      83%
      2      87%        87%     87%
      3      80%        89%     84%
      4      90%        67%     77%
LR    1      87%        88%     87%      88%
      2      91%        92%     92%
      3      88%        91%     90%
      4      86%        81%     83%
DT    1      38%        97%     54%      56%
      2      54%        96%     69%
      3      36%        94%     52%
      4      99%        36%     53%
SGD   1      88%        86%     87%      88%
      2      91%        93%     92%
      3      88%        91%     89%
      4      85%        82%     84%
RF    1      85%        86%     85%      86%
      2      87%        91%     89%
      3      86%        88%     87%
      4      85%        79%     82%
In the second experiment, 30 features were extracted from the video description, and the MLCs were applied with algorithm 1 and algorithm 2. The outcome of this experiment was compared with the results of the first experiment in order to measure the improvement in model performance obtained with the proposed algorithms. Algorithm 1, which uses the average TF-IDF, outperformed the model of the first experiment in terms of accuracy, as shown in Figure 6. Additionally, the results show that algorithm 2 recorded a significant improvement over algorithm 1, as shown in Figure 7.
In the third experiment, the number of features extracted for the MLCs with algorithms 1 and 2 was increased to 40. The model recorded higher accuracy than with 30 features, as shown in Figure 8. The highest accuracy, reached using RF, was 97%, whereas KNN recorded the worst. Thus, the model using algorithm 2 outperformed algorithm 1.
The fourth experiment examined the performance of the model based on 50 features extracted from the video description for the MLCs with algorithms 1 and 2. All the MLCs achieved their highest accuracy compared to all the aforementioned experiments; RF, NB and SGD reached 99% accuracy, as shown in Figure 9. Overall, based on the results of the four experiments using the MLCs with the proposed algorithms, the highest accuracy was attained using TF-IDF with algorithm 2 and 50 features. The experiments were repeated several times with more features; however, the accuracy did not improve.
Figure 6. Model accuracy of the MLCs without the algorithms versus with algorithm 1 (average TF-IDF, 30 features). Recovered chart data (KNN, SVM, NB, LR, DT, SGD, RF):
Without: 62%, 87%, 83%, 88%, 56%, 88%, 86%
With average TF-IDF (30): 79%, 87%, 92%, 88%, 86%, 88%, 97%

Figure 7. Accuracy of MLCs (max. and average TF-IDF of 30 features). Recovered chart data (KNN, SVM, NB, LR, DT, SGD, RF):
Max. TF-IDF (30): 91%, 96%, 93%, 96%, 86%, 96%, 97%
Average TF-IDF (30): 79%, 87%, 92%, 88%, 86%, 88%, 97%

Figure 8. Accuracy of MLCs (max. and average TF-IDF of 40 features). Recovered chart data (KNN, SVM, NB, LR, DT, SGD, RF):
Max. TF-IDF (40): 93%, 97%, 95%, 97%, 88%, 97%, 98%
Average TF-IDF (40): 83%, 87%, 94%, 88%, 88%, 88%, 97%

Figure 9. Accuracy of MLCs (max. TF-IDF of 30, 40 and 50 features). Recovered chart data (KNN, SVM, NB, LR, DT, SGD, RF):
Max. TF-IDF (30): 91%, 96%, 93%, 96%, 86%, 96%, 97%
Max. TF-IDF (40): 93%, 97%, 95%, 97%, 88%, 97%, 98%
Max. TF-IDF (50): 95%, 99%, 97%, 99%, 93%, 99%, 99%
4.3. Deep learning models
This section describes the second type of experiment, conducted using deep neural network models based on the multilayer perceptron (MLP) on the proposed dataset. The aim is to evaluate the proposed model and algorithms using deep learning models and to compare the results and model performance with the MLCs. Two model architectures, ANN and LSTM, were applied with the two proposed algorithms.
In the ANN experiment, the model is built from four layers using Keras. The first (input) layer uses an input dimension of 1,000 and an output of 128 neurons with the 'relu' activation function. The second and third layers are hidden; their output shapes consist of 64 and 32 neurons with dropout (0.5), each using 'relu' as the activation function. The output layer consists of four neurons using 'softmax' as the activation function. The optimizer is 'Adam' with a learning rate of 0.001, the loss function is 'sparse_categorical_crossentropy', and accuracy is the training metric. The model is trained with 70% of the dataset for 20 epochs with a batch size of 64. Both proposed algorithms were applied; the training and validation loss decreased as shown in Figures 10 and 11 for algorithm 1 and Figures 12 and 13 for algorithm 2. The model achieved high accuracy in both training and validation. In the test phase, which uses 30% of the dataset, the model achieved approximately 93% and 99% testing accuracy using algorithm 1 and algorithm 2, respectively. In addition, the experiments were repeated with the same configurations and different learning rates, as shown in Table 3.
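The described four-layer architecture can be sketched as a plain forward pass. The sketch below uses NumPy with random, untrained weights purely to show the layer shapes and activations stated in the text (dropout, which acts only during training, is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0)

def softmax(x):
    # Numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Layer sizes as described: 1000-dim input, hidden layers of 128, 64 and
# 32 'relu' neurons, and a 4-neuron 'softmax' output layer.
sizes = [1000, 128, 64, 32, 4]
params = [(rng.standard_normal((m, n)) * 0.01, np.zeros(n))
          for m, n in zip(sizes, sizes[1:])]

def forward(x):
    for i, (w, b) in enumerate(params):
        z = x @ w + b
        x = softmax(z) if i == len(params) - 1 else relu(z)
    return x

probs = forward(rng.standard_normal((2, 1000)))  # two dummy samples
```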
Figure 10. Training and validation loss-algorithm 1
Figure 11. Training and validation accuracy-algorithm 1
Figure 12. Training and validation loss-algorithm 2
Figure 13. Training and validation accuracy-algorithm 2
Table 3. Experiment results of the Adam optimizer with different learning rates
Learning rate  Algorithm  Loss    Training acc.  Validation acc.  Testing acc.
0.01           1          0.0430  0.98           0.9172           0.9171
0.01           2          0.0100  0.9972         0.9962           0.9962
0.001          1          0.0351  0.9878         0.9307           0.9307
0.001          2          0.0065  0.9981         0.9974           0.9973
0.0001         1          0.4638  0.8474         0.8584           0.8584
0.0001         2          0.0812  0.9798         0.9962           0.9962
In the LSTM experiment, the same dataset is used with a different model architecture, consisting of a stack of three LSTM layers. In the first layer, the output shape includes 128 LSTM units, with an input of 5,392 user comments and 500 comment features plus 50 video-description features. The second layer contains the same hyperparameters with 64 LSTM units, and the third has 32 LSTM units. The output layer has four units with the 'softmax' activation function. The batch size is 64 and the number of training iterations is 20 epochs. The optimizer is 'Adam' with a learning rate of 0.001, the loss function is 'sparse_categorical_crossentropy', and accuracy is the training metric. The training and validation loss decreased, improving steadily over the 20 epochs, as shown in Figures 14 and 15 for algorithm 1 and Figures 16 and 17 for algorithm 2.
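The size of such a stack can be checked with the standard LSTM parameter-count formula (four gates, each with an input kernel, a recurrent kernel and a bias). The 550-dimensional input per step, 500 comment features plus 50 description features, is our reading of the stated input shape, not a figure from the paper:

```python
def lstm_params(input_dim, units):
    """Trainable parameters of one LSTM layer: four gates, each with a kernel
    over the input, a recurrent kernel over the hidden state, and a bias."""
    return 4 * ((input_dim + units) * units + units)

# The stack as described: three LSTM layers of 128, 64 and 32 units,
# followed by a 4-unit 'softmax' output layer.
layer_dims = [(550, 128), (128, 64), (64, 32)]
lstm_counts = [lstm_params(i, u) for i, u in layer_dims]
dense_count = 32 * 4 + 4  # output layer weights + biases
```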
Figure 14. Training and validation loss (algorithm 1)
Figure 15. Training and validation accuracy (algorithm 1)
Figure 16. Training and validation loss (algorithm 2)
Figure 17. Training and validation accuracy (algorithm 2)
4.4. Results discussion
The majority of related studies focus on WVC based on video and audio methodologies. Textual information plays a significant role in achieving a high accuracy rate in enhancing WVC, because many people use social media as a platform to express their opinions by commenting on videos. In addition, most datasets used as benchmarks, such as MCG-WEBV and YouTube-8M, are based on the English language and contain images, video, audio, and metadata. This study focuses on enhancing multi-class WVC based on combined textual information, namely user comments and video descriptions, by proposing two algorithms that weight the extracted video-description features with the average or maximum TF-IDF of the user comments. In order to evaluate the model performance, we conducted experiments based on machine learning and deep learning methods. The experimental results using the proposed algorithms 1 and 2 outperform existing methods; this is found when the model's accuracy is compared with that of the most closely related classification approaches, which are mainly based on textual information. The results were also compared with approaches from the literature. In [29], the accuracy reached 80% through a combination of three approaches that utilize video and audio. In [30], model performance reached 64% for the visual information extracted from frames using a VSM classifier. The work in [31] also highlighted the importance of using sentiment analysis in retrieving data, with the model achieving 75.43% accuracy.
In our experiments, we use a multi-class WVC with four classes (M=4), namely sports, politics, health and education, and the introduced dataset is based on the Arabic language. For the machine learning classifiers, the model achieves the highest accuracies of 97%, 93% and 96% when algorithm 2 is applied to 30 features for the RF, NB and stochastic gradient descent (SGD) classifiers, respectively. With 40 features, the accuracies reach 98%, 95% and 97% for the RF, NB and SGD classifiers. KNN achieves the worst accuracies of 79% and 83% when algorithm 1 is applied to 30 and 40 features. In the machine learning experiments, the RF, NB and SGD classifiers always outperform the other classifiers. On the other hand, in the deep learning experiments, ANN and LSTM are applied. Both models produce the highest accuracy of up to 99% with a small number of layers, neurons and iterations (epochs). The loss function also decreases within a few iterations at a learning rate of 0.001, which performs better than 0.01 and 0.0001. Generally, these results reflect the importance of utilizing user comments and video descriptions as informative features for enhancing WVC. Specifically, the proposed algorithms assign the average or maximum TF-IDF weights of the user comments to the features extracted from the video descriptions. In addition, 30, 40 and 50 extracted video-description features are used in this study, with the 50-feature setting achieving the best performance.
5. CONCLUSION
This paper focuses on enhancing video categorization based on user comments and video descriptions. There are various WVC methods: visual-based, audio-based, and textual-based. Hybrid methods, such as combined video- and audio-based approaches, receive more attention in scholarly articles. The proposed model utilizes the user comments and YouTube video metadata, specifically the video description, to enhance WVC. Four experiments are carried out using MLCs and DL models with the proposed dataset and two algorithms. The TF-IDF features extracted from the video description are used in three settings: 30, 40, and 50 features. The results of these experiments show that the hybrid use of user comments and video descriptions outperforms standard methods that focus purely on comments. The third setting, with 50 extracted features, recorded the highest model performance in terms of accuracy.
REFERENCES
[1] W. M. S. Yafooz and A. Alsaeedi, “Sentimental analysis on health-related information with improving model performance using
machine learning,” Journal of Computer Science, vol. 17, no. 2, pp. 112–122, Feb. 2021, doi: 10.3844/jcssp.2021.112.122.
[2] R. F. Alhujaili and W. M. S. Yafooz, “Sentiment analysis for YouTube videos with user comments: review,” in 2021
International Conference on Artificial Intelligence and Smart Systems (ICAIS), Mar. 2021, pp. 814–820, doi:
10.1109/ICAIS50930.2021.9396049.
[3] H. Eksi Ozsoy, “Evaluation of YouTube videos about smile design using the DISCERN tool and Journal of the American Medical
Association benchmarks,” The Journal of Prosthetic Dentistry, vol. 125, no. 1, pp. 151–154, Jan. 2021, doi:
10.1016/j.prosdent.2019.12.016.
[4] N. N. Moon et al., “Natural language processing based advanced method of unnecessary video detection,” International Journal
of Electrical and Computer Engineering (IJECE), vol. 11, no. 6, pp. 5411–5419, Dec. 2021, doi: 10.11591/ijece.v11i6.pp5411-
5419.
[5] Z. A. A. Ibrahim, S. Haidar, and I. Sbeity, “Large-scale text-based video classification using contextual features,” European
Journal of Electrical Engineering and Computer Science, vol. 3, no. 2, Apr. 2019, doi: 10.24018/ejece.2019.3.2.68.
[6] J. Jacob, M. S. Elayidom, and V. P. Devassia, “Video content analysis and retrieval system using video storytelling and indexing
techniques,” International Journal of Electrical and Computer Engineering (IJECE), vol. 10, no. 6, pp. 6019–6025, Dec. 2020,
doi: 10.11591/ijece.v10i6.pp6019-6025.
[7] Y.-L. Chen, C.-L. Chang, and C.-S. Yeh, “Emotion classification of YouTube videos,” Decision Support Systems, vol. 101,
pp. 40–50, Sep. 2017, doi: 10.1016/j.dss.2017.05.014.
[8] P. D. P. Woogue, G. A. A. Pineda, and C. V Maderazo, “Automatic web page categorization using machine learning and
educational-based corpus,” International Journal of Computer Theory and Engineering, vol. 9, no. 6, pp. 427–432, 2017, doi:
10.7763/IJCTE.2017.V9.1180.
[9] M. Tavakolian and A. Hadid, “Deep discriminative model for video classification,” in Computer Vision – ECCV 2018,
Springer International Publishing, 2018, pp. 401–418.
[10] P. Rani, J. Kaur, and S. Kaswan, “Automatic video classification: a review,” EAI Endorsed Transactions on Creative
Technologies, vol. 7, no. 24, Art. no. 163996, Jun. 2020, doi: 10.4108/eai.13-7-2018.163996.
[11] M. Afzal, N. Shah, and T. Muhammad, “Web video classification with visual and contextual semantics,” International Journal of
Communication Systems, vol. 32, no. 13, Art. no. e3994, Sep. 2019, doi: 10.1002/dac.3994.
[12] C. Huang, T. Fu, and H. Chen, “Text-based video content classification for online video-sharing sites,” Journal of the American
Society for Information Science and Technology, vol. 61, no. 5, pp. 891–906, May 2010, doi: 10.1002/asi.21291.
[13] L. Bahatti, O. Bouattane, M. Elhoussine, and M. Hicham, “An efficient audio classification approach based on support vector
machines,” International Journal of Advanced Computer Science and Applications, vol. 7, no. 5, 2016, doi:
10.14569/IJACSA.2016.070530.
[14] Z. Wu, T. Yao, Y. Fu, and Y.-G. Jiang, “Deep learning for video classification and captioning,” in Frontiers of Multimedia
Research, ACM, 2017, pp. 3–29.
[15] Y. Yang, D. Krompass, and V. Tresp, “Tensor-train recurrent neural networks for video classification,” Proceedings of the 34th
International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017.
[16] S. Hershey et al., “CNN architectures for large-scale audio classification,” in 2017 IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP), Mar. 2017, pp. 131–135, doi: 10.1109/ICASSP.2017.7952132.
[17] G. S. Kalra, R. S. Kathuria, and A. Kumar, “YouTube video classification based on title and description text,” in 2019
International Conference on Computing, Communication, and Intelligent Systems (ICCCIS), Oct. 2019, pp. 74–79, doi:
10.1109/ICCCIS48478.2019.8974514.
[18] P. Q. Nguyen, A.-T. Nguyen-Thi, T. D. Ngo, and T.-A. H. Nguyen, “Using textual semantic similarity to improve clustering
quality of web video search results,” in 2015 Seventh International Conference on Knowledge and Systems Engineering (KSE),
Oct. 2015, pp. 156–161, doi: 10.1109/KSE.2015.47.
[19] D. Saravanan, “Efficient video indexing and retrieval using hierarchical clustering technique,” in Proceedings of the Second
International Conference on Computational Intelligence and Informatics, Springer Singapore, 2018, pp. 1–8.
[20] X. Long, C. Gan, G. de Melo, J. Wu, X. Liu, and S. Wen, “Attention clusters: purely attention based local feature integration for
video classification,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Jun. 2018, pp. 7834–7843,
doi: 10.1109/CVPR.2018.00817.
[21] V. Mekthanavanh, T. Li, J. Hu, and Y. Yang, “Social web video clustering based on multi-modal and clustering ensemble,”
Neurocomputing, vol. 366, pp. 234–247, Nov. 2019, doi: 10.1016/j.neucom.2019.07.097.
[22] J. Wu, S. Zhong, J. Jiang, and Y. Yang, “A novel clustering method for static video summarization,” Multimedia Tools and
Applications, vol. 76, no. 7, pp. 9625–9641, Apr. 2017, doi: 10.1007/s11042-016-3569-x.
[23] D. Brezeale and D. J. Cook, “Automatic video classification: a survey of the literature,” IEEE Transactions on Systems, Man, and
Cybernetics, Part C (Applications and Reviews), vol. 38, no. 3, pp. 416–430, May 2008, doi: 10.1109/TSMCC.2008.919173.
[24] O. J. Ying, M. M. A. Zabidi, N. Ramli, and U. U. Sheikh, “Sentiment analysis of informal Malay tweets with deep learning,” IAES
International Journal of Artificial Intelligence (IJ-AI), vol. 9, no. 2, pp. 212–220, Jun. 2020, doi: 10.11591/ijai.v9.i2.pp212-220.
[25] N. S. A. Abu Bakar, R. Aziehan Rahmat, and U. Faruq Othman, “Polarity classification tool for sentiment analysis in Malay
language,” IAES International Journal of Artificial Intelligence (IJ-AI), vol. 8, no. 3, pp. 259–263, Dec. 2019, doi:
10.11591/ijai.v8.i3.pp259-263.
[26] S. Choi and A. Segev, “Finding informative comments for video viewing,” in 2016 IEEE International Conference on Big Data
(Big Data), Dec. 2016, pp. 2457–2465, doi: 10.1109/BigData.2016.7840882.
[27] “Arabic multi classification dataset-AMCD.” Github. https://github.com/waelyafooz/Arabic-Multi-Classification-Dataset-AMCD
(accessed Aug. 10, 2021).
[28] F. Albu, A. Mateescu, and N. Dumitriu, “Architecture selection for a multilayer feedforward network,” International Conference
on Microelectronics and Computer Science, pp. 131–134, 1997.
[29] H. M. Al Amin, M. S. Arefin, and P. K. Dhar, “A method for video categorization by analyzing text, audio, and frames,”
International Journal of Information Technology, vol. 12, no. 3, pp. 889–898, Sep. 2020, doi: 10.1007/s41870-019-00338-2.
[30] C. Ortega-León, P. A. Marín-Reyes, J. Lorenzo-Navarro, M. Castrillón-Santana, and E. Sánchez-Nielsen, “Video categorisation
mimicking text mining,” in Advances in Computational Intelligence, Springer International Publishing, 2019, pp. 292–301.
[31] H. Bhuiyan, J. Ara, R. Bardhan, and M. R. Islam, “Retrieving YouTube video by sentiment analysis on user comment,” in 2017
IEEE International Conference on Signal and Image Processing Applications (ICSIPA), Sep. 2017, pp. 474–478, doi:
10.1109/ICSIPA.2017.8120658.
BIOGRAPHIES OF AUTHORS
Wael M. S. Yafooz is an associate professor in the Computer Science Department,
Taibah University, Saudi Arabia. He received his bachelor's degree in computer science
from Egypt in 2002, a Master of Science in Computer Science from the University of
MARA Technology (UiTM), Malaysia, in 2010, and a PhD in Computer Science from
UiTM in 2014. He was awarded many gold and silver medals for his contributions to local and
international expo of innovation and invention in the area of computer science. Besides, he was
awarded the Excellent Research Award from UiTM. He served as a member of various
committees in many international conferences. Additionally, he chaired IEEE international
conferences in Malaysia and China. Besides, he is a volunteer reviewer for different
peer-review journals. Moreover, he has supervised a number of students at the master and PhD
levels. Furthermore, he has delivered and conducted many workshops in the research area and practical
courses in data management, visualization and curriculum design in area of computer science.
He was invited as a speaker in many international conferences held in Bangladesh, Thailand,
India, China and Russia. His research interest includes, Data Mining, Machine Learning, Deep
Learning, Natural Language Processing, Social Network Analytics and Data Management. He
can be contacted at email: waelmohammed@hotmail.com/wyafooz@taibahu.edu.sa.
Abdullah Alsaeedi received the B.Sc. degree in computer science from the
College of computer science and engineering, Taibah University, Madinah, Saudi Arabia, in
2008, M.Sc. degree in Advanced software engineering, The University of Sheffield,
department of computer science, Sheffield, UK, in 2011, and the Ph.D. degree in computer
science from the University of Sheffield, UK, in 2016. He is currently an Assistant Professor at
the Computer Science Department, Taibah University, Madinah, Saudi Arabia. His research
interests include software engineering, software model inference, grammar inference, machine
learning. He can be contacted at email: aasaeedi@taibahu.edu.sa.
Reyadh Alluhaibi received the B.E. degree from Taibah University in 2005 and the
M.Sc. degree from Tulsa University in 2009. He worked as a lecturer (from 2009 to 2012)
in the Dept. of Computer Science, Taibah University, and received the PhD degree
from Manchester University in 2017. Since 2017, he has been an assistant professor in
the Dept. of Computer Science, Taibah University. His research interests include Machine
Learning, Natural Language Processing, Computational Linguistics, Computational Semantics,
Knowledge Representation, and Temporal Logics. He can be contacted at email:
rluhaibi@taibahu.edu.sa.
Abdel-Hamid Emara is an associate professor in the Department of Computers
and Systems Engineering, Faculty of Engineering, Al-Azhar University, Cairo, Egypt. He
received his BSc, MSc, and PhD in Computers and Systems Engineering from Al-Azhar
University in 1992, 2000, and 2006, respectively. He has supervised a number of students at the
master and PhD levels. Furthermore, he has delivered and conducted many workshops in the
research area. He served as a member of various committees in many international conferences. He is a
volunteer reviewer with different peer-review journals. He is currently an Assistant Professor
in Computer Science Department at University of Taibah at Al-Madinah Al Monawarah, KSA.
His research interest includes, Data Mining, Machine Learning, Deep Learning, Natural
Language Processing, Social Network Analytics and Data Management. He published several
research papers and participated in several in International journal and local/international
conferences. He can be contacted at email aemara@taibahu.edu.sa.