This document proposes a method to improve the classification of similar-looking crowd images by exploiting semantic keywords relevant to malicious events. The key points are:
1. A new "malicious crowd" dataset is collected containing crowd images labeled as benign or malicious, which look similar but involve opposite events.
2. Amazon Mechanical Turk is used to collect keywords describing images, and keywords most relevant to malicious images like "police", "fire", "smoke" are identified.
3. Models are trained to detect these keywords in images, and their outputs are fused with image classification results using different fusion methods, improving classification accuracy over using images alone.
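The fusion step described above can be sketched as a weighted combination of the image classifier's output with the keyword detectors' outputs. This is a minimal illustration of decision-level fusion only, not the paper's exact fusion methods; the scores, keyword names and the weight `alpha` are invented for the example.

```python
# Decision-level (late) fusion sketch: combine the image classifier's
# malicious-class probability with the mean output of per-keyword detectors.
# All numbers and the weighted-sum rule are illustrative assumptions.

def fuse_scores(image_score, keyword_scores, alpha=0.6):
    """Weighted sum of the image score and the mean keyword-detector score."""
    keyword_mean = sum(keyword_scores) / len(keyword_scores)
    return alpha * image_score + (1 - alpha) * keyword_mean

# The image-only classifier is unsure; keyword detectors fire strongly.
fused = fuse_scores(0.55, {"police": 0.9, "fire": 0.8, "smoke": 0.85}.values())
print(fused > 0.55)  # True: keyword evidence raises the malicious score
```

In a real system the weight would be tuned on validation data, and the paper compares several fusion rules rather than a single weighted sum.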
Using Data Mining Techniques to Analyze Crime Pattern (Zakaria Zubi)
Our proposed model will be able to extract crime patterns by using association rule mining and clustering to classify crime records on the basis of the values of crime attributes.
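The association-rule idea above can be sketched with the standard support and confidence measures over toy crime records. The records, attributes and the example rule are invented for illustration; real work would mine many candidate rules (e.g. with Apriori) rather than score one by hand.

```python
# Toy association-rule scoring over crime records: support and confidence,
# the two measures association rule mining ranks rules by.

records = [
    {"crime": "theft", "area": "downtown", "time": "night"},
    {"crime": "theft", "area": "downtown", "time": "day"},
    {"crime": "assault", "area": "downtown", "time": "night"},
    {"crime": "theft", "area": "suburb", "time": "night"},
]

def support(items):
    """Fraction of records containing all attribute=value pairs in `items`."""
    hits = sum(all(r.get(k) == v for k, v in items.items()) for r in records)
    return hits / len(records)

def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent) over the records."""
    return support({**antecedent, **consequent}) / support(antecedent)

# Rule: area=downtown -> crime=theft
print(round(confidence({"area": "downtown"}, {"crime": "theft"}), 4))
```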
Scene Description From Images To Sentences (IRJET Journal)
This document presents an approach for generating sentences that describe images using distributed intelligence. It detects objects with YOLO, finds the relative positions of objects, labels the background scene, generates (object, scene, relation) tuples, extracts candidate sentences from Wikipedia containing the tuple elements, searches for images matching each sentence, and selects the sentence whose images most closely match the input image. The approach is compared to the Babytalk model using BLEU and ROUGE scores, showing comparable performance. Future work on improving object detection and using larger knowledge sources is discussed.
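The sentence-selection step can be sketched by scoring candidate sentences against the detected tuple. The real system ranks sentences by image similarity; simple word overlap stands in here, and the tuple and candidate sentences are invented.

```python
# Sketch of candidate-sentence selection: score each sentence by how many
# elements of the detected (object, scene, relation) tuple it contains.

def score(sentence, tuple_elems):
    words = set(sentence.lower().split())
    return sum(e in words for e in tuple_elems)

tuple_elems = ("dog", "park", "running")
candidates = [
    "A dog is running in the park",
    "A cat sleeps on the sofa",
]
best = max(candidates, key=lambda s: score(s, tuple_elems))
print(best)  # the sentence covering all three tuple elements
```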
Improve malware classifiers performance using cost-sensitive learning for imb... (IAESIJAI)
In recent times, malware visualization has become popular for malware classification in cybersecurity. Existing malware features can readily identify known, previously detected malware, but they cannot accurately identify new and infrequent malware. Deep learning algorithms have also shown their power for malware classification. However, the data are imbalanced: the Malimg database, which contains 25 malware families, does not have the same or a similar number of images per class. To address these issues, this paper proposes an effective malware classifier based on cost-sensitive deep learning. When classifying imbalanced data, some classes achieve lower accuracy than others. Cost-sensitive learning is meant to solve this issue; however, in this 25-class case, classical cost-sensitive weights were not effective at giving equal attention to all classes. The proposed approach improves malware classification performance, demonstrated with two Convolutional Neural Network models (built with the functional and subclassing programming styles) and evaluated on loss, accuracy, recall and precision.
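A common baseline for the cost-sensitive weighting the abstract discusses is inverse-frequency class weights. The sketch below uses that standard heuristic with invented family counts; Malimg's real per-family counts (and the paper's actual weighting scheme) differ.

```python
# "Balanced" inverse-frequency class weights: w_c = N / (num_classes * n_c),
# so rare classes contribute proportionally more to the loss.

from collections import Counter

def class_weights(labels):
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * n_c) for c, n_c in counts.items()}

labels = ["Allaple.A"] * 90 + ["Skintrim.N"] * 10  # invented 9:1 imbalance
w = class_weights(labels)
print(round(w["Skintrim.N"] / w["Allaple.A"], 6))  # 9.0: rare class weighted 9x heavier
```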
Sketch to photo matching in criminal investigations (Chippy Thomas)
The document describes an enhanced algorithm for matching sketches to photos that could help with criminal investigations. It discusses an existing FaceSketchID system that uses holistic and component-based algorithms to match sketches to mugshot photos. The paper proposes improvements to the system by using an enhanced holistic algorithm that extracts features from the whole face as well as an enhanced component-based algorithm that extracts features from individual facial components (eyes, nose, etc.) and combines the results.
A Comparative Study of Data Mining Methods to Analyzing Libyan National Crime... (Zakaria Zubi)
Our proposed model will be able to extract crime patterns by using association rule mining and clustering to classify crime records on the basis of the values of crime attributes.
COMPUTER VISION PERFORMANCE AND IMAGE QUALITY METRICS: A RECIPROCAL RELATION (csandit)
Computer vision algorithms are essential components of many systems in operation today. Predicting the robustness of such algorithms for different visual distortions is a task which can be approached with known image quality measures. We evaluate the impact of several image distortions on object segmentation, tracking and detection, and analyze the predictability of this impact given by image statistics, error parameters and image quality metrics. We observe that existing image quality metrics have shortcomings when predicting the visual quality of virtual or augmented reality scenarios. These shortcomings can be overcome by integrating computer vision approaches into image quality metrics. We thus show that image quality metrics can be used to predict the success of computer vision approaches, and computer vision can be employed to enhance the prediction capability of image quality metrics – a reciprocal relation.
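One of the classical image quality metrics such a study would compute is PSNR. The sketch below evaluates it on tiny 8-bit "images" kept as nested lists so the example is dependency-free; the pixel values are invented.

```python
# Peak signal-to-noise ratio (PSNR) in dB between two equal-sized 8-bit images.

import math

def psnr(img_a, img_b, peak=255.0):
    flat_a = [p for row in img_a for p in row]
    flat_b = [p for row in img_b for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)

clean = [[100, 100], [100, 100]]
noisy = [[101, 99], [102, 98]]
print(round(psnr(clean, noisy), 2))  # 44.15 dB for this mild distortion
```

Correlating such scores with, say, detection accuracy under the same distortions is the kind of analysis the abstract describes.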
A study of cyberbullying detection using Deep Learning and Machine Learning T... (IRJET Journal)
This document discusses a study on detecting cyberbullying using machine learning and deep learning techniques. Specifically, it examines using a hybrid model combining K-Nearest Neighbors, Support Vector Machine, and Random Forest algorithms, as well as a Convolutional Neural Network. The study uses a Twitter dataset to classify tweets as not bullying, racism, or sexism. It finds that the CNN model produces more accurate predictions than the hybrid stacking algorithm. The document provides background on related work applying machine and deep learning to cyberbullying detection, particularly using content-based and user-based approaches.
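The stacking idea in the hybrid model can be sketched as base models whose predictions feed a meta-model. The "models" below are trivial threshold rules standing in for KNN/SVM/Random Forest, and the tweet features are invented; this illustrates the structure only, not the study's trained classifiers.

```python
# Stacking sketch: base models' predictions form the input to a meta-model.

def base_predictions(x, models):
    return [model(x) for model in models]

def meta_model(preds):
    # Stand-in meta-classifier: flag as bullying if most base models agree.
    return int(sum(preds) >= 2)

models = [
    lambda x: int(x["insult_words"] > 0),  # stand-in for KNN
    lambda x: int(x["caps_ratio"] > 0.5),  # stand-in for SVM
    lambda x: int(x["mentions"] > 3),      # stand-in for Random Forest
]

tweet = {"insult_words": 2, "caps_ratio": 0.7, "mentions": 1}
print(meta_model(base_predictions(tweet, models)))  # 1 = flagged as bullying
```

In the actual study the meta-level is learned from data, and the comparison is against a CNN trained on the same Twitter dataset.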
A study of cyberbullying detection using Deep Learning and Machine Learning T... (IRJET Journal)
This document summarizes a research paper on detecting cyberbullying with machine learning and deep learning techniques. Specifically, it used a hybrid model combining the KNN, SVM and Random Forest algorithms (a stacking algorithm) and a Convolutional Neural Network (CNN) on a Twitter dataset. The stacking algorithm achieved an accuracy of X% while the CNN achieved a higher accuracy of Y%. A comparison of the two models found that the CNN produced more precise predictions of cyberbullying. The document also reviews related work on cyberbullying detection using content-based, user-based and network-based approaches, with machine learning algorithms such as SVM and Naive Bayes and deep learning methods such as CNNs.
This document presents a novel algorithm for grouping 3D object models based on their appearance in a hierarchical structure, similar to human perception. The algorithm uses clustering to divide objects into subclasses and then further divides each subclass into predefined groups to build the hierarchy. Principal component analysis is used to compactly represent appearance models from multiple images of each object. Distances between manifolds, subspaces, and individual models are calculated to measure similarity and perform the grouping. The goal is to develop a more natural object classifier that mimics how humans differentiate and classify objects based on visual characteristics like shape, color, and texture.
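The PCA step the summary mentions can be sketched without libraries by finding the first principal component of a few "appearance vectors" via power iteration. The data are invented low-dimensional stand-ins for the paper's image-derived appearance models.

```python
# Dependency-free first-principal-component sketch via power iteration:
# repeatedly multiply a vector by the (implicit) covariance matrix and
# normalize, converging to the direction of greatest variance.

def first_pc(data, iters=200):
    dim = len(data[0])
    mean = [sum(col) / len(data) for col in zip(*data)]
    centered = [[x - m for x, m in zip(row, mean)] for row in data]
    v = [1.0] * dim
    for _ in range(iters):
        # w = (X^T X) v, accumulated row by row
        w = [0.0] * dim
        for row in centered:
            dot = sum(a * b for a, b in zip(row, v))
            w = [wi + dot * a for wi, a in zip(w, row)]
        norm = sum(wi * wi for wi in w) ** 0.5
        v = [wi / norm for wi in w]
    return mean, v

data = [[2.0, 0.1], [4.0, -0.1], [6.0, 0.0], [8.0, 0.2]]
mean, pc = first_pc(data)
print(abs(pc[0]) > abs(pc[1]))  # True: variance lies mostly along axis 0
```

Projecting each vector onto a few such components gives the compact representation on which the manifold and subspace distances are then computed.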
An electronic health record (EHR) is a computerized version of a patient's paper record. Our world has been drastically changed by digital technology: smartphones, tablets, and web-enabled devices have transformed our daily lives and the way we communicate. Medicine is a data-rich enterprise. EHRs incorporate patients' clinical and treatment histories; an EHR system is built to go beyond the standard clinical data collected in a provider's office and can give a broader view of a patient's care. EHR systems face issues regarding information security, integrity and management. Blockchain technology could be implemented to transform EHR systems and could be a solution to these issues. The main goal of the proposed framework is to implement blockchain technology for EHRs and to provide secure storage of electronic records by defining granular access rules for the users of the proposed system. The framework thus gives the EHR system the advantages of a scalable, secure and integral blockchain-based solution.
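The integrity property such a framework relies on can be sketched as a minimal hash-chained record log. Access control, consensus and the record schema are omitted or invented here; this illustrates only why tampering with one record invalidates the chain.

```python
# Minimal hash chain: each block stores the previous block's hash, so
# altering any record breaks verification of everything after it.

import hashlib
import json

def add_block(chain, record):
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    chain.append({"record": record, "prev": prev_hash,
                  "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify(chain):
    for i, block in enumerate(chain):
        prev_hash = chain[i - 1]["hash"] if i else "0" * 64
        body = json.dumps({"record": block["record"], "prev": prev_hash},
                          sort_keys=True)
        if block["prev"] != prev_hash or \
           block["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
    return True

chain = []
add_block(chain, {"patient": "p1", "entry": "blood test"})
add_block(chain, {"patient": "p1", "entry": "prescription"})
print(verify(chain))                      # True: chain intact
chain[0]["record"]["entry"] = "altered"
print(verify(chain))                      # False: tampering is detected
```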
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering & Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
The document discusses tackling steganography apps on Android devices. It presents work on detecting stego images from mobile apps using two approaches: signature detection and machine learning methods. A key challenge is generating a large database of stego images from different apps at various embedding rates. The authors developed tools to automatically generate such a database using Android emulators and reverse engineering techniques. They analyzed several stego apps and found the embedding algorithms were often simple, like least significant bit changes. Some apps had detectable signatures. The authors tested signature detection and machine learning on images from seven stego apps to evaluate detecting stego images from apps.
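The least-significant-bit embedding the study found in many stego apps can be sketched directly. Pixels are a flat list of 8-bit values here; a real app operates on image files, omitted to keep the example self-contained.

```python
# LSB steganography sketch: write each payload bit into the least significant
# bit of successive pixels, then read them back.

def embed_lsb(pixels, bits):
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # clear the LSB, then set it to the bit
    return out

def extract_lsb(pixels, n):
    return [p & 1 for p in pixels[:n]]

cover = [120, 121, 122, 123, 124, 125]
payload = [1, 0, 1, 1]
stego = embed_lsb(cover, payload)
print(extract_lsb(stego, 4))  # [1, 0, 1, 1]: payload recovered
```

Because each pixel changes by at most 1, such embedding is visually invisible, which is why detection falls back to statistical methods or app-specific signatures.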
This document discusses face detection and recognition techniques using MATLAB. It begins with an abstract describing face detection as determining the location and size of faces in images while ignoring other objects. It then discusses implementing an algorithm to recognize faces from images in near real-time by calculating the difference between an input face and the average of faces in a training set. The document then details various face recognition methods, the five-step process of facial recognition, and the benefits and applications, and concludes that recent algorithms are much more accurate than older ones.
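The recognition rule described above, comparing an input face to the average of each person's training faces, can be sketched as nearest-mean matching. "Faces" are short invented feature vectors here; real systems use pixel or embedding vectors.

```python
# Nearest-mean face matching: average each person's training vectors, then
# assign the input to the person whose mean is closest in L2 distance.

def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def recognize(face, training):
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(training, key=lambda name: dist(face, mean_vector(training[name])))

training = {
    "alice": [[1.0, 0.0], [1.2, 0.1]],
    "bob":   [[5.0, 4.0], [5.1, 3.9]],
}
print(recognize([1.1, 0.0], training))  # alice
```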
DETECTION OF FAKE ACCOUNTS IN INSTAGRAM USING MACHINE LEARNING (ijcsit)
With the advent of the Internet and social media, while hundreds of people have benefitted from the vast sources of information available, there has been an enormous rise in cyber-crime, particularly targeted at women. According to a 2019 report in the Economic Times [4], India witnessed a 457% rise in cybercrime in the five-year span between 2011 and 2016. Most speculate that this is due to the impact of social media such as Facebook, Instagram and Twitter on our daily lives. While these definitely help in creating a sound social network, creating a user account on these sites usually requires just an email ID. A real person can create multiple fake IDs, so impostor accounts are easily made. Unlike the real world, where multiple rules and regulations identify a person uniquely (for example when issuing a passport or driver's license), admission to the virtual world of social media requires no such checks. In this paper, we study Instagram accounts in particular and try to assess whether an account is fake or real using machine learning techniques, namely Logistic Regression and the Random Forest algorithm.
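The logistic-regression step can be sketched end to end on invented features (follower/following ratio, presence of a profile picture). Real work would train on a labeled Instagram dataset, typically with a library such as scikit-learn; this hand-rolled version just shows the mechanics.

```python
# Tiny logistic regression trained by gradient descent on the log-loss.
# Features and labels are invented; label 1 = fake account.

import math

def train(X, y, lr=0.5, epochs=2000):
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            g = 1 / (1 + math.exp(-z)) - yi   # dL/dz for log-loss
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, xi):
    z = sum(wj * xj for wj, xj in zip(w, xi)) + b
    return int(1 / (1 + math.exp(-z)) > 0.5)

# features: [followers/following ratio, has profile picture]
X = [[0.1, 0], [0.2, 0], [2.0, 1], [3.0, 1]]
y = [1, 1, 0, 0]
w, b = train(X, y)
print(predict(w, b, [0.15, 0]))  # 1: low ratio, no picture -> likely fake
```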
Violent Scenes Detection Using Mid-Level Violence Clustering (csandit)
This document proposes a system for detecting violent scenes in videos using a combination of visual and audio features analyzed at the segment level. The system applies multiple kernel learning to make full use of the multimodal nature of video data. It introduces "Mid-level Violence Clustering" which groups violent segments into clusters to implicitly learn mid-level concepts of violence without using manually tagged annotations. The system is trained on a dataset from MediaEval 2013 and evaluated using its official metric, outperforming the best score from that evaluation.
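The multimodal idea behind multiple kernel learning can be sketched in its simplest form: a convex combination of per-modality kernels. The feature vectors and the mixing weight `beta` are invented; the paper's system learns such weights rather than fixing them.

```python
# Combine a visual kernel and an audio kernel as a weighted sum, the basic
# building block of multiple kernel learning.

def linear_kernel(a, b):
    return sum(x * y for x, y in zip(a, b))

def combined_kernel(vis_a, vis_b, aud_a, aud_b, beta=0.7):
    return (beta * linear_kernel(vis_a, vis_b)
            + (1 - beta) * linear_kernel(aud_a, aud_b))

print(round(combined_kernel([1, 0], [1, 1], [0.5], [2.0]), 6))
```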
Face recognition is the ability to categorize a set of images based on certain discriminative features. Classifying recognition patterns is a difficult problem and still a very active field of research. The paper introduces a conceptual framework for a descriptive study of face recognition techniques. It aims to describe previous research on face recognition systems, focusing on the algorithms, uses, benefits, challenges and problems in this field, and treats face recognition as a sensitive learning task; experiments on large face databases demonstrate the new feature. The researcher recommends evaluating previous studies and research, especially on face recognition and 3D methods, in the hope of more advanced techniques in the near future.
Faces in the Distorting Mirror: Revisiting Photo-based Social Authentication (FACE)
This document summarizes research on revisiting photo-based social authentication. The researchers:
1) Demonstrate a new attack on social authentication that matches photos from challenges to a collection of the victim's photos, which is more effective than face recognition attacks.
2) Conduct a user study showing people can identify friends in photos with unrecognizable faces over 99% of the time, whereas software fails.
3) Design a new social authentication system that selects "medium" photos software cannot recognize but people can, and transforms photos via overlays and perspective changes to block matching attacks while retaining human usability, passing 94.38% of challenges in a preliminary study.
My research project investigated the potential of applying machine learning and artificial intelligence techniques to distinguish criminal tendencies in people. I know that pursuing this kind of classification is highly controversial, but machine learning and deep learning are increasingly being applied to data of all types. While my project ran classifiers on image datasets of criminal and non-criminal individuals, the main focus was to investigate the presence and effects of biases in the image sets. Quite recently, a Chinese paper by Wu and Zhang (2016) claimed very high performance in discriminating between a criminal and a non-criminal dataset. It received widespread criticism, so I endeavoured to investigate, using my own assembled datasets, whether the performance decreased with the removal of biases such as emotion imbalance across sets and background colouring and texture. The work mainly involved web scraping, image analysis using facial recognition, emotion detection, and a deep learning neural network classifier.
Image Classification and Annotation Using Deep Learning (IRJET Journal)
This document presents a new deep learning model for jointly performing image classification and annotation. The model uses a convolutional neural network (CNN) to extract features from images and classify semantic objects. It then annotates the images based on the identified objects. The model is evaluated on standard datasets like CIFAR-10, CIFAR-100 as well as a new dataset collected by the authors. Results show the model achieves comparable or better performance than baseline methods, while also enabling fast image annotation. A novel scalable implementation allows annotating large datasets within seconds.
The document analyzes a method for detecting human errors in image files by comparing the histograms of two images. It discusses using histograms to visualize pixel intensity distributions and to compare overall contrast and dynamic range. The method involves reading and resizing two images, calculating their histograms, and checking for differences to detect errors. MATLAB and LabVIEW code examples demonstrate comparing histogram plots and pixel counts to determine whether two images match. Test results show the histogram method can effectively detect errors by identifying mismatches between images.
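The histogram-comparison check described above can be sketched in Python (the original used MATLAB and LabVIEW). Images are flat lists of 8-bit pixel values here, and the bin count is an arbitrary choice for the example.

```python
# Build intensity histograms for two images and report whether they match
# bin by bin, the error-detection test the document describes.

def histogram(pixels, bins=8):
    """Count pixels per intensity bin over the 0-255 range."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    return counts

def images_match(pixels_a, pixels_b, bins=8):
    return histogram(pixels_a, bins) == histogram(pixels_b, bins)

img = [0, 10, 200, 200, 255]
shuffled = [200, 255, 0, 200, 10]  # same pixels, different order
darkened = [0, 10, 100, 100, 255]
print(images_match(img, shuffled))  # True: identical histograms
print(images_match(img, darkened))  # False: distributions differ
```

Note the limitation visible in the example: histograms ignore pixel positions, so two differently arranged images with the same intensities still "match".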
This project report describes research on using convolutional neural networks to classify gender and age from facial images. The goal is to automatically estimate a person's gender and age based solely on their facial appearance in an image. The report provides background on related work, describes the dataset collected from LinkedIn profiles, and explains the methodology used, including logistic regression and CNN models. The CNN approach achieved 81% accuracy for gender classification and 68% for age classification on test data. Areas for future improvement are also discussed, such as collecting more training data across all age groups.
This document proposes a method for harvesting training examples of bi-concepts (images containing two visual concepts) from social media images to build bi-concept detectors. It presents a multi-modal approach that uses both visual features and semantic text to gather positive and negative bi-concept examples at a large scale from sources like Flickr. Experiments show this approach can accurately learn bi-concept detectors for complex queries, outperforming combinations of single concept detectors. The method introduces a framework for collecting examples, detecting bi-concepts in unlabeled images, and iteratively improving bi-concept retrieval.
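The harvesting step can be sketched as filtering social images whose tags mention both concepts of a bi-concept query. The tag sets are invented stand-ins for Flickr-style metadata; the paper's method additionally uses visual features and iterative refinement.

```python
# Keep images whose tag set contains both concepts of a bi-concept query.

def harvest(images, concept_a, concept_b):
    return [img for img, tags in images.items()
            if concept_a in tags and concept_b in tags]

images = {
    "img1": {"dog", "beach", "sun"},
    "img2": {"dog", "car"},
    "img3": {"beach", "dog", "ball"},
}
print(harvest(images, "dog", "beach"))  # ['img1', 'img3']
```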
Face recognition is the ability of categorize a set of images based on certain discriminatory features. Classification of the recognition patterns can be difficult problem and it is still very active field of research. The paper introduces conceptual framework for descriptive study on techniques of face recognition systems. It aims to describe the previous researches have been study the face recognition system, in order scope on the algorithms, usages, benefits , challenges and problems in this felids, the paper proposed the face recognition as sensitive learning task experiments on a large face databases demonstrate of the new feature. The researcher recommends that there's a needs to evaluate the previous studies and researches, especially on face recognition field and 3D, hopeful for advanced techniques and methods in the near future.
Faces in the Distorting Mirror: Revisiting Photo-based Social AuthenticationFACE
This document summarizes research on revisiting photo-based social authentication. The researchers:
1) Demonstrate a new attack on social authentication that matches photos from challenges to a collection of the victim's photos, which is more effective than face recognition attacks.
2) Conduct a user study showing people can identify friends in photos with unrecognizable faces over 99% of the time, whereas software fails.
3) Design a new social authentication system that selects "medium" photos software cannot recognize but people can, and transforms photos via overlays and perspective changes to block matching attacks while retaining human usability, passing 94.38% of challenges in a preliminary study.
My research project involved investigating the potential to apply Machine Learning techniques and Artificial Intelligence to distinguish criminal tendencies in people. I know that the pursuit of this kind of classification is highly controversial but more and more machine learning and deep learning are being applied to data of all types. While my project inherently ran classifiers on images datasets of criminal and non criminal individuals, the main focus was to investigate the presence and effects of biases in the image sets. Quite recently, there was a Chinese paper by Wu and Zhang (2016) that claimed very high performance in discriminating between a criminal and non-criminal dataset. It received widespread criticism
so I endeavoured to investigate using my own assembled datasets whether I could show
that the performance decreased with the removal of biases such as emotion imbalance across sets and background colouring and texture. It mainly involved web-scraping, image
analysis using facial recognition functionality, emotion detection and a Deep Learning Neural Net classifier.
Image Classification and Annotation Using Deep LearningIRJET Journal
This document presents a new deep learning model for jointly performing image classification and annotation. The model uses a convolutional neural network (CNN) to extract features from images and classify semantic objects. It then annotates the images based on the identified objects. The model is evaluated on standard datasets like CIFAR-10, CIFAR-100 as well as a new dataset collected by the authors. Results show the model achieves comparable or better performance than baseline methods, while also enabling fast image annotation. A novel scalable implementation allows annotating large datasets within seconds.
The document analyzes a method for detecting human errors in image files by comparing histograms of two images. It discusses using histograms to visualize pixel intensity distributions and compare overall contrast and dynamic range. The method involves reading and resizing two images, calculating their histograms, and checking for differences to detect errors. Matlab and Labview code examples demonstrate comparing histogram plots and pixel counts to determine if two images match or not. Test results show the histogram method can effectively detect errors by identifying mismatches between images.
The document analyzes a method for detecting human errors in image files by comparing histograms of two images. It discusses using histograms to visualize pixel intensity distributions and compare overall contrast and dynamic range. The method involves reading and resizing two images, calculating their histograms, and checking for differences to detect errors. Matlab and Labview code examples demonstrate comparing histogram plots and pixel counts to determine if two images match or not. Test results show the histogram method can effectively detect errors by identifying mismatches between images.
This project report describes research on using convolutional neural networks to classify gender and age from facial images. The goal is to automatically estimate a person's gender and age based solely on their facial appearance in an image. The report provides background on related work, describes the dataset collected from LinkedIn profiles, and explains the methodology used, including logistic regression and CNN models. The CNN approach achieved 81% accuracy for gender classification and 68% for age classification on test data. Areas for future improvement are also discussed, such as collecting more training data across all age groups.
This document proposes a method for harvesting training examples of bi-concepts (images containing two visual concepts) from social media images to build bi-concept detectors. It presents a multi-modal approach that uses both visual features and semantic text to gather positive and negative bi-concept examples at a large scale from sources like Flickr. Experiments show this approach can accurately learn bi-concept detectors for complex queries, outperforming combinations of single concept detectors. The method introduces a framework for collecting examples, detecting bi-concepts in unlabeled images, and iteratively improving bi-concept retrieval.
JOINT DEEP EXPLOITATION OF SEMANTIC KEYWORDS AND VISUAL FEATURES FOR MALICIOUS CROWD IMAGE CLASSIFICATION

Joel Levis¹, Hyungtae Lee²³, Heesung Kwon³, James Michaelis³, Michael Kolodny³, and Sungmin Eum³⁴

¹ Ohio University, Athens, Ohio, U.S.A.
² Booz Allen Hamilton Inc., McLean, Virginia, U.S.A.
³ U.S. Army Research Laboratory, Adelphi, Maryland, U.S.A.
⁴ University of Maryland, College Park, Maryland, U.S.A.

jl359113@ohio.edu, lee hyungtae@bah.com, heesung.kwon.civ@mail.mil, james.r.michaelis2.civ@mail.mil, michael.a.kolodny.ctr@mail.mil, smeum@umiacs.umd.edu
ABSTRACT

General image classification approaches differentiate classes using strong distinguishing features, but some classes cannot be easily separated because their visual features are very similar. To deal with this problem, we use keywords relevant to a particular class. To implement this concept, we have newly constructed a malicious crowd dataset that contains crowd images of two events, benign and malicious, which look similar yet involve opposite semantic events. We also created a set of five keywords relevant to the malicious event, such as police and fire. In the evaluation, integrating malicious event classification with the recognition output of these keywords enhances the overall performance on the malicious crowd dataset.

Index Terms— malicious crowd dataset, semantic keyword, image classification
1. INTRODUCTION

General image classification methods draw upon the fact that images of differing classes have strong distinguishing features [1, 2, 3, 4]. However, certain classes involve very different events yet can be represented by very similar image features, such as the objects that mainly appear in the associated images. For example, in Figure 1, the two images seem to contain a similar event because people are prominent in both. We can discern, however, that the two images involve opposite semantic events: one benign and one malicious. The right image is malicious due to several telltale objects, such as smoke and police equipment. General image classification may not perform well without semantically crucial object information, which may or may not be prominent in the image, but which can still provide important keywords for inferring which event occurs. We address this problem by identifying semantically unique keywords, which occur with higher frequency among the malicious images, and using these identified words to improve classification accuracy.
[Figure 1: a benign crowd image (relevant keywords: street, store, sign, flower, people) next to a malicious crowd image (relevant keywords: police, smoke, protest, crowd, fire)]

Fig. 1. A pair of similar-looking crowd images with unique object contents.
Since most benchmark datasets [5, 6, 7] collected for event classification do not deal with this problem, we collected a novel “malicious crowd” dataset, which contains crowd images with two events: benign and malicious. Along with event-level labels, we also collected a number of keywords that appear in each image in the dataset, as listed below each image in Figure 1. We used Amazon Mechanical Turk to describe the semantic contents of each image in terms of keywords. We then pooled all the keywords for both classes and created a set of the most frequently used words for each event. We selected the non-overlapping, distinctive keywords for the malicious event, which we aim to identify and treat as the representative “semantic keywords”.
To identify semantic keywords from a test image, we used a well-known detection method, the deformable part model (DPM) [8], and a classification algorithm, a fine-tuned AlexNet [9]. Among the keywords, some, such as police, helmet, and car, have rigid appearance, while others, such as fire and smoke, do not. DPM was used to detect the objects with rigid appearance, whereas the fine-tuned AlexNet was employed to detect less rigid objects such as smoke and fire. We also built an additional fine-tuned AlexNet architecture to classify benign/malicious crowd images. Finally, we used several late fusion approaches to integrate the malicious crowd image classification result with the keyword detection/classification results. Our experiments show that fusing the image and keyword classifications outperforms using the image classification alone. This supports the effectiveness of exploiting semantic keywords relevant to the malicious crowd images.

[Figure 2(a): example images for the benign and malicious events]

category  | keywords
benign    | crowd, people, city, building, men, women, group, road, sidewalk, sign, race, tree, event, fans, gathering, . . .
malicious | crowd, people, protest, police, fire, street, riot, city, building, smoke, men, sign, flag, night, man, helmet, signs, group, violence, car, . . .

Fig. 2. Malicious Crowd Dataset: (a) several example images for the benign and malicious events are shown in the first and second rows, respectively. (b) keywords mainly seen in the benign and malicious images are listed. Red keywords are relevant keywords for the malicious event.
Our contributions are summarized as follows:

1. We introduce a new image classification task in which classes cannot be easily separated from each other, unlike in general image classification.
2. To deal with this problem, we collect a malicious crowd dataset consisting of two classes, malicious and benign crowds, which look similar but contain opposite semantic events.
3. We exploit semantic keywords relevant only to malicious crowd images to differentiate the malicious crowd images from the benign ones.
4. Integrating image features with this semantic keyword information increases image classification accuracy on the malicious crowd dataset.
2. MALICIOUS CROWD DATASET AND SEMANTIC KEYWORDS

2.1. Malicious Crowd Dataset

The “malicious crowd” dataset used to test our hypothesis contains 1133 crowd images split nearly equally into two classes: benign and malicious. The intuition behind the labeling of the images was that a benign crowd would be something a passerby would not be alarmed or concerned to see, while a malicious image would be alarming and potentially dangerous.
[Figure 3: two keyword-frequency histograms (frequency in % on the vertical axis); the horizontal axes list the benign and malicious keywords of Figure 2(b)]

Fig. 3. Histograms of relevant keywords: the left and right histograms show the keywords relevant to the benign and malicious classes, respectively. The keywords are listed according to their frequency of appearance in the images.
These images were gathered from Google Images using various search terms. For benign images, search terms such as marathon, pedestrian crowd, parade, and concert were used. Riot and protest were used as search terms to gather the malicious crowd images. Figure 2(a) illustrates some example images from each class.
2.2. Semantic Keywords

To describe the contents of each crowd image, Amazon Mechanical Turk was used. A human annotator was responsible for assigning five keywords to each image based on the objects observed within it. To ensure the accuracy of the Mechanical Turk results, we manually removed keywords that were incorrectly assigned.

After collecting the crowd images and corresponding keywords, we needed to identify keywords relevant only to the malicious class. We constructed two keyword sets, each acquired by selecting the most frequently appearing keywords in the given class. In practice, words annotated in 5% or more of the images in each class were selected. As a result of this thresholding, 17 and 20 words were selected for the benign and malicious classes, respectively. The selected words and their frequencies for both classes can be seen in Figure 3. We refined the malicious keyword set by eliminating the keywords that appear in both classes, which left nine malicious keywords, shown in red in Figure 2(b). Lastly, we further eliminated keywords indicating particular phenomena, such as protest, riot, night, and violence. Police, fire, smoke, helmet, and car were then included in the final set of malicious semantic keywords.

Table 1. Number of images in which each keyword relevant to the malicious event appears.

class     | images | police | fire | smoke | helmet | car
benign    |   557  |     8  |   1  |    2  |     7  |  57
malicious |   576  |   205  | 144  |  150  |   206  |  65
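The frequency-thresholding step above can be sketched as follows; this is a minimal illustration, assuming each image's annotations are stored as a list of keyword strings (the data layout and function names are not from the paper).

```python
# Sketch of the keyword selection: keep words annotated in at least 5% of
# a class's images, then drop malicious keywords shared with the benign set.
from collections import Counter

def frequent_keywords(tag_lists, min_freq=0.05):
    """Keywords appearing in at least `min_freq` of the images."""
    counts = Counter(tag for tags in tag_lists for tag in set(tags))
    n = len(tag_lists)
    return {t for t, c in counts.items() if c / n >= min_freq}

def malicious_semantic_keywords(benign_tags, malicious_tags):
    """Malicious-class keywords not shared with the benign class."""
    return frequent_keywords(malicious_tags) - frequent_keywords(benign_tags)
```

The final manual filtering of phenomenon words (protest, riot, night, violence) is not automated here.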
Table 1 shows the number of images in which each keyword (object) actually appears. While police, fire, smoke, and helmet are closely associated with the malicious event, car is seen in both events with similar frequency. Note that the numbers in the table do not necessarily match the histogram of malicious semantic keywords obtained from Amazon Mechanical Turk. For example, police appears in 205 of the 576 malicious images, a rate of 35.59%, but is assigned to only 28.50% of the malicious images by Amazon Mechanical Turk. This is because the visual contents associated with these keywords are not very noticeable in several images. We observe that the frequencies of the selected semantic keywords show a notable gap between the two classes, indicating that the purpose of the proposed keyword selection process is achieved.
3. THE PROPOSED APPROACH

To identify semantic keywords in the test images, keyword detectors/classifiers were trained. For objects with rigid appearance, such as police, helmet, and car, deformable part models (DPM) [8] were trained. For fire and smoke, which are objects with non-rigid appearance, convolutional neural network (CNN) classifiers fine-tuned from the AlexNet architecture [9] were used. Since the object detectors output multiple detections per image, we select the detection with the maximum score and use that score to represent the confidence of the object's presence in the image. We also built a CNN classifier to output a confidence score for the maliciousness of an image. Multiple late fusion approaches were utilized to combine the outputs of all keyword detectors/classifiers and the malicious image classifier.
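The per-image fusion input described above can be sketched as follows; the stream ordering and the zero default for images with no detections are illustrative assumptions, not the paper's conventions.

```python
import numpy as np

def object_confidence(detection_scores):
    """Represent object presence by the maximum detection score over all
    detections in the image; 0.0 for no detections is an assumed default."""
    return max(detection_scores, default=0.0)

def stream_vector(malicious_score, det_scores, fire_score, smoke_score):
    """Stack the six stream outputs (malicious classifier; police, helmet,
    and car detectors; fire and smoke classifiers) into one fusion input."""
    return np.array([malicious_score,
                     object_confidence(det_scores["police"]),
                     object_confidence(det_scores["helmet"]),
                     object_confidence(det_scores["car"]),
                     fire_score, smoke_score])
```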
3.1. Learning Keyword Detectors

The DPM detectors used to identify police and helmet were trained on 400 annotated images, all of which are auxiliary images collected from Google Images. For the car detector, we used a DPM trained on the PASCAL VOC 2007 dataset [10].
3.2. Learning Malicious Event/Keyword Classifiers

First, a fine-tuned AlexNet deep convolutional neural network (DCNN) was trained to classify images as benign or malicious. The training set includes 905 images randomly selected from the malicious crowd dataset. Fine-tuning was conducted on all eight layers of AlexNet, with the eighth layer learning at a rate of 20 and all other layers at a rate of 2. The last layer was replaced so as to have a binary output, in contrast to the 1000-class output of AlexNet.

The fire and smoke DCNN-based classifiers were trained in a similar way to the DCNN described above. Each of these models was trained on 300 images, comprising images from our dataset and auxiliary images gathered from Google Images. We used separate networks for the two keywords instead of one network with multiple labels because both keywords may appear in the same training image.
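A hedged PyTorch sketch of this fine-tuning setup is shown below. The linear layers stand in for AlexNet's classifier head, and the base learning rate is an assumption: the paper gives only the relative rates (20 for the replaced last layer, 2 for all other layers).

```python
import torch
import torch.nn as nn

# Stand-in for the pretrained layers (shapes mimic AlexNet's classifier).
body = nn.Sequential(nn.Linear(9216, 4096), nn.ReLU(),
                     nn.Linear(4096, 4096), nn.ReLU())
head = nn.Linear(4096, 2)  # replaces AlexNet's 1000-way output with binary

base_lr = 1e-4  # assumed base rate; the paper specifies only multipliers
optimizer = torch.optim.SGD(
    [{"params": body.parameters(), "lr": 2 * base_lr},   # all other layers
     {"params": head.parameters(), "lr": 20 * base_lr}], # replaced last layer
    momentum=0.9)
```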
3.3. Late Fusion

Late fusion was performed on the outputs of six streams: the malicious crowd image classifier, three detectors for police, helmet, and car, and two classifiers for fire and smoke. The late fusion is used to enhance the baseline classifier, on the premise that the additional object information helps increase classification accuracy. To test which fusion method would be most effective, the streams were combined using various fusion methods: Linear Discriminant Analysis (LD) [11], Logistic Regression (LR) [12], Support Vector Machines (SVM) [13], k-Nearest Neighbor classifiers (kNN) [14], Subspace-based Ensemble Classifiers (EC) [15], and Dynamic Belief Fusion (DBF) [16]. For SVM, we used two different kernels: a linear kernel (SVM-lin) and an RBF kernel (SVM-rbf). For kNN, we used 100 nearest neighbors under the Euclidean distance. As the EC, we used a subspace ensemble classifier with a set of 30 weak models.
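A minimal late-fusion sketch using scikit-learn stand-ins for several of the fusion methods above is given below (DBF and the subspace ensemble are omitted; hyperparameters beyond those the paper states are library defaults). X holds one six-dimensional stream-score vector per image and y the benign (0) / malicious (1) labels.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def fuse_streams(X_train, y_train, X_test):
    """Train each fusion classifier on the stacked stream scores and return,
    per method, the malicious-class probability for each test image."""
    fusers = {
        "LD": LinearDiscriminantAnalysis(),
        "LR": LogisticRegression(),
        "SVM-lin": SVC(kernel="linear", probability=True),
        "SVM-rbf": SVC(kernel="rbf", probability=True),
        "kNN": KNeighborsClassifier(n_neighbors=100, metric="euclidean"),
    }
    out = {}
    for name, clf in fusers.items():
        clf.fit(X_train, y_train)
        out[name] = clf.predict_proba(X_test)[:, 1]
    return out
```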
4. EXPERIMENTS
4.1. Dataset Partition and Evaluation Protocol
The malicious crowd dataset consists of 1133 images; 576 of them are labeled as malicious crowd images and the rest are labeled as benign. The same training set mentioned in Section 3.2 (905 images) is used to train the fusion approaches. The remaining 228 images are used as the test set.
[Figure 4: eight example images overlaid with per-stream confidence scores (malicious crowd, fire, smoke, late fusion) and detection bounding boxes]

Fig. 4. Output of the malicious crowd image classification and keyword detectors/classifiers: the first and second rows show the four examples with the largest fusion scores for the malicious event from the malicious and benign crowd images, respectively. Bounding boxes in red, green, and blue indicate detections by the police, helmet, and car detectors, respectively. The late fusion score is obtained by EC (a subspace ensemble classifier).
Table 2. Malicious Crowd Image Classification Accuracy measured by AP

     |          |             keyword              |                    late fusion
     | baseline | police  fire   smoke  helmet car | SVM-rbf DBF   SVM-lin kNN   LD    LR    EC
AP   | .722     | .586   .563   .689   .532  .491  | .742   .757   .758   .758  .760  .763  .771
Gain |          |                                  | +.020  +.035  +.036  +.036 +.038 +.041 +.049
Average precision (AP) is used as the evaluation metric in our experiments.
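AP can be computed from the per-image malicious-class scores as sketched below; the labels and scores are illustrative, not the paper's data.

```python
# Computing average precision (AP) for one stream, as reported in Table 2.
from sklearn.metrics import average_precision_score

y_true = [1, 1, 1, 0, 0, 0]               # 1 = malicious, 0 = benign
y_score = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2]  # per-image confidence scores
ap = average_precision_score(y_true, y_score)
# For this toy example, ap is the mean precision at each positive:
# (1/1 + 2/3 + 3/4) / 3 ≈ 0.806
```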
4.2. Results

Table 2 shows the malicious crowd image classification accuracy in AP for the baseline malicious crowd image classification, the keyword detections/classifications, and the various late fusion approaches. Note that, for a keyword detection/classification, accuracy was calculated for recognizing the malicious image rather than the associated keyword. For example, if a test image is malicious but contains no police, and the police detector accordingly detects no police in the image, the result is still counted as a false negative. The car detector does not provide competitive accuracy because, as shown in Table 1, car is not significantly relevant to the malicious crowd. The other keyword detectors also fail to provide better classification accuracy than the baseline malicious crowd image classification, because these semantic keywords (objects) are seen in only small portions of the dataset. However, integrating the baseline with the outputs of these keyword classifiers/detectors enhanced classification accuracy by up to approximately 7%. The best performer is EC, the subspace-based ensemble classifier, achieving a fusion gain of .049 in AP. All fusion approaches improve classification accuracy over the baseline, which supports the benefit of jointly exploiting semantic keywords and the associated detectors and classifiers. Figure 4 shows several images with high maliciousness scores from both the malicious and benign classes.
5. CONCLUSION

We addressed a new image classification problem in which certain classes are expressed by similar visual features but must be distinguished from each other semantically. To demonstrate it, we constructed a novel malicious crowd image dataset consisting of two classes (benign and malicious) that may look similar but contain semantically different events. To better classify images with the aforementioned characteristics, we selected representative keywords for malicious crowd images, which are then incorporated with a conventional image classifier using a multi-stream late fusion architecture. As Table 2 shows, the hypothesized approach led to considerable performance improvements over the conventional baseline classifier.
6. REFERENCES
[1] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hin-
ton, “Imagenet classification with deep convolutional
neural networks,” Advances in Neural Information Pro-
cessing Systems 25, 2012.
[2] Maxime Oquab, Léon Bottou, Ivan Laptev, and Josef
Sivic, “Is object localization for free? – weakly-
supervised learning with convolutional neural net-
works,” Proceedings of the IEEE Conference on Com-
puter Vision and Pattern Recognition, 2015.
[3] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian
Sun, “Deep residual learning for image recogni-
tion,” IEEE Conference on Computer Vision and Pattern
Recognition, 2015.
[4] Archith J. Bency, Heesung Kwon, Hyungtae Lee,
S Karthikeyan, and B. S. Manjunath, “Weakly super-
vised localization using deep feature maps,” European
Conference on Computer Vision, 2016.
[5] Li-Jia Li and Li Fei-Fei, “What, where and who? clas-
sifying event by scene and object recognition,” IEEE
International Conference on Computer Vision, 2007.
[6] Sangmin Oh, Anthony Hoogs, Amitha Perera, Naresh
Cuntoor, Chia-Chih Chen, Jong Taek Lee, Saurajit
Mukherjee, JK Aggarwal, Hyungtae Lee, Larry Davis,
et al., “A large-scale benchmark dataset for event recog-
nition in surveillance video,” IEEE Conference on Com-
puter Vision and Pattern Recognition, 2011.
[7] George Awad, Jonathan Fiscus, Martial Michel, David
Joy, Wessel Kraaij, Alan F. Smeaton, Georges Quénot,
Maria Eskevich, Robin Aly, and Roeland Ordelman,
“Trecvid 2016: Evaluating video search, video event de-
tection, localization, and hyperlinking,” in Proceedings
of TRECVID 2016. NIST, USA, 2016.
[8] Pedro F. Felzenszwalb, Ross B. Girshick, David
McAllester, and Deva Ramanan, “Object detection
with discriminatively trained part based models,” IEEE
Transactions on Pattern Analysis and Machine Intelli-
gence, vol. 32, no. 9, pp. 1627–1645, 2010.
[9] Maxime Oquab, Léon Bottou, Ivan Laptev, and Josef
Sivic, “Learning and transferring mid-level image repre-
sentations using convolutional neural networks,” IEEE
Conference on Computer Vision and Pattern Recogni-
tion, 2014.
[10] Mark Everingham, Luc Van Gool, Christopher
K. I. Williams, John Winn, and Andrew Zisserman,
“The PASCAL Visual Object Classes Challenge
2007 (VOC2007) Results,” http://www.pascal-
network.org/challenges/VOC/voc2007/workshop/index.html.
[11] Ronald Alymer Fisher, “The use of multiple measure-
ments in taxonomic problems,” Annals of Eugenics, vol.
7, pp. 179–188, 1936.
[12] David A. Freedman, “Statistical models: Theory and
practice,” p. 128. Cambridge University Press, 2009.
[13] C Cortes and V. Vapnik, “Support-vector networks,”
Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
[14] N. S. Altman, “An introduction to kernel and nearest-
neighbor nonparametric regression,” The American
Statistician, vol. 46, no. 3, pp. 175–185, 1992.
[15] Tin Kam Ho, “The random subspace method for con-
structing decision forests,” IEEE Transactions on Pat-
tern Analysis and Machine Intelligence, vol. 20, no. 8,
pp. 832–844, 1998.
[16] Hyungtae Lee, Heesung Kwon, Ryan M. Robinson,
William D. Nothwang, and Amar M. Marathe, “Dy-
namic belief fusion for object detection,” IEEE Winter
Conference on Applications of Computer Vision, 2016.