More Related Content

Similar to Ensemble based method for the classification of flooding event using social media data(20)

More from multimediaeval(20)


Recently uploaded(20)

Ensemble based method for the classification of flooding event using social media data

  1. An ensemble based method for the classification of flooding event using social media data Muhammad Hanif, Huzaifa Joozer, Muhammad Atif Tahir, Muhammad Rafi Flood related multimedia: MediaEval 2020, December 14-15 2020, Online
  2. Table of Contents  Introduction  Approach  Visual Data Processing  Text Data Processing  Ensemble (Visual and Textual Data)  Results  Challenges and Future work
  3. Introduction  Flood-related Multimedia Task at MediaEval 2020  Goal:  Analyze Tweets (Visual and Textual data) for classification of flooding event [1]  Challenges:  Imbalance Dataset (4:1 ratio for majority to minority class)
  4. Approach: 1. Visual Data A. Selection of N input samples  Data Augmentation: To oversample the minority class, three augmented copies of each image in minority class are created through Augmentor library [2].  Stratified Random Sample Selection: Equal-sized random samples are selected from minority and majority class of the train-set.  Different samples (N=15) are created, so that same quantity of training models could be trained.
  5. Approach: 1. Visual Data B. Parallel training of N Models  Each training model receives its respective input of Randomly selected input images  Each model has utilized Visual Geometry Group (VGG16) [3] model, pre-trained on hybrid dataset of ImageNet [4] and Places365 [5] dataset. C. Prediction on Test-data (Majority Voting)  As N=15 number of models are trained during the training stage, so each of the model is utilized for the prediction of test-data, hence N number of predictions are retrieved.  Majority voting is applied on N predictions to decide final prediction for test-set.
  6. Approach: 2. Textual Data A. Translation: Translation of each tweet is performed by using googletrans [6]. B. Oversampling: Population of minority class in increased by Oversampling. C. TF-IDF Calculation: TF-IDF is calculated for the tweets. D. Training: Different classifiers are tried but Multinomial Naïve Bayes has produced better results during training, so it was selected for the prediction. E. Prediction: Trained Classifiers are utilized to perform prediction on test-set.
  7. Approach: 3. Ensemble of Visual & Textual Data  Average of probabilities  Final outcome of visual and textual data is retrieved in the form of probabilities  Average of both out come is calculated.  Prediction of class  Class of each instance of test-set is prediction on the basis of average probability
  8. Results Task Type Prediction on Test-set Average outcome Run 1: Fusion of text and visual data 27.86% 22.13% Run 2: Text 36.31% 35.71% Run 3: Visual 20.76% 13.18%  F1 evaluation results on Test-set Dataset
  9. Challenges and Future Work  The research has proposed ensemble based approach which has combined deep learning and shallow learning based methods, for the classification of flooding incident by using text and visual data extracted from social media data.  In future more data may be added for the effective training of deep learning model.  More deep leaning based pre-trained models may be test to improve the results.
  10. References 1. Stelios Andreadis, Ilias Gialampoukidis, Anastasios Karakostas, Stefanos Vrochidis, Ioannis Kompatsiaris, Roberto Fiorin, Daniele Norbiato, and Michele Ferri. 2020. The Flood-related Multimedia Task at MediaEval 2020. (2020) 2. Marcus D Bloice, Christof Stocker, and Andreas Holzinger. 2017. Augmentor: an image augmentation library for machine learning. arXiv preprint arXiv:1708.04680 (2017) 3. Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014). 4. Ryan Lagerstrom, Yulia Arzhaeva, Piotr Szul, Oliver Obst, Robert Power, Bella Robinson, and Tomasz Bednarz. 2016. Image classification to support emergency situation awareness. Frontiers in Robotics and AI 3 (2016), 54. 5. Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition. IEEE, 248–255. 6. Suhun Han. 2020. googletrans 3.0.0. (2020). Accessed: 2020-11-1