For a seminar at DASH-Lab, SKKU, I presented the paper 'Transferable GAN-generated Images Detection Framework' (ICML 2020).
For more details, see this link: https://arxiv.org/abs/2008.04115
[CVPRW 2020] Real world Super-Resolution via Kernel Estimation and Noise Injec... (KIMMINHA3)
This paper addresses the Super-Resolution (SR) task and was introduced at CVPRW 2020, where it won two tracks of the real-world SR competition.
The authors question why there were no practical methods for handling real noise: previous work assumed ideal degradations such as bicubic downsampling.
To address this impractical, idealized setting, the authors propose improving resolution via kernel estimation and noise injection. Both are performed before the training phase rather than during it, which is why this paper interested me: I wanted to see how they handle real-world images properly with kernel estimation and noise injection.
In summary, they estimate degradation kernels from the source images according to their formulation, using data that has no ground truth, and they likewise collect real noise values for injection.
These estimated kernels and injected noise are what the authors emphasize as their novel method.
If you want to learn more about this paper, you can follow this link:
https://openaccess.thecvf.com/content_CVPRW_2020/papers/w31/Ji_Real-World_Super-Resolution_via_Kernel_Estimation_and_Noise_Injection_CVPRW_2020_paper.pdf
In this presentation, I'll introduce 'Real-World Super-Resolution via Kernel Estimation and Noise Injection'.
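As a rough illustration of the degradation pipeline summarized above, here is a minimal NumPy sketch: blur a high-resolution image with an estimated kernel, downsample, then inject a noise patch collected from real images. All names and shapes are illustrative assumptions, not the authors' code, and the kernel-estimation step itself is omitted.

```python
import numpy as np

def degrade(hr, kernel, noise_patches, scale=2, rng=None):
    """Make a 'real-world' LR training image: blur the HR image with an
    estimated kernel, downsample by `scale`, then inject a noise patch
    sampled from a pool collected from real images."""
    rng = rng or np.random.default_rng(0)
    kh, kw = kernel.shape
    padded = np.pad(hr, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    H, W = hr.shape
    blurred = np.zeros((H, W))
    for i in range(kh):                  # direct 2-D convolution with the kernel
        for j in range(kw):
            blurred += kernel[i, j] * padded[i:i + H, j:j + W]
    lr = blurred[::scale, ::scale]
    noise = noise_patches[rng.integers(len(noise_patches))]
    return lr + noise[:lr.shape[0], :lr.shape[1]]

hr = np.ones((16, 16))
kernel = np.ones((3, 3)) / 9.0           # stand-in for an estimated kernel
lr = degrade(hr, kernel, [np.zeros((8, 8))], scale=2)
```

In the paper, the kernels and noise patches are estimated from the target-domain images before training, and the resulting LR/HR pairs then supervise the SR network.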
Transfer learning for low frequency extrapolation from shot gathers for FWI a... (Oleg Ovcharenko)
Slides for my talk at EAGE 2019 in London this June. We attempt to extrapolate the missing low-frequency content in seismic data using a deep learning (DL) approach. We generate a set of random subsurface models and use those to produce a synthetic training dataset. We train a supervised DL model to infer a mono-frequency representation of a common shot gather, given respective data at multiple high frequencies. In the end, we show an example of FWI on extrapolated synthetic data and an example of bandwidth extrapolation on a single shot from field data.
Feasibility of moment tensor inversion for a single-well microseismic data us... (Oleg Ovcharenko)
Slides for my talk at GEO 2018 in Manama, Bahrain. We approach the problem of full moment tensor reconstruction when given data from a single well. Only 5 of the 6 moment tensor components are resolved in an isotropic medium (in an anisotropic medium, all 6 might be resolved in theory), whereas the 6th one is only approximated. We propose a data-driven approach to build a representation of all 6 components from the amplitudes of first arrivals at a 3-component geophone using a vanilla feed-forward multilayer perceptron.
Object extraction from satellite imagery using deep learning (Aly Abdelkareem)
Presentation on extracting objects from satellite imagery using deep learning techniques. You will find a comparison of state-of-the-art approaches in computer vision.
https://imatge.upc.edu/web/publications/region-oriented-convolutional-networks-object-retrieval
BSc thesis by Eduard Fontdevila advised by Amaia Salvador and Xavier Giró-i-Nieto.
EET UPC, June 2015.
Surveillance scene classification using machine learning (Utkarsh Contractor)
The problem of scene classification in surveillance footage is of great importance for ensuring security in public areas. With challenges such as low-quality feeds, occlusion, viewpoint variations, and background clutter, the task is both challenging and error-prone, so it is important to keep false positives low to maintain a high detection accuracy. In this paper, we adapt high-performing CNN architectures to identify abandoned luggage in a surveillance feed. We explore several CNN-based approaches, from transfer learning on the ImageNet dataset to object classification using Faster R-CNNs on the COCO dataset. Using network visualization techniques, we gain insight into what the neural network sees and the basis of its classification decisions. The experiments have been conducted on real-world datasets and highlight the complexity of such classifications. The obtained results indicate that a combination of the proposed techniques outperforms the individual approaches.
Mask R-CNN is a conceptually simple, flexible, and general framework for object instance segmentation. This approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding-box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without tricks, Mask R-CNN outperforms all existing single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition.
presentation: https://www.youtube.com/watch?v=FZePQKPEwoo (한국어)
reference: He, Kaiming, et al. "Mask r-cnn." arXiv preprint arXiv:1703.06870 (2017).
Slides by Miriam Bellver from the Computer Vision Reading Group at the Universitat Politecnica de Catalunya about the paper:
Lu, Yongxi, Tara Javidi, and Svetlana Lazebnik. "Adaptive Object Detection Using Adjacency and Zoom Prediction." CVPR 2016
Abstract:
State-of-the-art object detection systems rely on an accurate set of region proposals. Several recent methods use a neural network architecture to hypothesize promising object locations. While these approaches are computationally efficient, they rely on fixed image regions as anchors for predictions. In this paper we propose to use a search strategy that adaptively directs computational resources to sub-regions likely to contain objects. Compared to methods based on fixed anchor locations, our approach naturally adapts to cases where object instances are sparse and small. Our approach is comparable in terms of accuracy to the state-of-the-art Faster R-CNN approach while using two orders of magnitude fewer anchors on average. Code is publicly available.
For the full video of this presentation, please visit:
https://www.embedded-vision.com/platinum-members/embedded-vision-alliance/embedded-vision-training/videos/pages/may-2017-embedded-vision-summit-leontiev
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Anton Leontiev, Embedded Software Architect at ELVEES, JSC, presents the "Designing a Stereo IP Camera From Scratch" tutorial at the May 2017 Embedded Vision Summit.
As the number of cameras in an intelligent video surveillance system increases, server processing of the video quickly becomes a bottleneck. On the other hand, when computer vision algorithms are moved to a resource-limited camera platform, their output quality is often unsatisfactory.
The effectiveness of vision algorithms for surveillance can be greatly improved by using a depth map in addition to the regular image. Thus, using a stereo camera is a way to enable offloading of advanced algorithms from servers to IP cameras. This talk covers the main problems arising during the design of an embedded stereo IP camera, including capturing video streams from two sensors, frame synchronization between sensors, stereo calibration algorithms, and, finally, disparity map calculation.
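The disparity-to-depth step at the end of that pipeline follows the standard pinhole-stereo relation; here is a tiny sketch (the textbook formula, not code from the talk):

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Pinhole stereo model: depth = focal_length * baseline / disparity.
    disparity_px: pixel disparity between left/right views,
    focal_px: focal length in pixels, baseline_m: camera separation in meters."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# A 10 px disparity with a 1000 px focal length and a 10 cm baseline -> 10 m away.
depth_m = disparity_to_depth(10, 1000, 0.1)
```

This is also why stereo calibration matters: errors in the focal length or baseline propagate directly into the depth estimate.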
Image Classification Done Simply using Keras and TensorFlow (Rajiv Shah)
This presentation walks through the process of building an image classifier using Keras with a TensorFlow backend. It will give a basic understanding of image classification and show the techniques used in industry to build image classifiers. The presentation will start with building a simple convolutional network, augmenting the data, using a pretrained network, and finally using transfer learning by modifying the last few layers of a pretrained network. The classification will be based on the classic example of classifying cats and dogs. The code for the presentation can be found at https://github.com/rajshah4/image_keras, and the presentation will discuss how to extend the code to your own pictures to make a custom image classifier.
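As a hedged sketch of the transfer-learning step the talk describes, the pattern is to freeze a pretrained convolutional base and train only a new head. MobileNetV2 stands in for whichever pretrained network the repository actually uses; `weights=None` keeps the example offline, whereas real use would load `weights="imagenet"`.

```python
import tensorflow as tf

# Backbone without its classification head; weights=None keeps this sketch
# offline (for actual transfer learning, use weights="imagenet").
base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False, weights=None)
base.trainable = False  # freeze the convolutional base

inputs = tf.keras.Input(shape=(96, 96, 3))
x = base(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.2)(x)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # cat vs. dog
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

"Modifying the last few layers" then amounts to unfreezing the top of `base` for fine-tuning at a lower learning rate.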
A leading water utility company in the USA faced the challenge of improving its pipeline inspection process to reduce human error and manual inspection time. Pipeline Anomaly Detection automates the identification of defects in pipeline videos captured by an inspection camera and then generates a report.
Hadoop 2.0, and in particular YARN, has opened up a lot of potential applications beyond MapReduce. This presentation explains some of the ways this happened and what you can now do that you couldn't before. It also introduces some new tools (Spark) and infrastructure pieces (Mesos) to achieve even more efficient cluster use.
Locating objects in images (“detection”) quickly and efficiently enables object tracking and counting applications on embedded visual sensors (fixed and mobile). By 2012, progress on techniques for detecting objects in images – a topic of perennial interest in computer vision – had plateaued, and techniques based on histogram of oriented gradients (HOG) were state of the art. Soon, though, convolutional neural networks (CNNs), in addition to classifying objects, were also beginning to become effective at simultaneously detecting objects. Research in CNN-based object detection was jump-started by the groundbreaking region-based CNN (R-CNN). We’ll follow the evolution of neural network algorithms for object detection, starting with R-CNN and proceeding to Fast R-CNN, Faster R-CNN, “You Only Look Once” (YOLO), and up to the latest Single Shot Multibox detector. In this talk, we’ll examine the successive innovations in performance and accuracy embodied in these algorithms – which is a good way to understand the insights behind effective neural-network-based object localization. We’ll also contrast bounding-box approaches with pixel-level segmentation approaches and present pros and cons.
This contains the agenda of the Spark Meetup I organised in Bangalore on Friday, the 23rd of Jan 2014, along with the slides for the talk I gave on distributed deep learning over Spark.
Efficient Reversible Data Hiding Algorithms Based on Dual Predictions (ipij)
In this paper, a new reversible data hiding (RDH) algorithm that is based on the concept of shifting of prediction error histograms is proposed. The algorithm extends the efficient modification of prediction errors (MPE) algorithm by incorporating two predictors and using one prediction error value for data embedding. The motivation behind using two predictors is driven by the fact that predictors have different prediction accuracy, which is directly related to the embedding capacity and quality of the stego image. The key feature of the proposed algorithm lies in using two predictors without the need to communicate additional overhead with the stego image. Basically, the identification of the predictor that is used during embedding is done through a set of rules. The proposed algorithm is further extended to use two and three bins in the prediction errors histogram in order to increase the embedding capacity. Performance evaluation of the proposed algorithm and its extensions showed the advantage of using two predictors in boosting the embedding capacity while providing competitive quality for the stego image.
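To make the histogram-shifting idea concrete, here is a minimal single-predictor, single-bin sketch in plain Python. The paper's actual contribution (dual predictors selected by rules, plus the two- and three-bin extensions) is not reproduced here, and overflow handling at pixel-range boundaries is omitted.

```python
def embed(pixels, bits):
    """Hide bits in a 1-D pixel sequence by shifting the prediction-error
    histogram. Predictor: the previous stego pixel. Positive errors are
    shifted up by 1 to free bin 1; a bit is embedded whenever the error is 0."""
    stego = [pixels[0]]                  # first pixel is the predictor seed
    bit_iter = iter(bits)
    for x in pixels[1:]:
        e = x - stego[-1]
        if e > 0:
            e += 1                       # shift to make room for the payload
        elif e == 0:
            e = next(bit_iter, 0)        # embed one payload bit (0 or 1)
        stego.append(stego[-1] + e)
    return stego

def extract(stego, n_bits):
    """Recover the original pixels and the first n_bits embedded bits."""
    pixels, bits = [stego[0]], []
    for i in range(1, len(stego)):
        e = stego[i] - stego[i - 1]
        if e in (0, 1):                  # a payload bin
            bits.append(e)
            e = 0
        elif e > 1:
            e -= 1                       # undo the shift
        pixels.append(stego[i - 1] + e)
    return pixels, bits[:n_bits]

stego = embed([100, 100, 101, 101, 100, 100, 100], [1, 0, 1])
restored, payload = extract(stego, 3)
# restored equals the original pixels and payload == [1, 0, 1]
```

Reversibility comes from the fact that every error value maps to a distinct stego error value, so extraction can invert the mapping exactly.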
Statistical performance assessment of supervised machine learning algorithms ... (IAESIJAI)
Several studies have shown that an ensemble classifier's effectiveness is directly correlated with the diversity of its members. However, the algorithms used to build the base learners are one of the issues encountered when using a stacking ensemble. Given the number of options, choosing the best ones might be challenging. In this study, we selected some of the most extensively applied supervised machine learning algorithms and performed a performance evaluation in terms of well-known metrics and validation methods using two internet of things (IoT) intrusion detection datasets, namely network-based anomaly internet of things (N-BaIoT) and the internet of things intrusion detection dataset (IoTID20). Friedman and Dunn's tests are used to statistically examine the significant differences between the classifier groups. The goal of this study is to encourage security researchers to develop an intrusion detection system (IDS) using ensemble learning and to propose an appropriate method for selecting diverse base classifiers for a stacking-type ensemble. The performance results indicate that adaptive boosting, gradient boosting (GB), gradient boosting machines (GBM), light gradient boosting machines (LGBM), extreme gradient boosting (XGB), and deep neural network (DNN) classifiers exhibit a better trade-off between the performance parameters and classification time, making them ideal choices for developing anomaly-based IDSs.
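A stacking ensemble of the kind discussed can be sketched with scikit-learn; the base learners and synthetic data below are illustrative stand-ins for the paper's classifier pool and IoT datasets:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Illustrative stand-in data for an intrusion-detection task.
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Diverse boosting-based base learners feeding a logistic-regression meta-learner.
stack = StackingClassifier(
    estimators=[
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("ada", AdaBoostClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(),
)
stack.fit(X_train, y_train)
accuracy = stack.score(X_test, y_test)
```

The study's point is that the choice of `estimators` matters: diverse, well-performing base classifiers give the meta-learner complementary signals to combine.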
Deep Learning: Chapter 11 Practical Methodology (Jason Tsai)
Lecture for Deep Learning 101 study group to be held on June 9th, 2017.
Reference book: https://www.deeplearningbook.org/
Past video archives: https://goo.gl/hxermB
Initiated by Taiwan AI Group (https://www.facebook.com/groups/Taiwan.AI.Group/)
ABSTRACT: In the field of computer science known as machine learning, a computer makes predictions about the tasks it will perform next by examining the data that has been given to it. The computer can access data by interacting with the environment or by using digitized training sets. In contrast to static programming algorithms, which require explicit human guidance, machine learning algorithms may learn from data and generate predictions on their own. Various supervised and unsupervised strategies, including rule-based techniques, logic-based techniques, instance-based techniques, and stochastic techniques, have been presented in order to solve problems. Our paper's main goal is to present a comprehensive comparison of various cutting-edge supervised machine learning techniques.
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & Alluxio (Alluxio, Inc.)
Alluxio Global Online Meetup
Apr 23, 2020
For more Alluxio events: https://www.alluxio.io/events/
Speakers:
Jiao (Jennie) Wang, Intel
Tsai Louie, Intel
Bin Fan, Alluxio
Today, many people run deep learning applications with training data from separate storage such as object storage or remote data centers. This presentation will demo the Intel Analytics Zoo + Alluxio stack, an architecture that enables high performance while keeping cost and resource efficiency balanced, without the network becoming an I/O bottleneck.
Intel Analytics Zoo is a unified data analytics and AI platform open-sourced by Intel. It seamlessly unites TensorFlow, Keras, PyTorch, Spark, Flink, and Ray programs into an integrated pipeline, which can transparently scale from a laptop to large clusters to process production big data. Alluxio, as an open-source data orchestration layer, accelerates data loading and processing in Analytics Zoo deep learning applications.
In this talk, we will go over:
- What is Analytics Zoo and how it works
- How to run Analytics Zoo with Alluxio in deep learning applications
- Initial performance benchmark results using the Analytics Zoo + Alluxio stack
Covers the basics of artificial neural networks and the motivation for deep learning, and explains certain deep learning networks, including deep belief networks and autoencoders. It also details the challenges of implementing a deep learning network at scale and explains how we implemented a distributed deep learning network over Spark.
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas (Databricks)
Does more data always improve ML models? Is it better to use distributed ML instead of single node ML?
In this talk I will show that while more data often improves DL models in high-variance problem spaces (with semi-structured or unstructured data) such as NLP, image, and video, more data does not significantly improve high-bias problem spaces where traditional ML is more appropriate. Additionally, even in the deep learning domain, single-node models can still outperform distributed models via transfer learning.
Data scientists have pain points: running many models in parallel, automating the experimental setup, and getting others (especially analysts) within an organization to use their models. Databricks addresses these problems using pandas UDFs, the ML runtime, and MLflow.
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions (Valery Tkachenko)
While we have seen tremendous growth in machine learning methods over the last two decades, there is still no one-size-fits-all solution. The next era of cheminformatics and pharmaceutical research in general is focused on mining heterogeneous big data, which is accumulating at an ever-growing pace, and this will likely use more sophisticated algorithms such as Deep Learning (DL). There has been increasing use of DL recently, which has shown powerful advantages in learning from images and languages as well as many other areas. However, the accessibility of this technique for cheminformatics is hindered, as it is not readily available to non-experts. It was therefore our goal to develop a DL framework embedded into a general research data management platform (Open Science Data Repository) which can be used as an API, a standalone tool, or integrated into new software as an autonomous module. In this poster we present results comparing the performance of classic machine learning methods (Naïve Bayes, logistic regression, Support Vector Machines, etc.) with Deep Learning, and we discuss challenges associated with Deep Learning Neural Networks (DNN). DNN learning models of different complexity (up to 6 hidden layers) were built and tuned (different numbers of hidden units per layer, multiple activation functions, optimizers, dropout fractions, regularization parameters, and learning rates) using Keras (https://keras.io/) and TensorFlow (www.tensorflow.org) and applied to various use cases connected to the prediction of physicochemical properties, ADME, toxicity, and properties of materials. It was also shown that using nVidia GPUs significantly accelerates calculations, although memory consumption puts some limits on the performance and applicability of standard toolkits 'as is'.
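A hedged sketch of the kind of tunable Keras DNN described above, with depth, width, dropout fraction, and learning rate as the knobs; the defaults here are illustrative, not the poster's actual hyperparameters:

```python
import tensorflow as tf

def build_dnn(n_features, hidden_layers=3, units=128, dropout=0.25, lr=1e-3):
    """Fully-connected regressor whose depth, width, dropout fraction, and
    learning rate are the tunable parameters mentioned in the abstract."""
    inputs = tf.keras.Input(shape=(n_features,))
    x = inputs
    for _ in range(hidden_layers):
        x = tf.keras.layers.Dense(units, activation="relu")(x)
        x = tf.keras.layers.Dropout(dropout)(x)
    outputs = tf.keras.layers.Dense(1)(x)  # e.g. one predicted property value
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  loss="mse")
    return model

# Up to 6 hidden layers were explored in the poster; input size is illustrative.
model = build_dnn(n_features=64, hidden_layers=6)
```

In practice the inputs would be molecular descriptors or fingerprints, and the same builder can be swept over its arguments for hyperparameter tuning.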
Similar to 'Transferable GAN-generated Images Detection Framework' (20)
[ECCV2022] Generative Domain Adaptation for Face Anti-Spoofing (KIMMINHA3)
It proposes a new perspective on unsupervised domain adaptation for face anti-spoofing: the target data is stylized into the source-domain style via image translation, so that the target data can be fit directly to the source model.
To guarantee the stylization, a generative domain adaptation framework combined with inter-domain neural statistic consistency (NSC) and DSC is presented. To guarantee generalization, intra-domain SpecMix is presented to further expand the target data distribution.
Extensive experiments and visualizations demonstrate the effectiveness of the proposed method.
[CVPR'22] Domain Generalization via Shuffled Style Assembly for Face Anti-Spo... (KIMMINHA3)
<Contribution>
Proposes a Shuffled Style Assembly Network (SSAN) that can generalize face anti-spoofing (FAS).
Adopts adversarial learning so that domains become indistinguishable.
For style features, contrastive learning is used to emphasize liveness-related style information while suppressing domain-specific information.
Aggregates existing datasets to build a large-scale benchmark for FAS.
[TIFS'22] Learning Meta Pattern for Face Anti-Spoofing (KIMMINHA3)
Digital displays are made of glass and have high reflection coefficients.
Printed photo attacks tend to present lower image quality due to low Dots Per Inch (DPI) and color degradation.
Previous studies detected spoofing attacks using handcrafted features
(e.g., Local Binary Patterns (LBP), Speeded Up Robust Features (SURF), blurring, etc.)
→ these depend on experts' domain knowledge.
Using only RGB images as input
→ the model overfits to the source domain.
To overcome this, methods emerged that jointly learn hand-crafted features or concatenate HSV channels
→ still not enough to improve the model's generalization performance.
This paper proposes a network that can generate a Meta Pattern (MP) via meta-learning,
and a Hierarchical Fusion Network (HFN) to fuse the generated MP with the RGB image.
Recently, spoof trace disentanglement frameworks have emerged, showing high potential in terms of generalization performance.
However, they are heavily constrained in single-modal input scenarios.
This paper proposes the following:
1. Multi-modal disentanglement model
→ robust generic attack detection
2. Two-stream disentangling network
→ robust on RGB and depth inputs
3. Fusion module
→ generates more informative features from the RGB and depth of the spoof, respectively
I have summarized my research on super-resolution (SR) tasks. In detail, I studied SR as an unsupervised learning problem, which does not need a ground-truth high-resolution dataset. But I have since changed the main topic from unsupervised learning to SR based on continual learning.
I am still pursuing the unsupervised SR research that I could not finish before, so I would be honored to hear from anyone who is interested in SR.
(You can refer to these slides anytime.)
[CVPRW2021] FReTAL: Generalizing Deepfake detection using Knowledge Distillati... (KIMMINHA3)
This is my first paper from when I was a graduate student.
I would really appreciate it if this presentation is useful for you. If you use it, please credit my name or my ID in your future presentation.
We propose a novel domain adaptation framework, “Feature Representation Transfer Adaptation Learning” (FReTAL), based on knowledge distillation and representation learning, that can prevent catastrophic forgetting without accessing the source domain data.
We show that leveraging knowledge distillation and representation learning can enhance adaptability across different deepfake domains.
We demonstrate that our method outperforms baseline approaches on deepfake benchmark datasets with up to 86.97% accuracy on low-quality deepfake detection.
We propose a novel domain adaption framework, “Feature Representation Transfer Adaptation Learning” (FReTAL), based on knowledge distillation and representation learning that can prevent catastrophic forgetting without accessing to the source domain data.
We show that leveraging knowledge distillation and representation learning can enhance adaptability across different deepfake domains.
We demonstrate that our method outperforms baseline approaches on deepfake benchmark datasets with up to 86.97% accuracy on low-quality deepfake detection.
They proposed two novel methods.
1. Stripe-Wise Pruning (SWP)
They propose a new pruning paradigm called SWP (Stripe-Wise Pruning)
They achieve a higher pruning ratio compared to the filter-wise, channel-wise, and group-wise pruning methods.
2. Filter Skeleton (FS)
They propose a new method ‘Filter Skeleton’ to efficiently learn the optimal shape of the filters for pruning.
They didn't much compare with other baselines. But they obviously suggested the novel methods, that is why I choose for review when reviewing the paper. More, they said that It is State-of-the-art (SOTA) method of lately pruning methods.
Methods for interpreting and understanding deep neural networksKIMMINHA3
This paper "methods for interpreting and understanding deep neural networks" was presented in ICASSP 2017, and introduced by G Montavon et al.
Now, the number of citations is more than 1,370. That is, this paper has a lot of things to study for deep learning technology.
Meta learned Confidence for Few-shot LearningKIMMINHA3
This was presented Meta learned Confidence for Few-shot Learning on CVPR in 2020.
Few-shot learning is an important challenge under data scarcity.
When there is a lot of unlabeled data and data scarcity,
a) leveraging nearest neighbor graph
b) using predicted soft or hard labels on unlabeled samples to update the class prototype.
the model confidence may be unreliable, which may lead to incorrect predictions.
Extreme Incetion을 명명한 "Xception"은 적은 파라미터 수로 빠르게 학습시키는 CNN 모델 중 한개이다.
Depth wise Conv -> Point wise Conv 방식인 Depth wise Separable Conv와
모듈간의 분리를 한 Cross-channel correlation 형태인 Inception Hypothesis의 개념을 합친 것이라고 볼 수 있다.
[출처]
Sparable Convolutions :
https://zzsza.github.io/data/2018/02/23/introduction-convolution/
Inception & Xception :
https://hichoe95.tistory.com/49
short text large effect measuring the impact of user reviews on android app s...KIMMINHA3
ppt about short text large effect measuring the impact of user reviews on android app security & privacy
this is for presentation NLP study in Hanyang-University
If Anyone wants, You may use this
1. Data-driven AI Security HCI (DASH) Lab
T-GD: Transferable GAN-generated Images Detection Framework
Minha Kim
Sungkyunkwan University
July 23, 2020
Hyeonseong Jeon, Youngoh Bang, Junyaup Kim, Simon S. Woo
2. Background
• Recent GANs produce high-resolution images that are hardly distinguishable from real ones.
• Realistic image generation is now feasible even through few-shot or single-shot learning.
This paper proposes a novel regularization method with self-training for transfer learning, combining and transforming regularization, augmentation, and self-training.
3. Limitation
● Relying on metadata information
→ Previous methods cannot provide an optimal solution for transfer learning.
5. Transfer learning framework in more detail
Calculate the L2-SP regularizer using the weights of the pre-trained teacher model.
6. L2-SP (L2-Starting Point)
Pre-trained weights of the conv layers
Eq. 1: L2-norm
Eq. 2: L2-SP
L2-SP differs in that the starting point from a well pre-trained source dataset guides the learning process by referring to the information of the pre-trained source dataset.
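Eq. 1 and Eq. 2 appear as images in the original deck. Assuming the standard formulation from the L2-SP paper (Li et al., ICML 2018) that this slide follows, they can be written as:

```latex
% Eq. 1: plain L2-norm (weight decay) regularization
\Omega(w) = \frac{\alpha}{2}\,\lVert w \rVert_2^2

% Eq. 2: L2-SP, penalizing deviation from the pre-trained starting point w'
\Omega(w) = \frac{\alpha}{2}\,\lVert w_{S} - w'_{S} \rVert_2^2
          + \frac{\beta}{2}\,\lVert w_{\bar{S}} \rVert_2^2
```

Here w_S denotes the weights shared with the pre-trained source network and w_S̄ the newly added layers; instead of shrinking weights toward zero, L2-SP shrinks them toward the pre-trained starting point w'_S.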
7. Self-training for L2-SP
Eq. 3: binary cross-entropy
Eq. 4: γ (inversely proportional to the teacher loss)
Eq. 5: final loss function = binary cross-entropy + L2-norm + L2-SP (self-training)
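Eqs. 3–5 are likewise images in the deck; a reconstruction based on the speaker notes (same symbols as the L2-SP slide; the paper's exact coefficients may differ) is:

```latex
% Eq. 3: binary cross-entropy on the noise-injected target input
\mathcal{L}_{CE} = -\big[\, y \log F_w(\tilde{x}^{noised})
                  + (1-y) \log\big(1 - F_w(\tilde{x}^{noised})\big) \big]

% Eq. 4: self-training weight, a sigmoid of the negated, scaled teacher loss
\gamma = \sigma\!\big( -s \cdot \mathcal{L}\big(F_{w'}(\tilde{x}),\, y\big) \big)

% Eq. 5: final loss = cross-entropy + gamma-weighted L2-SP + L2-norm on new layers
\mathcal{L} = \mathcal{L}_{CE}
            + \gamma \,\frac{\alpha}{2} \lVert w_S - w'_S \rVert_2^2
            + \frac{\beta}{2} \lVert w_{\bar{S}} \rVert_2^2
```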
9. Baselines
General transfer
● It is common practice to freeze some weights of the pre-trained model from the source dataset, and fine-tune the model with weight decay on the target dataset.
ForensicTransfer
● ForensicTransfer introduced an autoencoder for GAN-image detection.
● Although ForensicTransfer showed promise for model transferability, its performance remains mediocre.
Cozzolino, D., Thies, J., Rössler, A., Riess, C., Nießner, M., and Verdoliva, L. ForensicTransfer: Weakly-supervised domain adaptation for forgery detection. arXiv preprint arXiv:1812.02510, 2018.
10. Performance Results
As you can see, even though T-GD performs transfer learning, it has been confirmed to maintain high performance even when the source and target datasets are of different types.
11. ResNext vs. EfficientNet
When comparing T-GD across different base models, EfficientNet shows better performance in T-GD even though it has fewer parameters.
Therefore, T-GD performance is not directly related to the number of parameters.
12. Non-face GAN-image detection
T-GD is effective not only for GAN-generated face detection, but also for non-face tasks.
13. Self-training & Data augmentation effects
To avoid over-fitting in transfer learning: intra-class CutMix, JPEG compression, Gaussian blur, and random horizontal flip.
The AUROC with self-training and augmentation is higher than the AUROC without them.
15. Contribution
● Maintains high performance and overcomes catastrophic forgetting during transfer learning.
● Effectively detects state-of-the-art GAN images with a small amount of data, without any metadata information.
(Hello, I’m glad to be here with you today.)
(Let me start off by briefly introducing myself first.
My name is Minha Kim.
I am going to transfer from Hanyang University to the Department of Software at Sungkyunkwan University.)
The second paper I’m going to talk about is T-GD: Transferable GAN-generated Images Detection Framework.
It was published at ICML 2020.
Recent advancements in Generative Adversarial Networks (GANs) enable the generation of realistic images, which has now become feasible through few-shot or single-shot learning.
Even high-resolution images produced by the latest GANs are hardly distinguishable from real images by human inspection.
While many studies on transfer learning have already shown impressive performance, they have not been applied to GAN-image detection.
Now I will talk about the limitations of previous works.
First, some methods rely on metadata, such as GAN-model information, for detection.
Second, data augmentation methods such as JPEG compression and Gaussian blur are not fully explained in a generalized way.
Third, they show relatively weak results for transfer learning within GAN-image detection.
That is, they cannot provide an optimal solution for transfer learning.
-------------------------------
What is metadata information?
Metadata is information about the data itself.
For example, if an image was generated by PGGAN, its metadata shows that it was created with PGGAN.
Let me explain the framework of T-GD.
T-GD consists of a teacher classifier and a student classifier.
The teacher classifier is a pre-trained classifier, and the student classifier is the one we train using L2-SP and self-training.
Let me explain the transfer learning framework in more detail.
First, calculate the L2-SP using the weights of the pre-trained teacher model for self-training.
Then, apply binary cross-entropy and apply self-training as suggested in this paper.
Through this self-training, we fit the target data while automatically adjusting the regularization strength.
--------------------------------
Stochastic Depth?
Let me explain L2-SP for self-training briefly.
The weights of the model pre-trained on the source dataset are used as the SPAR (starting point as the reference).
We use L2-SP for transfer learning,
which regularizes the weight variation of the target model by referring to the weights pre-trained on the source dataset.
L2-SP differs in that the starting point from a well pre-trained source dataset guides the learning process by referring to the information of the pre-trained source dataset.
This method requires neither freezing the weights of the pre-trained model nor using weight decay.
----------------------------------------------
Why use L2-SP?
This method requires neither freezing the weights of the pre-trained model nor using weight decay.
Why regularization?
Regularization can lead to better optimization by preventing over-fitting when learning from scratch.
1. The loss function uses binary cross-entropy, as shown in Eq. 3.
The input data, x̃ᵢ^noised (tilde x), comes from the target dataset with noise injection.
F_w' denotes the model pre-trained on the source dataset.
2. For stable self-training, the gamma value is obtained by feeding the target data into the teacher model, as in Eq. 4.
As you can see, when you put the target data into the teacher model, the loss and gamma values are inversely proportional. This means that if the teacher model considers the target data unfamiliar, it lowers the gamma value so that the student can learn more from the target.
The negative value of the teacher loss is transformed by the sigmoid function to give γ in Eq. 4.
3. The final loss function, as shown in Eq. 5, is composed of a cross-entropy term and an L2-SP term for the self-training of the student model.
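As a minimal sketch of how these three steps fit together, assuming the gamma form described above (a sigmoid of the negated, scaled teacher loss) and illustrative values for alpha and s (not the paper's tuned settings):

```python
import numpy as np

def bce(y, p, eps=1e-7):
    """Eq. 3: binary cross-entropy between labels y and predictions p."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def gamma(teacher_loss, s=1.0):
    """Eq. 4: sigmoid of the negated, scaled teacher loss.
    A large teacher loss (unfamiliar target data) gives a small gamma,
    relaxing the pull toward the source weights."""
    return 1.0 / (1.0 + np.exp(s * teacher_loss))

def tgd_loss(y, student_pred, teacher_pred, w_student, w_teacher,
             alpha=0.01, s=1.0):
    """Eq. 5 (sketch): cross-entropy plus a gamma-weighted L2-SP penalty
    tying the student's shared weights to the teacher's starting point.
    (The beta-weighted L2-norm on newly added layers is omitted here.)"""
    g = gamma(bce(y, teacher_pred), s)
    l2_sp = 0.5 * alpha * float(np.sum((w_student - w_teacher) ** 2))
    return bce(y, student_pred) + g * l2_sp
```

Note that gamma(0.1) > gamma(2.0): the more unfamiliar the target data looks to the teacher, the weaker the regularization toward the source weights.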
----------------------------------------------
*Advantage of self-training
The proposed method has the advantage of preventing both excessive and insufficient regularization.
---------------------------------------------
ỹ = target data
y = source data
w' = weights of the teacher model
ŷ =
F_w' denotes the model pre-trained on the source dataset.
-----------------------
What are σ and s?
σ denotes a sigmoid function.
s is a hyperparameter taking values from 0.1 to 2.0.
Next is the paper's proposed intra-class CutMix.
This paper introduces a novel augmentation method to solve the over-fitting problem by transforming CutMix.
On the left side is the original CutMix, i.e., inter-class CutMix. Inter-class CutMix replaces the chosen patch with another image's patch in the same location.
On the right side is the intra-class CutMix proposed in this paper.
To put it simply, GAN image A1 is cropped to a random size at a random location, then that patch is replaced with the patch at the same location and size from GAN image A2.
The paper found that inter-class CutMix for binary classification causes highly unstable training.
----------------------------------
What is the difference between the two methods?
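The patch-swap step can be sketched as follows; this is an illustrative NumPy version (the array layout and the patch-size range are assumptions, not the paper's exact settings):

```python
import numpy as np

def intra_class_cutmix(img_a, img_b, rng=None):
    """Intra-class CutMix sketch: crop a random-size patch at a random
    location from img_b (a GAN image of the same class as img_a) and
    paste it into img_a at the same location and size."""
    rng = rng if rng is not None else np.random.default_rng()
    h, w = img_a.shape[:2]
    # pick a random patch size (here up to half of each side, an assumption)
    ph = int(rng.integers(1, h // 2 + 1))
    pw = int(rng.integers(1, w // 2 + 1))
    top = int(rng.integers(0, h - ph + 1))
    left = int(rng.integers(0, w - pw + 1))
    mixed = img_a.copy()
    mixed[top:top + ph, left:left + pw] = img_b[top:top + ph, left:left + pw]
    return mixed
```

Because both inputs come from the same (fake) class, the label stays unchanged, unlike inter-class CutMix, which mixes labels across classes.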
Before I explain the performance comparison results of T-GD, I will briefly explain the two transfer methods introduced as baselines.
First, the general transfer learning method.
It is common practice to freeze some weights of the pre-trained model from the source dataset, and fine-tune the model with weight decay on the target dataset.
Second, ForensicTransfer.
ForensicTransfer introduced an autoencoder for GAN-image detection.
It applies the autoencoder and detects GAN images through the reconstruction error. This learning method has an advantage in lower data usage when the model is well trained.
Although ForensicTransfer showed promise for model transferability, its performance remains mediocre.
In this paper, GN+WS is applied instead of BN to quickly achieve high accuracy even with small batches.
The disadvantage of BN is that model performance depends on a large batch size, because it performs normalization in mini-batch units.
Various techniques, such as Group Normalization (GN), have been proposed to address this issue, but in typical large-batch training situations GN does not perform as well as BN and is significantly underutilized.
Weight Standardization (WS) completely eliminates the mini-batch dependency, and the WS paper showed that, combined with GN, it achieved better performance than BN even in large-batch situations.
-----------------
What is Weight Standardization?
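WS re-parameterizes each convolution filter to have zero mean and unit variance over its own parameters before it is used. A minimal sketch (assuming an (out_channels, in_channels, kh, kw) weight layout, which is an illustrative choice):

```python
import numpy as np

def weight_standardize(w, eps=1e-5):
    """Weight Standardization sketch: normalize each output filter of a
    conv weight tensor w (shape: out, in, kh, kw) to zero mean and unit
    variance, removing any dependence on the mini-batch statistics."""
    mean = w.mean(axis=(1, 2, 3), keepdims=True)
    std = w.std(axis=(1, 2, 3), keepdims=True)
    return (w - mean) / (std + eps)
```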
This table shows the performance results of the baseline methods and T-GD using EfficientNet and ResNext.
The evaluation metric is AUROC (%).
The dataset column indicates the source dataset of the pre-trained model,
and the dataset row indicates the target test set for transfer learning.
The zero-shot category represents the performance of a pre-trained model without any additional training, and
the transfer learning category represents each pre-trained model transferred from the source to the target dataset.
As you can see, even though T-GD performs transfer learning,
it has been confirmed to maintain high performance even when the source and target datasets are of different types.
-------------------------------
What is AUROC?
AUROC is an abbreviation for the Area Under a ROC Curve. It is the area under the ROC curve, computed and graphically represented using TPR and FPR.
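A minimal way to compute AUROC from labels and scores is the rank-based (Mann–Whitney U) identity; this sketch assumes no tied scores, for simplicity:

```python
import numpy as np

def auroc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) identity.
    labels: 0/1 array; scores: higher = more likely positive.
    Assumes no tied scores."""
    labels = np.asarray(labels)
    scores = np.asarray(scores)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)  # rank 1 = lowest score
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    # sum of positive ranks minus its minimum possible value,
    # normalized over all positive/negative pairs
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

A perfect detector scores 1.0; random guessing averages 0.5.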
In comparing T-GD across different base models, the results show subtle differences.
Although the number of parameters in ResNext is greater than that of EfficientNet,
EfficientNet shows better performance in T-GD even though it has fewer parameters.
Although the number of parameters affects classification performance, the performance of EfficientNet was superior to that of ResNext in T-GD transfer tasks.
Therefore, T-GD performance is not directly related to the number of parameters.
------------------
EfficientNet has about 3 million parameters, and ResNext has about 20 million.
ResNet is a representative model that adjusts model capacity through depth scaling.
Non-face GAN-image detection.
T-GD is effective not only for GAN-generated face detection, but also for non-face tasks.
The authors experimented with transfer learning from non-face GAN images as the source (PGGAN images from LSUN-bedroom and LSUN-bird) to face GAN images as the target.
This achieved a stable AUROC on both detection tasks, as shown in the table.
------------------------------
I explained why self-training is used for L2-SP.
The following table shows the self-training and data augmentation effects.
As you can see, the AUROC with self-training is higher than the AUROC without self-training.
To compare the performance of the model with and without self-training,
all other settings are kept the same.
Next, the data augmentation effect.
To avoid over-fitting in transfer learning, this paper utilized the following data augmentation methods:
intra-class CutMix, JPEG compression, Gaussian blur, and random horizontal flip.
Despite the small reduction in the target AUROC, the drastic increase in the source AUROC implies that over-fitting can be avoided through these augmentation methods in transfer learning, while preventing catastrophic forgetting. // explain only up to here
+
For the target dataset with augmentation, the AUROC of T-GD dropped from 99.38% to 98.13% (1.25%), but the source dataset achieved a 10.04% higher AUROC than the same dataset without augmentation (from 85.04% to 95.08%).
------------------------------------------------------------
If the target AUROC dropped with augmentation, isn't that worse performance?
For the target AUROC alone, yes, but it is a small decrease compared to the roughly 10-percent increase in the source AUROC. It is relatively more effective when augmentation is applied.
--------------------------
What is catastrophic forgetting?
(A neural network shows excellent performance on a single task, but) there is a problem that learning different kinds of tasks significantly reduces the performance on previously learned tasks. This is called catastrophic forgetting.
This is the validation loss in transfer learning for CutMix versus intra-class CutMix.
The validation loss for intra-class CutMix (yellow and red lines) is considerably lower and more stable than that for CutMix (green and blue).
-------------------------------------------
1. This paper presents T-GD, a method to maintain high performance on both the source and target datasets for GAN-image detection during transfer learning.
The paper proposes novel regularization and augmentation techniques, L2-SP self-training and intra-class CutMix, building upon well-known CNN backbone models.
In addition, the target AUROC is increased while the existing source AUROC is preserved using self-training and intra-class CutMix, thus preventing catastrophic forgetting.
2. T-GD achieves high performance on the source dataset by overcoming catastrophic forgetting, and effectively detects state-of-the-art GAN images with only a small volume of data and without any metadata information.
----------------------------------------------
Strengths and weaknesses of this paper?
Advantages?
1. It can perform transfer learning without any metadata.
2. It overcomes catastrophic forgetting.
Weakness?
If there had been comparison results for active CutMix, Cutout, Mixup, etc., in addition to general CutMix, the results of intra-class CutMix could have been made more objective.