Korean Paper Review of "Abnormal Event Detection in Videos using Generative Adversarial Nets"
(Review date: 2021.05.17 @ Soongsil Univ. Cognitive Science Class)
(Paper Review) Abnormal Event Detection in Videos using Generative Adversarial Nets
1. ABNORMAL EVENT DETECTION IN VIDEOS USING GENERATIVE ADVERSARIAL NETS (1/26)
2021/05/17
ABNORMAL EVENT DETECTION
IN VIDEOS USING GENERATIVE
ADVERSARIAL NETS
(An abnormal-event detection method for video using GANs)
Paper Review
Presented by Myeong-Gyu Lee (이명규)
2. INDEX
01 Introduction
02 Paper Overview
03 Conclusion
3. Introduction
Part 01
1. Paper Introduction
2. Related Works
3. Background
4. Paper Introduction
1-1
• Venue: ICIP 2018
  (IEEE International Conference on Image Processing)
• Authors: Mahdyar Ravanbakhsh et al. (University of Genova)
• Citation count: 195
• First author's Google Scholar profile: Mahdyar Ravanbakhsh
• Paper summary: detecting abnormal events in video using a GAN
5. Related Works
1-2
• Various methods for detecting "abnormality"
➢ Detecting abnormality with traditional feature extractors [1, 2, 3, 4, 5, 6, 7]
   (e.g. optical flow, tracklets, etc.)
➢ Detecting abnormality with CNNs [15, 16]
➢ Detecting abnormality with generative methods such as denoising autoencoders [17]
• Why is the anomaly detection task challenging?
➢ Abnormal samples are far too scarce to train on in a supervised fashion
➢ "Abnormality" itself is difficult to define precisely
6. 이명규
ABNORMAL EVENT DETECTION IN VIDEOS USING GENERATIVE ADVERSARIAL NETS (6/26)
↳ What is Anomaly Detection?
Background
1-3
• The task of distinguishing normal samples from abnormal samples
• Detects events that differ substantially from the majority of the data samples
• Applied in many domains, such as bank fraud detection and structural defect detection
Improving Unsupervised Defect Segmentation by Applying Structural Similarity To Autoencoders
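The idea of flagging events that deviate strongly from the majority of samples can be sketched with a simple statistical threshold (a toy illustration only, not the paper's method):

```python
import numpy as np

# Toy 1-D "events": most values cluster near 10, one clearly deviates
data = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 25.0])

# Score each sample by its distance from the mean in standard deviations
z = np.abs(data - data.mean()) / data.std()

# Flag samples more than 2 standard deviations away as anomalies
anomalies = data[z > 2.0]
print(anomalies)  # → [25.]
```

Real video anomaly detection replaces the hand-set threshold and scalar score with a learned model of normality, which is exactly the role the GAN plays in this paper.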
7.
↳ What is GAN?
https://www.notion.so/A-Brief-Introduction-To-GANs-397de071301f4e56b4907a65d93cef7b
Background
1-3
8.
↳ What is GAN?
https://www.notion.so/A-Brief-Introduction-To-GANs-397de071301f4e56b4907a65d93cef7b
Background
1-3
9.
↳ What is GAN?
https://www.notion.so/A-Brief-Introduction-To-GANs-397de071301f4e56b4907a65d93cef7b
Background
1-3
“Want to learn P_model(x) similar to P_data(x)”
• Discriminative Modeling:
• Focuses on the decision boundary
• Only for supervised tasks
• Estimates the probability P(y|x) of label y given sample x
• Generative Modeling:
• Builds a probabilistic model of each class
• Estimates P(x) for a sample x
(for a cGAN, estimates P(x|y) given a condition vector y)
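The P(x)-versus-P(y|x) distinction can be made concrete with a toy 1-D two-class example (an illustrative sketch, unrelated to the paper itself): a generative model fits P(x|y) per class and yields the marginal P(x), while the discriminative quantity is the posterior P(y|x).

```python
import numpy as np

rng = np.random.default_rng(42)
# Two 1-D classes drawn from known Gaussians (the "data distribution")
x0 = rng.normal(0.0, 1.0, 500)   # samples with label y = 0
x1 = rng.normal(3.0, 1.0, 500)   # samples with label y = 1

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Generative modeling: fit a probabilistic model P(x|y) for each class,
# then the marginal P(x) = sum_y P(x|y) P(y), with equal priors here
mu0, s0 = x0.mean(), x0.std()
mu1, s1 = x1.mean(), x1.std()
x_query = 1.5   # midpoint between the two class means
p_x = 0.5 * gauss_pdf(x_query, mu0, s0) + 0.5 * gauss_pdf(x_query, mu1, s1)

# Discriminative quantity: the posterior P(y=1|x) via Bayes' rule
p_y1_given_x = 0.5 * gauss_pdf(x_query, mu1, s1) / p_x
```

At the midpoint between the two class means the posterior is close to 0.5, while P(x) itself says how plausible the sample is at all — the latter is what an anomaly detector cares about.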
10.
↳ GAN Application:
Image to Image Translation
“Latent Vector Magic!”
Background
1-3
11.
↳ GAN Application:
Image to Image Translation
“Sketch to 3D Mesh”
https://brstar96.github.io/paperreview/shoveling/3D-shape-reconstruction-from-sketches-via-multi-view-convolutional-networks/
Background
1-3
12.
↳ What is Optical Flow?
Background
1-3
• A technique that represents apparent motion as a 2-D vector field
(used in this paper as the condition vector of a conditional GAN)
• Applied in a variety of motion-based applications,
e.g. SfM (structure from motion), video compression, video stabilization, …
https://bkshin.tistory.com/entry/OpenCV-31-%EA%B4%91%ED%95%99-%ED%9D%90%EB%A6%84Optical-Flow , https://en.wikipedia.org/wiki/Optical_flow
▲ Gunnar Farnebäck algorithm
(cv2.calcOpticalFlowFarneback)
▲ Lucas-Kanade algorithm
(cv2.calcOpticalFlowPyrLK)
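The Lucas-Kanade step underlying cv2.calcOpticalFlowPyrLK solves a small least-squares system over a local window: [Ix Iy]·v = −It. A minimal single-window numpy sketch of that step (illustrative only, not OpenCV's pyramidal implementation):

```python
import numpy as np

def lucas_kanade_point(I0, I1, y, x, w=3):
    """Estimate the flow vector (vx, vy) at pixel (y, x) from frames I0 -> I1."""
    Iy, Ix = np.gradient(I0)          # spatial gradients of the first frame
    It = I1 - I0                      # temporal gradient
    ys, xs = slice(y - w, y + w + 1), slice(x - w, x + w + 1)
    # Solve [Ix Iy] v = -It in the least-squares sense over the window
    A = np.stack([Ix[ys, xs].ravel(), Iy[ys, xs].ravel()], axis=1)
    b = -It[ys, xs].ravel()
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v                          # (vx, vy)

# A smooth Gaussian bump, then the same bump shifted one pixel to the right
yy, xx = np.mgrid[0:32, 0:32]
I0 = np.exp(-((yy - 16) ** 2 + (xx - 16) ** 2) / (2 * 4.0 ** 2))
I1 = np.roll(I0, 1, axis=1)

vx, vy = lucas_kanade_point(I0, I1, 16, 12)  # a point on the bump's slope
```

The recovered vector is close to (1, 0), matching the one-pixel rightward shift; the dense Farnebäck variant produces such a vector at every pixel, which is the per-pixel flow field the paper feeds to the conditional GAN.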
13.
↳ Optical Flow Application:
Structure from motion
Background
1-3
14.
↳ Optical Flow Application:
Video Stabilization
Background
1-3
15.
Paper Overview
Part 02
1. Datasets
2. Network Architecture
16.
↳
Datasets
2-1
Dataset Spec: UCSD Dataset
• Contains a variety of moving objects
• Provided as two subsets, Ped1 and Ped2
• Ped1: 34 train, 16 test sequences
• Ped2: 16 train, 12 test videos
• Low-resolution 238×158 *.tif files
• Test images come with binary ground-truth mask images
▲ A test image overlaid with its ground-truth mask
17.
↳
Datasets
2-1
Dataset Spec: UMN Dataset
• 11 Videos in 3 Different Scenes (7700 frames)
• Introduced in the paper “Abnormal Crowd Behavior Detection using Social Force Model”
20.
↳
Network Architecture
2-2
Abnormality Detection (1/2) - Computing the Difference Maps
• Abnormal pixels are detected by fusing ∆O = |O_t − Ô_t| and ∆S = |h(F) − h(F̂)|
➢ O_t is computed from the test input frames F_t and F_{t−1} with the method of [28]
➢ Ô_t is produced by G, which is trained only on normal images
➢ h is a pre-trained AlexNet; F̂ is produced by G, trained only on normal images
• Simply taking the difference between the test input frame F and the generated F̂
yields less information than ∆O
➢ Therefore F and F̂ are passed through h, and ∆S is computed from the 5th-layer outputs
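A numpy sketch of the two difference maps (everything here is a hypothetical stand-in: `h` below is just a fixed random linear map playing the role of AlexNet's 5th-layer features, and F̂/Ô_t are simulated as slightly perturbed copies rather than real generator outputs):

```python
import numpy as np

rng = np.random.default_rng(7)
F = rng.random((32, 32))                          # test input frame F
F_hat = F + 0.05 * rng.standard_normal((32, 32))  # stand-in for G's reconstruction F̂
O = rng.random((32, 32))                          # stand-in for optical flow O_t
O_hat = O + 0.05 * rng.standard_normal((32, 32))  # stand-in for generated Ô_t

# Motion difference map: ∆O = |O_t − Ô_t|, computed per pixel
delta_O = np.abs(O - O_hat)

# Semantic difference: compare features h(F) and h(F̂) instead of raw pixels.
# Here h is a fixed random linear map; in the paper it is AlexNet's 5th layer.
W = rng.standard_normal((16, F.size)) / F.size

def h(frame):
    return W @ frame.ravel()

delta_S = np.abs(h(F) - h(F_hat))
```

The key design point survives the simplification: regions that G (trained only on normal data) fails to reconstruct produce large entries in ∆O and ∆S, so reconstruction error itself becomes the abnormality signal.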
21.
↳
Network Architecture
2-2
Abnormality Detection (2/2) - Fusing the Difference Maps
1. Upsample ∆S to the same resolution as ∆O, giving ∆′_S
2. Min-max normalize with respect to the corresponding channel-value range:
N_O(i, j) = (1/m_O) ∆O(i, j), N_S(i, j) = (1/m_S) ∆′_S(i, j)
3. Sum N_S and N_O to obtain the final abnormality heatmap:
A = N_S + λ N_O, where λ = 2
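The fusion steps on this slide can be sketched in numpy (toy difference maps; nearest-neighbour upsampling via `np.kron` stands in for whatever resampling the actual implementation uses):

```python
import numpy as np

rng = np.random.default_rng(0)
delta_O = rng.random((8, 8))      # toy full-resolution ∆O
delta_S_low = rng.random((2, 2))  # toy lower-resolution ∆S (e.g. a conv-layer map)

# 1) Upsample ∆S to the same resolution as ∆O, giving ∆'_S
delta_S = np.kron(delta_S_low, np.ones((4, 4)))

# 2) Normalize each map by its own maximum value (m_O, m_S)
N_O = delta_O / delta_O.max()
N_S = delta_S / delta_S.max()

# 3) Final abnormality heatmap: A = N_S + λ N_O, with λ = 2 as on the slide
lam = 2.0
A = N_S + lam * N_O
```

With λ = 2 the motion channel is weighted twice as heavily as the semantic channel, so A ranges over [0, 3] after normalization.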
22.
Conclusion
Part 03
1. Experiments
2. Conclusion
23.
↳
Experiments
3-1
Visual Results
24.
↳
Conclusion
3-2
Quantitative Results
25.
↳
Conclusion
3-2
Quantitative Results
https://angeloyeo.github.io/2020/08/05/ROC.html
TPR: “the fraction of patients with cancer who are classified as cancer patients”
FPR: “the fraction of patients without cancer who are classified as cancer patients”
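In the cancer-screening phrasing above, TPR and FPR reduce to two ratios over confusion-matrix counts (a generic illustration with made-up numbers):

```python
def tpr_fpr(tp, fn, fp, tn):
    # TPR = TP / (TP + FN): sick patients correctly flagged as sick
    # FPR = FP / (FP + TN): healthy patients incorrectly flagged as sick
    return tp / (tp + fn), fp / (fp + tn)

# 90 of 100 sick patients flagged; 20 of 400 healthy patients flagged
tpr, fpr = tpr_fpr(tp=90, fn=10, fp=20, tn=380)
print(tpr, fpr)  # → 0.9 0.05
```

Sweeping the detection threshold over the abnormality heatmap traces out one (FPR, TPR) point per threshold; the resulting ROC curve and its AUC are what the paper's quantitative comparison reports.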
27.
Thank you for Watching.
Myeong-Gyu LEE | Ph.D. Student @SSU
🧪 Computer Graphics Lab (Advised by Prof. KyoungSu Oh)
Department of Digital Media
💼 Espreso Media co., Application Tech. Development
Assistant Research Engineer
✉ brstar96@naver.com