This document discusses domain transfer and domain adaptation in deep learning. It begins with introductions to domain transfer, which learns a mapping between domains, and domain adaptation, which learns such a mapping with the help of label information. It then covers several approaches for domain transfer, including neural style transfer, instance normalization, and GAN-based methods. It also discusses general approaches for domain adaptation, such as source/target feature matching and target data augmentation.
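Source/target feature matching is often formalized as minimizing a discrepancy measure between the two domains' feature distributions, such as the maximum mean discrepancy (MMD). As a toy illustration (not taken from the survey), a pure-Python MMD between two 1-D samples:

```python
import math
import random

def rbf(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2 * sigma ** 2))

def mmd2(xs, ys, sigma=1.0):
    """Squared maximum mean discrepancy between two 1-D samples."""
    m, n = len(xs), len(ys)
    kxx = sum(rbf(a, b, sigma) for a in xs for b in xs) / (m * m)
    kyy = sum(rbf(a, b, sigma) for a in ys for b in ys) / (n * n)
    kxy = sum(rbf(a, b, sigma) for a in xs for b in ys) / (m * n)
    return kxx + kyy - 2 * kxy

random.seed(0)
src = [random.gauss(0.0, 1.0) for _ in range(200)]       # "source" features
tgt_near = [random.gauss(0.1, 1.0) for _ in range(200)]  # well-adapted target
tgt_far = [random.gauss(3.0, 1.0) for _ in range(200)]   # poorly-adapted target

# A target domain closer to the source yields a smaller discrepancy,
# which is what a feature-matching adaptation loss drives down.
assert mmd2(src, tgt_near) < mmd2(src, tgt_far)
```

In an adaptation setup this quantity would be computed on network features and added to the task loss, pushing the feature extractor to make the two domains indistinguishable.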
Semantic Image Synthesis with Spatially-Adaptive Normalization (Zhedong Zheng)
SPADE is a network architecture for semantic image synthesis that uses spatially-adaptive normalization layers to better preserve semantic information from input masks compared to conventional normalization layers. It replaces global normalization layers with spatially-adaptive normalization layers that modulate normalization parameters using semantic segmentation maps, allowing semantic information to be incorporated throughout image generation.
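A minimal sketch of the idea, assuming a single channel and a per-class scale/shift table standing in for SPADE's convolutionally predicted, spatially varying parameters:

```python
import math

def spade_norm(feat, seg, gamma, beta, eps=1e-5):
    """Normalize a single-channel feature map over its spatial extent,
    then modulate each pixel with a scale/shift chosen by its semantic
    class (a toy stand-in for SPADE's conv-predicted gamma/beta maps)."""
    vals = [v for row in feat for v in row]
    mu = sum(vals) / len(vals)
    var = sum((v - mu) ** 2 for v in vals) / len(vals)
    std = math.sqrt(var + eps)
    return [[gamma[seg[i][j]] * (feat[i][j] - mu) / std + beta[seg[i][j]]
             for j in range(len(feat[0]))] for i in range(len(feat))]

feat = [[1.0, 2.0], [3.0, 4.0]]
seg = [[0, 0], [1, 1]]   # two semantic classes in the segmentation map
out = spade_norm(feat, seg, gamma={0: 1.0, 1: 2.0}, beta={0: 0.0, 1: 5.0})
```

Because gamma and beta vary per pixel with the segmentation map, the semantic layout survives the normalization instead of being washed out, which is the failure mode of a global scale/shift.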
How to use Apache TVM to optimize your ML models (Databricks)
Apache TVM is an open-source machine learning compiler that distills the largest, most powerful deep learning models into lightweight software that can run on the edge. This allows the resulting model to run inference much faster on a variety of target hardware (CPUs, GPUs, FPGAs, and accelerators) and can save significant costs.
In this deep dive, we’ll discuss how Apache TVM works, share the latest and upcoming features, and run a live demo of how to optimize a custom machine learning model.
For the full video of this presentation, please visit:
https://www.edge-ai-vision.com/2021/02/introduction-to-the-tvm-open-source-deep-learning-compiler-stack-a-presentation-from-octoml/
Luis Ceze, Co-founder and CEO of OctoML, a Professor in the Paul G. Allen School of Computer Science and Engineering at the University of Washington, and Venture Partner at Madrona Venture Group, presents the “Introduction to the TVM Open Source Deep Learning Compiler Stack” tutorial at the September 2020 Embedded Vision Summit.
There is an increasing need to bring machine learning to a wide diversity of hardware devices. Current frameworks rely on vendor-specific operator libraries and optimize for a narrow range of server-class GPUs. Deploying workloads to new platforms — such as mobile phones, embedded devices, and accelerators — requires significant manual effort.
In this talk, Ceze presents his work on the TVM stack, which exposes graph- and operator-level optimizations to provide performance portability for deep learning workloads across diverse hardware back-ends. TVM solves optimization challenges specific to deep learning, such as high-level operator fusion, mapping to arbitrary hardware primitives and memory latency hiding. It also automates optimization of low-level programs to hardware characteristics by employing a novel, learning-based cost modeling method for rapid exploration of optimizations.
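Operator fusion, one of the graph-level optimizations mentioned, can be illustrated independently of TVM's actual API. The sketch below uses made-up function names; the point is only that fusing elementwise operators removes the intermediate buffer between them:

```python
# Two elementwise ops executed separately materialize an intermediate
# buffer; a fused kernel applies both per element in a single pass.
def scale(x):
    return 2.0 * x

def relu(x):
    return x if x > 0 else 0.0

def unfused(xs):
    tmp = [scale(x) for x in xs]   # intermediate buffer written to memory
    return [relu(x) for x in tmp]  # then read back for the second op

def fused(xs):
    # One pass, no intermediate storage: what a compiler emits after fusion.
    return [relu(scale(x)) for x in xs]

xs = [-1.0, 0.5, 3.0]
assert unfused(xs) == fused(xs) == [0.0, 1.0, 6.0]
```

On real hardware the win is memory traffic: the fused version touches each element once instead of writing and re-reading a tensor-sized temporary.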
The document provides an introduction to diffusion models. It discusses that diffusion models have achieved state-of-the-art performance in image generation, density estimation, and image editing. Specifically, it covers the Denoising Diffusion Probabilistic Model (DDPM) which reparametrizes the reverse distributions of diffusion models to be more efficient. It also discusses the Denoising Diffusion Implicit Model (DDIM) which generates rough sketches of images and then refines them, significantly reducing the number of sampling steps needed compared to DDPM. In summary, diffusion models have emerged as a highly effective approach for generative modeling tasks.
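The forward (noising) process that DDPM builds on has a closed form: given a noise schedule, x_t can be sampled directly from x_0. A sketch using the standard linear schedule (constants are illustrative, not tuned):

```python
import math
import random

random.seed(0)

# Linear beta schedule; alpha_bar_t is the running product of (1 - beta_t).
T = 100
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]
alpha_bar = []
prod = 1.0
for b in betas:
    prod *= (1.0 - b)
    alpha_bar.append(prod)

def q_sample(x0, t):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = random.gauss(0.0, 1.0)
    return math.sqrt(alpha_bar[t]) * x0 + math.sqrt(1.0 - alpha_bar[t]) * eps

# The signal coefficient decays monotonically, so late timesteps are
# nearly pure noise -- the state the reverse (denoising) model starts from.
assert alpha_bar[0] > alpha_bar[-1] > 0
```

Training then amounts to teaching a network to predict the added noise eps from x_t and t; DDIM reuses the same trained network but takes far fewer, deterministic sampling steps.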
Generative adversarial networks (GANs) are a class of machine learning frameworks where two neural networks, a generator and discriminator, compete against each other. The generator learns to generate new data with the same statistics as the training set to fool the discriminator, while the discriminator learns to better distinguish real samples from generated samples. GANs have applications in image generation, image translation between domains, and image completion. Training GANs can be challenging due to issues like mode collapse.
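The two competing objectives can be written down directly. A sketch of the standard discriminator loss and the non-saturating generator loss for scalar logits (toy values, no actual networks):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def d_loss(d_real_logit, d_fake_logit):
    """Discriminator: push real logits up, fake logits down."""
    return -math.log(sigmoid(d_real_logit)) - math.log(1.0 - sigmoid(d_fake_logit))

def g_loss(d_fake_logit):
    """Non-saturating generator loss: make fakes look real to D."""
    return -math.log(sigmoid(d_fake_logit))

# A discriminator that separates real from fake well has a low loss ...
assert d_loss(3.0, -3.0) < d_loss(0.0, 0.0)
# ... while the generator's loss is then high, driving it to improve.
assert g_loss(-3.0) > g_loss(0.0)
```

Mode collapse shows up in this framing as the generator finding one output that reliably fools D rather than covering the data distribution, which the losses above do nothing to prevent on their own.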
[DL Reading Group] Efficiently Modeling Long Sequences with Structured State Spaces (Deep Learning JP)
This document summarizes a research paper on modeling long-range dependencies in sequence data using structured state space models and deep learning. The proposed S4 model (1) derives recurrent and convolutional representations of state space models, (2) improves long-term memory using HiPPO matrices, and (3) efficiently computes state space model convolution kernels. Experiments show S4 outperforms existing methods on various long-range dependency tasks, achieves fast and memory-efficient computation comparable to efficient Transformers, and performs competitively as a general sequence model.
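The recurrent/convolutional duality that S4 exploits already holds for a scalar linear state space model; a sketch (not the S4 parameterization, which adds HiPPO structure and fast kernel computation):

```python
# Scalar linear SSM: x_{k+1} = a*x_k + b*u_k,  y_k = c*x_k.
a, b, c = 0.9, 1.0, 0.5
u = [1.0, 0.0, 2.0, -1.0, 0.5]   # input sequence

# Recurrent view: step the state through time.
x, y_rec = 0.0, []
for uk in u:
    y_rec.append(c * x)
    x = a * x + b * uk

# Convolutional view: y = K * u with kernel K_i = c * a^(i-1) * b
# (K_0 = 0 here because y_k depends only on strictly past inputs).
K = [0.0] + [c * (a ** i) * b for i in range(len(u) - 1)]
y_conv = [sum(K[i] * u[k - i] for i in range(k + 1)) for k in range(len(u))]

# Both views compute exactly the same sequence.
assert all(abs(p - q) < 1e-9 for p, q in zip(y_rec, y_conv))
```

The recurrent form gives O(1)-per-step inference; the convolutional form lets training parallelize over the whole sequence, and S4's contribution is computing that kernel efficiently for structured matrices instead of scalars.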
Automated machine learning lectures given at the Advanced Course on Data Science & Machine Learning. AutoML, hyperparameter optimization, Bayesian optimization, Neural Architecture Search, Meta-learning, MAML
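Hyperparameter optimization in its simplest form is random search over the configuration space; a sketch with a made-up validation objective standing in for an expensive train/validate cycle:

```python
import random

random.seed(0)

def validation_loss(lr, reg):
    """Stand-in for training a model and measuring validation error;
    minimized near lr = 0.1, reg = 0.01 (an invented objective)."""
    return (lr - 0.1) ** 2 + (reg - 0.01) ** 2

best, best_loss = None, float("inf")
for _ in range(200):                       # random search over the space
    cfg = (10 ** random.uniform(-4, 0),    # log-uniform learning rate
           10 ** random.uniform(-4, 0))    # log-uniform regularization
    loss = validation_loss(*cfg)
    if loss < best_loss:
        best, best_loss = cfg, loss

assert best_loss < validation_loss(1.0, 1.0)   # beats a naive default
```

Bayesian optimization improves on this by fitting a surrogate model to the (config, loss) pairs seen so far and proposing the next configuration where expected improvement is highest, rather than sampling blindly.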
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017 (StampedeCon)
Words are no longer sufficient in delivering the search results users are looking for, particularly in relation to image search. Text and languages pose many challenges in describing visual details and providing the necessary context for optimal results. Machine Learning technology opens a new world of search innovation that has yet to be applied by businesses.
In this session, Mike Ranzinger of Shutterstock will share a technical presentation detailing his research on composition aware search. He will also demonstrate how the research led to the launch of AI technology allowing users to more precisely find the image they need within Shutterstock’s collection of more than 150 million images. While the company released a number of AI search enabled tools in 2016, this new technology allows users to search for items in an image and specify where they should be located within the image. The research identifies the networks that localize and describe regions of an image as well as the relationships between things. The goal of this research was to improve the future of search using visual data, contextual search functions, and AI. A combination of multiple machine learning technologies led to this breakthrough.
Artificial Intelligence for Automated Software Testing (Lionel Briand)
This document provides an overview of applying artificial intelligence techniques such as metaheuristic search, machine learning, and natural language processing to problems in automated software testing. It begins with introductions to software testing, relevant AI techniques including genetic algorithms, machine learning, and natural language processing. It then discusses search-based software testing (SBST) as an application of metaheuristic search to problems in test case generation and optimization. Examples are provided of representing test cases as chromosomes for genetic algorithms and defining fitness functions to guide the search for test cases that maximize code coverage.
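The chromosome/fitness setup can be sketched concretely, with a toy coverage function standing in for real branch instrumentation (everything below is invented for illustration):

```python
import random

random.seed(0)
BRANCHES = 8   # pretend the program under test has 8 branches

def coverage(test_case):
    """Toy fitness: branch i is covered iff bit i of the encoded input is set.
    Real SBST would execute the instrumented program instead."""
    return sum((test_case >> i) & 1 for i in range(BRANCHES))

def evolve(pop_size=20, gens=30):
    pop = [random.randrange(2 ** BRANCHES) for _ in range(pop_size)]
    initial_best = max(coverage(t) for t in pop)
    for _ in range(gens):
        pop.sort(key=coverage, reverse=True)
        parents = pop[: pop_size // 2]                # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            mask = random.randrange(2 ** BRANCHES)    # uniform crossover
            child = (a & mask) | (b & ~mask)
            if random.random() < 0.2:                 # bit-flip mutation
                child ^= 1 << random.randrange(BRANCHES)
            children.append(child)
        pop = parents + children
    return max(pop, key=coverage), initial_best

best, initial_best = evolve()
# Elitism guarantees the best coverage never regresses across generations.
assert coverage(best) >= initial_best
```

The encoding (test case as bit string), the fitness (coverage), and the operators (crossover, mutation) are exactly the three design decisions the SBST literature discusses; only the fitness evaluation changes when moving to real programs.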
Machine learning for IoT - unpacking the blackbox (Ivo Andreev)
This document provides an overview of machine learning and how it can be applied to IoT scenarios. It discusses different machine learning algorithms like supervised and unsupervised learning. It also compares various machine learning platforms like Azure ML, BigML, Amazon ML, Google Prediction and IBM Watson ML. It provides guidance on choosing the right algorithm based on the data and diagnosing why machine learning models may fail. It also introduces neural networks and deep learning concepts. Finally, it demonstrates Azure ML capabilities through a predictive maintenance example.
This document discusses various transfer learning techniques for machine learning, including domain adaptation and small sample learning. It proposes three methods for unsupervised domain adaptation that use graph or hypergraph matching to minimize domain discrepancy: 1) Graph Matching, 2) Hypergraph Matching, and 3) Graph Matching with representation learning. For small sample learning, it discusses approaches for few-shot learning and zero-shot learning, and proposes a two-stage solution for few-shot learning that learns a discriminative low-dimensional space and estimates class variance, and a method for zero-shot learning that matches features to semantics. Evaluation on standard datasets shows the proposed methods achieve competitive performance.
Machine Learning with ML.NET and Azure - Andy Cross (Andrew Flatters)
- The document discusses machine learning and ML.NET. It begins with an introduction to the speaker and their background in machine learning.
- Key topics that will be covered include machine learning, ML.NET, Parquet.NET, using machine learning in production, and relevant Azure tools for data and machine learning.
- Examples provided will demonstrate sentiment analysis, finding patterns in taxi fare data, image recognition, and more to illustrate machine learning algorithms and best practices.
The Power of Auto ML and How Does it Work (Ivo Andreev)
Automated ML is an approach to minimize the need for data science effort by enabling domain experts to build ML models without deep knowledge of algorithms, mathematics, or programming. The mechanism works by allowing end users to simply provide data, and the system automatically does the rest by determining the approach to perform the particular ML task. At first this may sound discouraging to those aiming at the “sexiest job of the 21st century” - the data scientists. However, Auto ML should be considered as democratization of ML, rather than automatic data science.
In this session we will talk about how Auto ML works, how is it implemented by Microsoft and how it could improve the productivity of even professional data scientists.
The Machine Learning Workflow with Azure (Ivo Andreev)
This document provides an overview of real world machine learning using Azure. It discusses the machine learning workflow including data understanding, preprocessing, feature engineering, model selection, evaluation and tuning. It then describes various Azure machine learning tools for building, testing and deploying machine learning models including Azure ML Workbench, Studio, Experimentation Service and Model Management Service. It concludes with an upcoming demo of predictive maintenance using Azure ML Studio.
If there is one crucial thing in building ML models, this would be the data preparation. That is the process of transforming raw data to a state where machine learning algorithms could be run to disclose insights and make predictions. Data preparation involves analysis, depends on the nature of the problem and the particular algorithms. As far as there are knowledge and experience involved, there is no such thing as automation, which makes the role of the data scientist the key to success.
ML is trendy and Microsoft already have more than 10 services to support ML. So we will focus on tools like Azure ML Workbench and Python for data preparation, review some common tricks to approach data and experiment in Azure ML Studio.
Hacking Predictive Modeling - RoadSec 2018 (HJ van Veen)
This document provides an overview of machine learning and predictive modeling techniques for hackers and data scientists. It discusses foundational concepts in machine learning like functionalism, connectionism, and black box modeling. It also covers practical techniques like feature engineering, model selection, evaluation, optimization, and popular Python libraries. The document encourages an experimental approach to hacking predictive models through techniques like brute forcing hyperparameters, fuzzing with data permutations, and social engineering within data science communities.
This document is a slide presentation on recent advances in deep learning. It discusses self-supervised learning, which involves using unlabeled data to learn representations by predicting structural information within the data. The presentation covers pretext tasks, invariance-based approaches, and generation-based approaches for self-supervised learning in computer vision and natural language processing. It provides examples of specific self-supervised methods like predicting image rotations, clustering representations to generate pseudo-labels, and masked language modeling.
In this video from the ISC Big Data'14 Conference, Ted Willke from Intel presents: The Analytics Frontier of the Hadoop Eco-System.
"The Hadoop MapReduce framework grew out of an effort to make it easy to express and parallelize simple computations that were routinely performed at Google. It wasn’t long before libraries, like Apache Mahout, were developed to enable matrix factorization, clustering, regression, and other more complex analyses on Hadoop. Now, many of these libraries and their workloads are migrating to Apache Spark because it supports a wider class of applications than MapReduce and is more appropriate for iterative algorithms, interactive processing, and streaming applications. What’s next beyond Spark? Where is big data analytics processing headed? How will data scientists program these systems? In this talk, we will explore the current analytics frontier, the popular debates, and discuss some potentially clever additions. We will also share the emergent data science applications and collaborative university research that inform our thinking."
Learn more:
http://www.isc-events.com/bigdata14/schedule.html
and
http://www.intel.com/content/www/us/en/software/intel-graph-solutions.html
Watch the video presentation: https://www.youtube.com/watch?v=qlfx495Ekw0
Part 2 of the Deep Learning Fundamentals Series, this session discusses Tuning Training (including hyperparameters, overfitting/underfitting), Training Algorithms (including different learning rates, backpropagation), Optimization (including stochastic gradient descent, momentum, Nesterov Accelerated Gradient, RMSprop, Adaptive algorithms - Adam, Adadelta, etc.), and a primer on Convolutional Neural Networks. The demos included in these slides are running on Keras with TensorFlow backend on Databricks.
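The update rules compared in the session fit in a few lines each. A sketch minimizing a 1-D quadratic with plain SGD, momentum, and an Adam-style update (hyperparameters chosen for this toy problem, not as recommendations):

```python
import math

def grad(w):
    return 2.0 * w          # gradient of f(w) = w^2

def sgd(w, lr=0.1, steps=200):
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def momentum(w, lr=0.1, mu=0.9, steps=200):
    v = 0.0
    for _ in range(steps):
        v = mu * v - lr * grad(w)   # velocity accumulates past gradients
        w += v
    return w

def adam(w, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=200):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g           # first-moment estimate
        v = b2 * v + (1 - b2) * g * g       # second-moment estimate
        m_hat = m / (1 - b1 ** t)           # bias correction
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# All three drive w from 5.0 toward the minimum at 0.
for opt in (sgd, momentum, adam):
    assert abs(opt(5.0)) < 0.5
```

On this convex bowl they all succeed; the differences the slides discuss (momentum's acceleration, Adam's per-parameter step sizes) matter on the ill-conditioned, noisy landscapes of real networks.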
The document introduces various computer vision topics including convolutional neural networks, popular CNN architectures, data augmentation, transfer learning, object detection, neural style transfer, generative adversarial networks, and variational autoencoders. It provides overviews of each topic and discusses concepts such as how convolutions work, common CNN architectures like ResNet and VGG, why data augmentation is important, how transfer learning can utilize pre-trained models, how object detection algorithms like YOLO work, the content and style losses used in neural style transfer, how GANs use generators and discriminators, and how VAEs describe images with probability distributions. The document aims to discuss these topics at a practical level and provide insights through examples.
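How convolutions work can be shown with a hand-rolled 'valid' convolution (strictly, cross-correlation, as implemented in most deep learning libraries):

```python
def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation: slide the kernel over the image
    and take the elementwise product-sum at each position."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)] for i in range(oh)]

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
edge = [[-1, 1]]                 # 1x2 horizontal edge detector
out = conv2d(image, edge)
# The filter fires only where intensity jumps from 0 to 1.
assert out == [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

A CNN learns the kernel weights instead of hand-picking them; stacking such layers with nonlinearities is what lets architectures like VGG and ResNet build up from edges to object parts.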
This document discusses using deep learning models to generate text-based regression scores for web domain reputation. It motivates using deep learning models to supplement existing reputation scores for new domains and provide data enrichment. The document outlines preprocessing input domain text data, describing common neural network architectures, and training an initial LSTM model on a dataset of 1.6 million domains and their reputation scores. It discusses results, opportunities for improvement, and options for model deployment.
Ensemble machine learning methods are often used when the true prediction function is not easily approximated by a single algorithm. Practitioners may prefer ensemble algorithms when model performance is valued above other factors such as model complexity and training time. The Super Learner algorithm, also called "stacking", learns the optimal combination of the base learner fits. The latest version of H2O now contains a "Stacked Ensemble" method, which allows the user to stack H2O models into a Super Learner. The Stacked Ensemble method is the native H2O version of stacking, previously only available in the h2oEnsemble R package, and now enables stacking from all the H2O APIs: Python, R, Scala, etc.
Erin is a Statistician and Machine Learning Scientist at H2O.ai. Before joining H2O, she was the Principal Data Scientist at Wise.io (acquired by GE Digital) and Marvin Mobile Security (acquired by Veracode) and the founder of DataScientific, Inc. Erin received her Ph.D. from University of California, Berkeley. Her research focuses on ensemble machine learning, learning from imbalanced binary-outcome data, influence curve based variance estimation and statistical computing.
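The stacking idea can be sketched end to end on a toy regression problem. The tiny grid-searched metalearner below stands in for the metalearner H2O actually fits (and real stacking would combine cross-validated base predictions, not in-sample ones):

```python
# Two base learners of different flexibility on a 1-D regression task.
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m                      # constant predictor

def fit_linear(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    beta = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + beta * (x - mx)   # least-squares line

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 1.1, 1.9, 3.1]
base = [fit_mean(xs, ys), fit_linear(xs, ys)]

def mse(w):
    """Error of the weighted combination of base predictions."""
    return sum((w[0] * base[0](x) + w[1] * base[1](x) - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)

# Metalearner: pick the best convex combination of the base learners.
grid = [(i / 10, 1.0 - i / 10) for i in range(11)]
weights = min(grid, key=mse)

assert mse(weights) <= mse((0.5, 0.5))   # at least as good as naive averaging
```

The "learned combination" is the whole point: instead of averaging base models blindly, the metalearner downweights the ones that contribute nothing on held-out predictions.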
Similar to Domain Transfer and Adaptation Survey (20)
Brief History of Visual Representation Learning (Sangwoo Mo)
The document summarizes the history of visual representation learning in 3 eras: (1) 2012-2015 saw the evolution of deep learning architectures like AlexNet and ResNet; (2) 2016-2019 brought diverse learning paradigms for tasks like few-shot learning and self-supervised learning; (3) 2020-present focuses on scaling laws and foundation models through larger models, data and compute as well as self-supervised methods like MAE and multimodal models like CLIP. The field is now exploring how to scale up vision transformers to match natural language models and better combine self-supervision and generative models.
Learning Visual Representations from Uncurated Data (Sangwoo Mo)
Slide about the defense of my Ph.D. dissertation: "Learning Visual Representations from Uncurated Data"
It includes four papers about
- Learning from multi-object images for contrastive learning [1] and Vision Transformer (ViT) [2]
- Learning with limited labels (semi-sup) for image classification [3] and vision-language [4] models
[1] Mo*, Kang* et al. Object-aware Contrastive Learning for Debiased Scene Representation. NeurIPS’21.
[2] Kang*, Mo* et al. OAMixer: Object-aware Mixing Layer for Vision Transformers. CVPRW’22.
[3] Mo et al. RoPAWS: Robust Semi-supervised Representation Learning from Uncurated Data. ICLR’23.
[4] Mo et al. S-CLIP: Semi-supervised Vision-Language Pre-training using Few Specialist Captions. Under Review.
This document proposes using hyperbolic space to embed hierarchical tree structures, like those that can represent sequences of events in reinforcement learning problems. Specifically, it suggests a method called S-RYM that applies spectral normalization to regularize gradients when training deep reinforcement learning agents with hyperbolic embeddings. This stabilization technique allows naive hyperbolic embeddings to outperform standard Euclidean embeddings. It works by reducing gradient norm explosions during training, allowing the entropy loss to converge properly. The document provides technical details on spectral normalization, hyperbolic space representations, and how S-RYM trains deep reinforcement learning agents with stabilized hyperbolic embeddings.
A Unified Framework for Computer Vision Tasks: (Conditional) Generative Model... (Sangwoo Mo)
Lab seminar introducing three recent works by Ting Chen:
- Pix2seq: A Language Modeling Framework for Object Detection (ICLR’22)
- A Unified Sequence Interface for Vision Tasks (NeurIPS’22)
- A Generalist Framework for Panoptic Segmentation of Images and Videos (submitted to ICLR’23)
Deep Learning Theory Seminar (Chap 3, part 2) (Sangwoo Mo)
This document summarizes key points from a lecture on deep learning theory:
1) It discusses the Maurey sampling technique, which shows that a finite-sample approximation X̂ of a random variable X converges to X as the number of samples k goes to infinity.
2) It proposes extending this technique to sample finite-width neural networks by converting the weight distribution of an infinite network to a probability measure through normalization.
3) The approximation error between outputs of the infinite and finite networks is bounded using Maurey sampling, with the bound converging to zero as the number of samples increases.
Deep Learning Theory Seminar (Chap 1-2, part 1) (Sangwoo Mo)
1. The document discusses the approximation capabilities of deep neural networks. It outlines topics that will be covered, including approximation, optimization, and generalization.
2. For approximation, it shows that a neural network can approximate any smooth function over a compact domain to any desired accuracy by bounding the function norm. Specifically, it presents constructive proofs that a univariate function can be approximated by a 2-layer network and a multivariate function by a 3-layer network.
3. The chapter will prove approximation capabilities of finite-width neural networks, including constructive proofs for specific activations and universal approximation for general activations. It will discuss approximating indicators with ReLU activations.
1) The document discusses object-region video transformers (ORViT) for video recognition. ORViT applies attention at both the patch and object levels.
2) ORViT considers three aspects of objects: the objects themselves, interactions between objects, and object dynamics over time.
3) Experimental results show ORViT outperforms baseline models on action recognition, compositional action recognition, and spatio-temporal action detection tasks. ORViT better captures object-level information and dynamics compared to patch-level attention alone.
Deep Implicit Layers: Learning Structured Problems with Neural Networks (Sangwoo Mo)
Deep implicit layers allow neural networks to solve structured problems by following algorithmic rules. They include layers for convex optimization, discrete optimization, differential equations, and more. The forward pass runs an algorithm, while the backward pass computes gradients using algorithmic properties like KKT conditions. This enables problems like structured prediction, meta-learning, and time series modeling to be solved reliably with neural networks by respecting their underlying structure.
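A minimal fixed-point layer shows the pattern: the forward pass runs a solver, and the backward pass differentiates the solution via the implicit function theorem rather than through the solver's iterations (scalar example, invented for illustration, not from the slides):

```python
import math

def forward(x, w=0.5, iters=100):
    """Forward pass: solve the fixed point z = tanh(w*z + x) by iteration
    (a contraction for |w| < 1, so this converges)."""
    z = 0.0
    for _ in range(iters):
        z = math.tanh(w * z + x)
    return z

def implicit_grad(x, w=0.5):
    """Backward pass: at the fixed point, dz/dx = s / (1 - w*s) with
    s = 1 - z^2 (the tanh derivative), by the implicit function theorem.
    No unrolling of the 100 solver steps is needed."""
    z = forward(x, w)
    s = 1.0 - z * z
    return s / (1.0 - w * s)

# Check against a numerical derivative of the solved forward pass.
x, h = 0.3, 1e-6
num = (forward(x + h) - forward(x - h)) / (2 * h)
assert abs(implicit_grad(x) - num) < 1e-4
```

This is exactly why implicit layers stay memory-cheap: the gradient depends only on the converged solution, not on how many iterations the forward solver took to find it.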
Learning Theory 101 ...and Towards Learning the Flat Minima (Sangwoo Mo)
The document discusses recent theories on why deep neural networks generalize well despite being highly overparameterized. Classic learning theory, which assumes restricting the hypothesis space is necessary for generalization, fails to explain modern neural networks. Recent studies suggest neural networks generalize because 1) their complexity is underestimated and 2) SGD regularization finds flat minima. Sharpness-aware minimization (SAM) directly optimizes for flat minima and consistently improves generalization, especially for vision transformers which have sharper loss landscapes than ResNets. SAM produces more interpretable attention maps and significantly boosts performance of vision transformers and MLP-Mixers on in-domain and out-of-domain tasks.
Lab seminar on
- Sharpness-Aware Minimization for Efficiently Improving Generalization (ICLR 2021)
- When Vision Transformers Outperform ResNets without Pretraining or Strong Data Augmentations (under review)
This document summarizes recent advances in deep generative models with explicit density estimation. It discusses variational autoencoders (VAEs), including techniques to improve VAEs such as importance weighting, semi-amortized inference, and mitigating posterior collapse. It also covers energy-based models, autoregressive models, flow-based models, vector-quantized VAEs, hierarchical VAEs, and diffusion probabilistic models. The document provides an overview of these generative models with a focus on density estimation and generation quality.
This document summarizes research on reducing the computational complexity of self-attention in Transformer models from O(L2) to O(L log L) or O(L). It describes the Reformer model which uses locality-sensitive hashing to achieve O(L log L) complexity, the Linformer model which uses low-rank approximations and random projections to achieve O(L) complexity, and the Synthesizer model which replaces self-attention with dense or random attention. It also briefly discusses the expressive power of sparse Transformer models.
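The Linformer trick can be sketched with plain matrix code: project K and V down to k rows before attending, so the score matrix is L x k instead of L x L. The projection E below is a fixed matrix for illustration, whereas Linformer learns it:

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def softmax_rows(M):
    out = []
    for row in M:
        mx = max(row)
        e = [math.exp(v - mx) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out

def attention(Q, K, V):
    """Scaled dot-product attention; the score matrix has
    len(Q) x len(K) entries, which is the quadratic bottleneck."""
    d = len(Q[0])
    scores = matmul(Q, [list(r) for r in zip(*K)])
    scores = [[v / math.sqrt(d) for v in row] for row in scores]
    return matmul(softmax_rows(scores), V)

def linformer_attention(Q, K, V, E):
    """Project K and V from L rows to k rows first: scores are L x k."""
    return attention(Q, matmul(E, K), matmul(E, V))

L, k, d = 8, 2, 4
Q = [[float((i + j) % 3) for j in range(d)] for i in range(L)]
K = [[float((i * j) % 2) for j in range(d)] for i in range(L)]
V = [[float(i == j) for j in range(d)] for i in range(L)]
E = [[1.0 / L] * L for _ in range(k)]   # fixed (untrained) projection

out = linformer_attention(Q, K, V, E)
assert len(out) == L and len(out[0]) == d   # output shape is unchanged
```

The output keeps its L x d shape, but the score computation dropped from O(L^2) to O(L*k); the paper's low-rank argument is about why this projection loses little for real attention matrices.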
This document summarizes two meta-learning papers:
1) "Meta-Learning with Implicit Gradients" which introduces Implicit Model-Agnostic Meta-Learning (iMAML), an efficient alternative to MAML that computes meta-gradients without differentiating through the inner loop.
2) "Modular Meta-Learning with Shrinkage" which proposes learning a separate set of parameters for each module with different levels of shrinkage, optimized in an alternating manner to avoid collapse.
Deep Learning for Natural Language Processing (Sangwoo Mo)
This document summarizes a lecture on recent advances in deep learning for natural language processing. It discusses improvements to network architectures like attention mechanisms and self-attention, which help models learn long-term dependencies and attend to relevant parts of the input. It also discusses improved training methods to reduce exposure bias and the loss-evaluation mismatch. Newer models presented include the Transformer, which uses only self-attention, and BERT, which introduces a pretrained bidirectional transformer encoder that achieves state-of-the-art results on many NLP tasks.
Improved Training of Wasserstein GANs (WGAN-GP) (Sangwoo Mo)
This document summarizes improved training methods for Wasserstein GANs (WGANs). It begins with an overview of GANs and their limitations, such as gradient vanishing. It then introduces WGANs, which use the Wasserstein distance instead of Jensen-Shannon divergence to provide more meaningful gradients during training. However, weight clipping used in WGANs limits the function space and can cause optimization difficulties. The document proposes using gradient penalty instead of weight clipping to enforce a Lipschitz constraint. It also suggests sampling from an estimated optimal coupling rather than independently sampling real and generated samples to better match theory. Experimental results show the gradient penalty approach improves stability and performance of WGANs on image generation tasks.
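The gradient penalty itself is simple to state: penalize the critic when its input-gradient norm departs from 1. A sketch using a critic restricted to a linear function, so the gradient is available in closed form without autograd:

```python
# For a linear critic D(x) = w*x, the input gradient is w everywhere, so
# the penalty does not depend on where the interpolate lands. Real WGAN-GP
# evaluates the gradient at x_hat = eps*x_real + (1-eps)*x_fake because a
# nonlinear critic's gradient varies with x.
def gradient_penalty(w, lam=10.0):
    grad_norm = abs(w)                      # |dD/dx| for D(x) = w*x
    return lam * (grad_norm - 1.0) ** 2

# The penalty vanishes exactly when the critic is 1-Lipschitz (|w| = 1) ...
assert gradient_penalty(1.0) == 0.0
# ... and grows as the critic's gradient departs from unit norm.
assert gradient_penalty(3.0) > gradient_penalty(1.5) > 0.0
```

Added to the critic loss, this soft constraint replaces weight clipping: it enforces the Lipschitz condition only where it matters (between real and generated samples) without shrinking the critic's function space.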
5th LF Energy Power Grid Model Meet-up SlidesDanBrown980551
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Mircosoft Teams session or in person at TU/e located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
-Insightful presentations covering two practical applications of the Power Grid Model.
-An update on the latest advancements in Power Grid -Model technology during the first and second quarters of 2024.
-An interactive brainstorming session to discuss and propose new feature requests.
-An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
Introduction of Cybersecurity with OSS at Code Europe 2024Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
GraphRAG for Life Science to increase LLM accuracyTomaz Bratanic
GraphRAG for life science domain, where you retriever information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3Data Hops
Free A4 downloadable and printable Cyber Security, Social Engineering Safety and security Training Posters . Promote security awareness in the home or workplace. Lock them Out From training providers datahops.com
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on integration of Salesforce with Bonterra Impact Management.
Interested in deploying an integration with Salesforce for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Generating privacy-protected synthetic data using Secludy and MilvusZilliz
During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyScyllaDB
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...Alex Pruden
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure. In this work we present LatticeFold, the first lattice-based folding protocol based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty, is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing post-quantum security.
Paper Link: https://eprint.iacr.org/2024/257
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/temporal-event-neural-networks-a-more-efficient-alternative-to-the-transformer-a-presentation-from-brainchip/
Chris Jones, Director of Product Management at BrainChip , presents the “Temporal Event Neural Networks: A More Efficient Alternative to the Transformer” tutorial at the May 2024 Embedded Vision Summit.
The expansion of AI services necessitates enhanced computational capabilities on edge devices. Temporal Event Neural Networks (TENNs), developed by BrainChip, represent a novel and highly efficient state-space network. TENNs demonstrate exceptional proficiency in handling multi-dimensional streaming data, facilitating advancements in object detection, action recognition, speech enhancement and language model/sequence generation. Through the utilization of polynomial-based continuous convolutions, TENNs streamline models, expedite training processes and significantly diminish memory requirements, achieving notable reductions of up to 50x in parameters and 5,000x in energy consumption compared to prevailing methodologies like transformers.
Integration with BrainChip’s Akida neuromorphic hardware IP further enhances TENNs’ capabilities, enabling the realization of highly capable, portable and passively cooled edge devices. This presentation delves into the technical innovations underlying TENNs, presents real-world benchmarks, and elucidates how this cutting-edge approach is positioned to revolutionize edge AI across diverse applications.
Taking AI to the Next Level in Manufacturing.pdfssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
6. Ideas and approaches to help build your organization's AI strategy.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Best 20 SEO Techniques To Improve Website Visibility In SERPPixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
Best 20 SEO Techniques To Improve Website Visibility In SERP
Domain Transfer and Adaptation Survey
1. Algorithmic Intelligence Laboratory
Algorithmic Intelligence Laboratory
EE807: Recent Advances in Deep Learning
Lecture 12
Slide made by
Sangwoo Mo
KAIST EE
Domain Transfer and Adaptation
1
2. Algorithmic Intelligence Laboratory
1. Introduction
• What is domain transfer?
• What is domain adaptation?
2. Domain Transfer
• General approaches
• Neural style transfer
• Instance normalization
• GAN-based methods
3. Domain Adaptation
• General approaches
• Source/target feature matching
• Target data augmentation
Table of Contents
2
3. Algorithmic Intelligence Laboratory
1. Introduction
• What is domain transfer?
• What is domain adaptation?
2. Domain Transfer
• General approaches
• Neural style transfer
• Instance normalization
• GAN-based methods
3. Domain Adaptation
• General approaches
• Source/target feature matching
• Target data augmentation
Table of Contents
3
4. Algorithmic Intelligence Laboratory
What is Domain Transfer?
• Learning a mapping between two (or more) domains
• Each domain is typically described by a set of data samples
• Given X_A and X_B, learn a mapping F: X_A → X_B
Colorization1 Super-resolution2 Inpainting3
Machine Translation4 Music Style Transfer5
1. [Zhang et al., 2016]; 2. [Ledig et al., 2017]; 3. [Yeh et al., 2017]; 4. [Artetxe et al., 2018]; 5. [Mor et al., 2018] 4
5. Algorithmic Intelligence Laboratory
What is Domain Adaptation?
• Learning a mapping between two (or more) domains with labels
• A special case of transfer learning
• Given (X_S, Y_S) and X_T, learn a mapping F: X_T → Y_T
[Figure: a taxonomy of transfer learning. Branching on whether source and/or target labels are observable, it distinguishes "traditional" machine learning, inductive transfer learning, transductive transfer learning, and unsupervised transfer learning; domain adaptation falls under transductive transfer learning, alongside related settings such as knowledge distillation, multi-task learning, continual learning, and semi-supervised learning.]
*Figure made by Jongjin Park 5
6. Algorithmic Intelligence Laboratory
1. Introduction
• What is domain transfer?
• What is domain adaptation?
2. Domain Transfer
• General approaches
• Neural style transfer
• Instance normalization
• GAN-based methods
3. Domain Adaptation
• General approaches
• Source/target feature matching
• Target data augmentation
Table of Contents
6
7. Algorithmic Intelligence Laboratory
General Approaches for Domain Transfer
• Domain (or style) transfer aims to
• Keep the content of the source data
• Change the style to match the target data (or domain)
• General optimization objective for producing the output x̂:
  x̂ = argmin_x L_content(x, x_c) + λ · L_style(x, x_s)
• Domain transfer research is largely about designing the content & style losses
[Figure: a content image and a style image combined into a stylized output.]
*Source: Ulyanov et al. “Instance Normalization: The Missing Ingredient for Fast Stylization”, arXiv 2016 7
8. Algorithmic Intelligence Laboratory
Neural Style Transfer [Gatys et al., 2016]
• Idea: Use a well-pretrained neural network (e.g., trained on the ImageNet dataset) for the
content & style losses
• Goal: Given inputs x_c and x_s, generate x̂ having the content of x_c and the style of x_s
• Content loss:
  L_content(x̂, x_c) = Σ_{i,c} ( f^(l)_{i,c}(x̂) − f^(l)_{i,c}(x_c) )²
• High layers of a NN contain (abstract) content information
• Match the feature maps of high layers,
where f^(l)_{i,c} denotes the feature at the l ∈ {1, …, L}-th layer, i ∈ {1, …, H_l × W_l}
denotes the spatial location, and c ∈ {1, …, C_l} denotes the channel
8
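The content loss above can be sketched in a few lines of NumPy (an illustrative toy, not the original implementation; the 0.5 scaling follows the Gatys et al. formulation):

```python
import numpy as np

def content_loss(feat_hat, feat_c):
    """Squared-error content loss between one layer's feature maps.

    feat_hat, feat_c: arrays of shape (C, H*W) -- features of the
    generated image x_hat and the content image x_c at a high layer.
    """
    return 0.5 * np.sum((feat_hat - feat_c) ** 2)

# Toy features: 2 channels, 4 spatial positions.
f_c = np.array([[1.0, 2.0, 3.0, 4.0],
                [0.0, 1.0, 0.0, 1.0]])
f_hat = f_c + 1.0  # every entry off by one
print(content_loss(f_hat, f_c))  # 0.5 * 8 * 1^2 = 4.0
```

In practice `feat_hat` and `feat_c` come from the same fixed, pretrained network evaluated on x̂ and x_c.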
9. Algorithmic Intelligence Laboratory
Neural Style Transfer [Gatys et al., 2016]
• Idea: Use a well-pretrained neural network (e.g., trained on the ImageNet dataset) for the
content & style losses
• Goal: Given inputs x_c and x_s, generate x̂ having the content of x_c and the style of x_s
• Style loss:
  L_style(x̂, x_s) = Σ_l d( F^(l)(x̂), F^(l)(x_s) )
• It has been observed that feature statistics contain style information
• Here F^(l)(x) = { f^(l)_{i,·}(x) : i = 1, …, H_l × W_l } is the collection of feature vectors at layer l
• Match the feature statistics (or the underlying p.d.f.) of low layers
9
*d is some distance between distributions, e.g., maximum mean discrepancy (MMD) or moment matching
10. Algorithmic Intelligence Laboratory
Neural Style Transfer [Gatys et al., 2016]
• Idea: Use a well-pretrained neural network (e.g., trained on the ImageNet dataset) for the
content & style losses
• Goal: Given inputs x_c and x_s, generate x̂ having the content of x_c and the style of x_s
• Style loss:
• Match the feature statistics (or the underlying p.d.f.) of low layers
• For example, one can minimize the Frobenius norm of the difference between Gramian matrices G^(l),
where the (c, c′)-th element of G^(l) is given by
  G^(l)_{c,c′} = Σ_i f^(l)_{i,c} · f^(l)_{i,c′},
hence
  L_style(x̂, x_s) = Σ_l ‖ G^(l)(x̂) − G^(l)(x_s) ‖_F²
10
*Minimizing the Frobenius norm of the Gramian matrix difference is identical to minimizing MMD with the kernel k(x, y) = (xᵀy)² [Li et al., 2017]
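The Gram-matrix style loss above can be sketched with NumPy (a toy illustration for a single layer, not the original implementation; in practice each layer's term is additionally normalized by factors of C_l, H_l, W_l):

```python
import numpy as np

def gram(feat):
    """Gram matrix of one layer's features; feat has shape (C, H*W).
    G[c, c'] = sum_i feat[c, i] * feat[c', i]."""
    return feat @ feat.T

def style_loss(feat_hat, feat_s):
    """Squared Frobenius distance between the Gram matrices of the
    generated image's and the style image's features."""
    return np.sum((gram(feat_hat) - gram(feat_s)) ** 2)

# Toy features: 2 channels, 2 spatial positions.
f_s = np.array([[1.0, 0.0],
                [0.0, 1.0]])
f_hat = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
print(gram(f_hat))             # [[2. 1.] [1. 1.]]
print(style_loss(f_hat, f_s))  # 3.0
```

Matching Gram matrices matches second-order feature statistics, which is exactly the MMD interpretation in the footnote above.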
11. Algorithmic Intelligence Laboratory
Neural Style Transfer [Gatys et al., 2016]
• Idea: Use a well-pretrained neural network (e.g., trained on the ImageNet dataset) for the
content & style losses
• Goal: Given inputs x_c and x_s, generate x̂ having the content of x_c and the style of x_s
• Used the 4th layer for the content loss and the [1–5]th layers for the style loss
• Inference: Solve the following optimization problem (for each pair (x_c, x_s)):
  x̂ = argmin_x L_content(x, x_c) + λ · L_style(x, x_s)
11
[Figure: the style image x_s and content image x_c are compared to the output x̂ via the style loss and content loss, respectively.]
13. Algorithmic Intelligence Laboratory
Fast Neural Style [Johnson et al., 2016]
• Motivation: Inference of Neural Style is too slow!
• Namely, one has to solve an optimization problem for each inference
• Idea: Instead of solving optimization problems, train a neural network f_θ which
translates the input x_c to have the style of x_s (the style is fixed for a single network)
• Now, inference is a single forward pass x̂ = f_θ(x_c), which is ~100x faster than
Neural Style (which solves an optimization problem)
• The loss function is identical to Neural Style (defined by a pretrained network)
13
14. Algorithmic Intelligence Laboratory
Fast Neural Style [Johnson et al., 2016]
• Experimental results: Fast Neural Style is often as good as Neural Style
14
Left: original
Middle: Gatys et al., 2016
Right: Johnson et al., 2016
15. Algorithmic Intelligence Laboratory
Instance Normalization [Ulyanov et al., 2016]
• Motivation: Can we design a better network architecture for domain transfer?
• Idea: Removing the original style of x_c would make restyling easier
• Moreover, as observed in Neural Style, feature statistics represent the style
• Hence, normalizing the feature statistics will remove the original style
*The original motivation of IN was to normalize contrast, but later studies [Li et al., 2017] suggest the real reason for the improvement is the normalization of feature statistics
15
Remove the style of x_c
16. Algorithmic Intelligence Laboratory
Instance Normalization [Ulyanov et al., 2016]
• Motivation: Can we design a better network architecture for domain transfer?
• Idea: normalizing feature statistics to remove the original style
• In particular, normalize 1st & 2nd moments (i.e., mean & variance)
• Similar to batch normalization (BN), but for a single instance
H: height
W: width
C: channel
N: batch size
16
17. Algorithmic Intelligence Laboratory
Instance Normalization [Ulyanov et al., 2016]
• Motivation: Can we design a better network architecture for domain transfer?
• Idea: normalizing feature statistics to remove the original style
• In particular, normalize 1st & 2nd moments (i.e., mean & variance)
• Formally, for features x ∈ ℝ^{N×C×H×W}, Instance Normalization (IN) is
  IN(x)_{nchw} = γ · (x_{nchw} − μ_{nc}) / √(σ²_{nc} + ε) + β,
where μ_{nc} = (1/HW) Σ_{h,w} x_{nchw} and σ²_{nc} = (1/HW) Σ_{h,w} (x_{nchw} − μ_{nc})²
• One can also learn the affine parameters separately for each domain (CIN [Dumoulin
et al., 2017]) or predict them adaptively from the target style x_s (AdaIN [Huang et al., 2017])
*Source: Wu et al. “Group Normalization”, ECCV 2018 17
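A minimal NumPy sketch of IN for features of shape (N, C, H, W), plus an AdaIN variant following the Huang et al. formulation (illustrative code, not the papers' implementations):

```python
import numpy as np

def instance_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Instance Normalization: each (n, c) feature map of x with shape
    (N, C, H, W) is normalized over its own H*W positions."""
    mu = x.mean(axis=(2, 3), keepdims=True)   # per-instance, per-channel mean
    var = x.var(axis=(2, 3), keepdims=True)   # per-instance, per-channel variance
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def adain(x, y, eps=1e-5):
    """AdaIN: restyle content features x with the per-channel
    mean/std of style features y (both shaped (N, C, H, W))."""
    mu_y = y.mean(axis=(2, 3), keepdims=True)
    std_y = np.sqrt(y.var(axis=(2, 3), keepdims=True) + eps)
    return std_y * instance_norm(x, eps=eps) + mu_y

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4, 4)) * 5.0 + 7.0  # arbitrary "content" features
y = rng.standard_normal((2, 3, 4, 4)) * 0.5 - 2.0  # arbitrary "style" features
out = instance_norm(x)   # per-map mean ~0, std ~1: the original style is removed
restyled = adain(x, y)   # per-map statistics now match those of y
```

The only difference from batch normalization is the reduction axes: BN averages over (N, H, W) per channel, while IN averages over (H, W) per channel *and* per instance.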
18. Algorithmic Intelligence Laboratory
Instance Normalization [Ulyanov et al., 2016]
• Experimental results: Instance Normalization improves Fast Neural Style
18
Content Style Neural Style [Gatys et al., 2016]
Fast Neural Style
[Johnson et al., 2016]
baseline + zero padding + zero padding + IN
19. Algorithmic Intelligence Laboratory
Batch-Instance Normalization [Nam et al., 2018]
• Motivation: Can removing style also improve performance of classification?
• Naïvely applying IN hurts performance as style contains some useful info.
• Idea: use a linear combination of BN and IN (with a learnable gate ρ)
• For simplicity, let x̂^(B) and x̂^(I) be the normalized features of BN and IN, where
the statistics μ, σ are defined as before (normalized over H_l × W_l × N with batch size N
for BN, and over H_l × W_l for IN)
• Here, Batch-Instance Normalization (BIN) is
  BIN(x) = γ · ( ρ ⊙ x̂^(B) + (1 − ρ) ⊙ x̂^(I) ) + β, where ρ ∈ [0, 1]^C
19
20. Algorithmic Intelligence Laboratory
Batch-Instance Normalization [Nam et al., 2018]
• Motivation: Can removing style also improve performance of classification?
• Naïvely applying IN hurts performance as style contains some useful info.
• Idea: use a linear combination of BN and IN (with a learnable gate ρ)
• Batch-Instance Normalization (BIN) is a linear combination of BN and IN
• After training, low layers use IN more, and high layers use BN more
20
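The BIN gate can be sketched as follows (a toy NumPy sketch using training-mode statistics and a per-channel gate ρ, not the authors' code):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each channel of x (N, C, H, W) over (N, H, W) --
    training-mode BN without affine parameters."""
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def instance_norm(x, eps=1e-5):
    """Normalize each (n, c) feature map over (H, W)."""
    mu = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def bin_norm(x, rho, gamma=1.0, beta=0.0):
    """Batch-Instance Normalization: a per-channel gate rho in [0, 1]
    interpolates between the BN output and the IN output."""
    rho = np.clip(rho, 0.0, 1.0).reshape(1, -1, 1, 1)
    return gamma * (rho * batch_norm(x) + (1.0 - rho) * instance_norm(x)) + beta

x = np.random.randn(4, 2, 8, 8)
y = bin_norm(x, rho=np.array([1.0, 0.0]))  # channel 0: pure BN, channel 1: pure IN
```

In the paper ρ is a learned parameter clipped to [0, 1] during training, which is why the clip appears here as well.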
21. Algorithmic Intelligence Laboratory
Batch-Instance Normalization [Nam et al., 2018]
• Experimental results: BIN shows better performance than BN
*Source: Nam et al. “Batch-Instance Normalization for Adaptively Style-Invariant Neural Networks”, NIPS 2018 21
Left: train accuracy
Right: test accuracy (top-1) on CIFAR-100
22. Algorithmic Intelligence Laboratory
pix2pix [Isola et al., 2017]
• Motivation: Neural Style shows good performance for artistic styles,
but often fails to generate realistic outputs for more complex domain transfer
• Idea: Use a GAN (which is known to produce realistic images) for the style loss
• Goal: Given a source domain X_S and a target domain X_T, learn a mapping G: X_S → X_T
• In the prior terminology, generate x̂_T with the content of x_S ∈ X_S and the style of X_T
• Style loss:
  L_GAN(G, D) = E_{x_S,x_T}[log D(x_S, x_T)] + E_{x_S}[log(1 − D(x_S, G(x_S)))]
• The generator G fools the discriminator D, which guesses whether the data is in the target domain
22
23. Algorithmic Intelligence Laboratory
pix2pix [Isola et al., 2017]
• Motivation: Neural Style shows good performance for artistic styles,
but often fails to generate realistic outputs for more complex domain transfer
• Idea: Use a GAN (which is known to produce realistic images) for the style loss
• Goal: Given a source domain X_S and a target domain X_T, learn a mapping G: X_S → X_T
• In the prior terminology, generate x̂_T with the content of x_S ∈ X_S and the style of X_T
• Content loss:
• Similar to Neural Style, one can use a content loss defined by a pretrained network
• However, if we have paired data (x_S, x_T), we don't need such a network, but
can simply apply an L1 loss:
  L_L1(G) = E_{x_S,x_T}[ ‖ x_T − G(x_S) ‖₁ ]
23
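Putting the two pix2pix terms together, the generator objective can be sketched as a toy NumPy function (the function name is illustrative; it assumes the discriminator already outputs probabilities, and λ = 100 follows the paper's weighting):

```python
import numpy as np

def pix2pix_generator_loss(d_fake_prob, x_t, x_fake, lam=100.0):
    """Toy pix2pix generator objective: a non-saturating GAN term plus
    a lambda-weighted L1 content term against the paired target x_t.

    d_fake_prob: discriminator probabilities D(x_s, G(x_s)), in (0, 1).
    """
    gan = -np.mean(np.log(d_fake_prob + 1e-8))  # push D(x_s, G(x_s)) toward 1
    l1 = np.mean(np.abs(x_fake - x_t))          # paired content (L1) loss
    return gan + lam * l1

# Toy example: a fake image uniformly 0.1 away from the target,
# with the discriminator undecided (probability 0.5).
x_t = np.zeros((1, 3, 4, 4))
x_fake = np.full((1, 3, 4, 4), 0.1)
loss = pix2pix_generator_loss(np.array([0.5]), x_t, x_fake)
```

Note how λ = 100 makes the paired L1 term dominate early in training, which is consistent with the slide's point that paired data removes the need for a learned content loss.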
24. Algorithmic Intelligence Laboratory
pix2pix [Isola et al., 2017]
• Motivation: Neural Style shows good performance for artistic styles,
but often fails to generate realistic outputs for more complex domain transfer
• Idea: Use a GAN (which is known to produce realistic images) for the style loss
• Goal: Given a source domain X_S and a target domain X_T, learn a mapping G: X_S → X_T
• In the prior terminology, generate x̂_T with the content of x_S ∈ X_S and the style of X_T
• In addition, pix2pix proposes novel architectures:
• Skip-connection generator (e.g., U-Net [Ronneberger et al., 2015])
• PatchGAN discriminator (the output is an N×N matrix rather than a single scalar)
24*Source: https://www.groundai.com/project/patch-based-image-inpainting-with-generative-adversarial-networks/
25. Algorithmic Intelligence Laboratory
pix2pix [Isola et al., 2017]
• Motivation: Neural Style shows good performance for artistic styles,
but often fails to generate realistic outputs for more complex domain transfer
• Idea: Use a GAN (which is known to produce realistic images) for the style loss
• Goal: Given a source domain X_S and a target domain X_T, learn a mapping G: X_S → X_T
• In the prior terminology, generate x̂_T with the content of x_S ∈ X_S and the style of X_T
• Pros & cons (GAN-based methods vs. Neural Style):
• (+) Generates realistic images (Neural Style is mostly for artistic styles)
• (+) Does not rely on a pretrained network (can be applied to non-image data)
• (+) Theoretically sound (Neural Style relies on a feature-statistics heuristic)
• (−) Needs a dataset of the target style, not just a single sample
• (−) Training is less stable (alternating optimization)
25
27. Algorithmic Intelligence Laboratory
CycleGAN [Zhu et al., 2017a]
• Motivation: pix2pix requires paired data from the two domains during training (for the
content L1 loss). Can we extend it to the unpaired (unsupervised) setting?
• Idea: data translated from source domain to target domain, and translated back
to source domain from target domain should be identical to the original image
• Content loss:
• For the source→target generator G_ST and the target→source generator G_TS,
impose the cycle-consistency loss:
  L_cyc = E_{x_S}[ ‖ G_TS(G_ST(x_S)) − x_S ‖₁ ] + E_{x_T}[ ‖ G_ST(G_TS(x_T)) − x_T ‖₁ ]
*There are also other methods, e.g., isometry constraints [Benaim et al., 2017] or complexity constraints [Galanti et al., 2018]
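The cycle-consistency idea can be checked with toy invertible maps standing in for the two generators (hypothetical pixel-wise stand-ins, not real networks):

```python
import numpy as np

def cycle_loss(x, G_st, G_ts):
    """Cycle-consistency: x -> G_st(x) -> G_ts(G_st(x)) should recover x,
    measured with an L1 norm as in CycleGAN."""
    return np.mean(np.abs(G_ts(G_st(x)) - x))

# Toy "generators": invertible pixel-wise maps.
G_st = lambda x: 2.0 * x + 1.0
G_ts_good = lambda y: (y - 1.0) / 2.0  # exact inverse of G_st
G_ts_bad = lambda y: y / 2.0           # not the inverse

x = np.linspace(0.0, 1.0, 5)
print(cycle_loss(x, G_st, G_ts_good))  # 0.0
print(cycle_loss(x, G_st, G_ts_bad))   # 0.5
```

Only when the two generators invert each other does the loss vanish, which is exactly the constraint that substitutes for paired supervision.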
29. Algorithmic Intelligence Laboratory
CycleGAN [Zhu et al., 2017a]
• Results (Failure cases & Solutions)
• CycleGAN suffers from false positive/negative problems
• To relax this issue, one can use
segmentation [Liang et al., 2017] or
predicted attention [Mejjati et al., 2018]
to hard-mask instances
• Or train additional segmentors to provide
shape-consistency loss [Zhang et al., 2018]
*Source: Mejjati et al. “Unsupervised Attention-guided Image to Image Translation”, NIPS 2018
Zhang et al. “Translating and Segmenting Multimodal Medical Volumes with Cycle- and Shape-Consistency Generative Adversarial Network”, CVPR 2018 29
30. Algorithmic Intelligence Laboratory
StarGAN [Choi et al., 2018]
• Motivation: Can we extend domain transfer to multi-domain settings?
• Idea: Provide a domain condition vector c (one-hot encoded) as an input
• For translation, give target domain vector, and for reconstruction, give original
domain vector, hence comes back to the original image
• Discriminator classifies domain in addition to real/fake
• This idea can also be applied to multi-modal settings (e.g., BicycleGAN [Zhu et al.,
2017b], AugCGAN [Almahairi et al., 2018]) by using a random vector z
• In this case, one should maximize the mutual information between G(x_S, z) and z to
avoid mode collapse, i.e., producing a single output regardless of z
30
32. Algorithmic Intelligence Laboratory
MUNIT [Huang et al., 2018]
• Motivation: Can we do multi-modal domain transfer?
• Idea: Disentangle content & style, and restyle with random style
• To this end, train a content encoder E_c: x ↦ c and a style encoder E_s: x ↦ s
• Also, train a decoder D: (c, s) ↦ x
• In addition to the original reconstruction (= cycle-consistency) loss (Fig a),
use cross-domain reconstruction loss (Fig b)
• At inference, one can apply arbitrary style (randomly sampled) to the given content
32
34. Algorithmic Intelligence Laboratory
vid2vid [Wang et al., 2018]
• Motivation: Can we extend domain transfer to video translation?
• Issue: In video generation, one should consider temporal coherence, that
successive images should be smoothly varied
• Idea: design an additional recurrent structure in a model
• Train a sequential generator G that maps the source frames x_{1:t} and previously
generated frames ỹ_{1:t−1} to the next output frame ỹ_t
• In addition to an image discriminator D_I, train a video discriminator D_V, which
compares real sequences against generated sequences
• Caveat: We need paired source/target sequences for this approach
*Source: https://www.youtube.com/watch?v=GrP_aOSXt5U 34
36. Algorithmic Intelligence Laboratory
Recycle-GAN [Bansal et al., 2018]
• Motivation: Can we extend domain transfer to unpaired video translation?
• Problem: We don’t have real target sequences
• Idea: Train prediction models P_X: x_{1:t} ↦ x_{t+1} and P_Y: y_{1:t} ↦ y_{t+1}
36
37. Algorithmic Intelligence Laboratory
Recycle-GAN [Bansal et al., 2018]
• Motivation: Can we extend domain transfer to unpaired video translation?
• Idea: Train prediction models P_X: x_{1:t} ↦ x_{t+1} and P_Y: y_{1:t} ↦ y_{t+1}
• In addition to the GAN loss and cycle-consistency loss, use
• Recurrent loss (for training P_X, P_Y):
  L_rec = Σ_t ‖ P_X(x_{1:t}) − x_{t+1} ‖² (and similarly for P_Y)
• Recycle loss (for training G_XY, G_YX, using P_X, P_Y):
  L_recycle = Σ_t ‖ G_YX(P_Y(G_XY(x_{1:t}))) − x_{t+1} ‖² (and similarly for the other direction)
37
39. Algorithmic Intelligence Laboratory
1. Introduction
• What is domain transfer?
• What is domain adaptation?
2. Domain Transfer
• General approaches
• Neural style transfer
• Instance normalization
• GAN-based methods
3. Domain Adaptation
• General approaches
• Source/target feature matching
• Target data augmentation
Table of Contents
39
40. Algorithmic Intelligence Laboratory
General Approaches for Domain Adaptation
• Domain adaptation aims to learn f: X_T → Y_T using only (X_S, Y_S) and X_T
• There are two general approaches:
• Source/target feature matching: Make the features of X_S and X_T similar
• Target data augmentation: Generate target data (x̂_T, ŷ_T) using domain transfer
40*Source: Ganin et al. “Unsupervised Domain Adaptation by Backpropagation”, ICML 2015
41. Algorithmic Intelligence Laboratory
General Approaches for Domain Adaptation
• Domain adaptation aims to learn f: X_T → Y_T using only (X_S, Y_S) and X_T
• There are two general approaches:
• Source/target feature matching: Make the features of X_S and X_T similar
• Target data augmentation: Generate target data (x̂_T, ŷ_T) using domain transfer
41*Source: Ganin et al. “Unsupervised Domain Adaptation by Backpropagation”, ICML 2015
42. Algorithmic Intelligence Laboratory
Domain adversarial neural network (DANN) [Ganin et al., 2015]
• Goal: Make the features of the source data X_S and target data X_T similar
• Idea: Train a discriminator D which classifies the domain label, and adversarially train
the network to fool the discriminator, so that it fails to distinguish source/target features
• To this end, the gradient from the domain classifier is reversed before being applied to the network
42
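The gradient reversal layer can be sketched as a pair of functions (a conceptual sketch; real implementations hook into a framework's autograd rather than calling a backward function by hand):

```python
import numpy as np

def grl_forward(x):
    """Gradient reversal layer: the identity on the forward pass."""
    return x

def grl_backward(grad_output, lam=1.0):
    """On the backward pass the gradient is multiplied by -lam, so the
    feature extractor is updated to *confuse* the domain classifier
    while the classifier itself still receives ordinary gradients."""
    return -lam * grad_output

features = np.array([0.3, -1.2])
assert np.array_equal(grl_forward(features), features)  # identity forward
g = grl_backward(np.array([0.5, -0.5]), lam=2.0)
print(g)  # [-1.  1.]
```

This single sign flip is what lets DANN train the whole network with one backward pass instead of the alternating updates used by GAN-style methods.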
43. Algorithmic Intelligence Laboratory
Adversarial discriminative domain adaptation (ADDA) [Tzeng et al., 2017]
• Goal: Make the features of the source data X_S and target data X_T similar
• Instead, one can alternately update the discriminator, similar to the GAN scheme
• Also, one can train separate feature extractors for the source/target domains
• It is less stable to train, but shows better performance than gradient reversal
43
44. Algorithmic Intelligence Laboratory
Domain Separation Network (DSN) [Bousmalis et al., 2016]
• Motivation: Is it rational to exactly match features for source/target data?
• Idea: Consider style of each domain in addition to the shared content
• To this end, train a shared content encoder E_c and private style encoders E_p^S, E_p^T
• The classifier ignores the styles and only uses the shared content as input
44
45. Algorithmic Intelligence Laboratory
Residual Transfer Network (RTN) [Long et al., 2016]
• Motivation: Is it rational to exactly match classifiers for source/target data?
• Idea: Define the source classifier as a residual function of the target classifier:
  f_S(x) = f_T(x) + Δf(x)
• To ensure that f_T learns the structure of the target domain, minimize the
entropy for target data, which is a popular method for semi-
supervised learning [Grandvalet & Bengio, 2004]
• Hence, in addition to the (supervised) classification loss L_c and the
feature matching loss D(F_S, F_T) (e.g., a GAN loss), use an
(unsupervised) entropy loss L_H on the target dataset
45*Source: https://wiki.math.uwaterloo.ca/statwiki/index.php?title=Unsupervised_Domain_Adaptation_with_Residual_Transfer_Networks
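The entropy loss on target predictions is a one-liner (illustrative NumPy, not the RTN code): it is small when the classifier is confident and largest for uniform predictions, which is why minimizing it pushes decision boundaries away from dense target regions.

```python
import numpy as np

def entropy_loss(probs, eps=1e-8):
    """Mean Shannon entropy of predicted class distributions
    (rows of `probs`); minimized on unlabeled target data."""
    return -np.mean(np.sum(probs * np.log(probs + eps), axis=1))

confident = np.array([[0.99, 0.01]])  # near one-hot prediction
uncertain = np.array([[0.5, 0.5]])    # maximally uncertain prediction
print(entropy_loss(confident) < entropy_loss(uncertain))  # True
```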
46. Algorithmic Intelligence Laboratory
Domain Randomization [Tobin et al., 2017]
• Motivation: Source/target feature matching can be viewed as disentangling
content and style (remove style of each domain but only keep common content)
• Idea: In the simulation-to-real (sim2real) setting, we can disentangle the content via
domain augmentation
• Train the NN on simulations with randomly generated styles
⇒ the styles average out, and only the content remains
46
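A toy NumPy sketch of the augmentation step: each synthetic training image gets a randomly sampled "style" (a per-channel color cast and a brightness shift) while its content and label stay untouched. The specific jitter ranges here are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def randomize_style(img):
    # img: (H, W, 3) float image in [0, 1] rendered by the simulator
    scale = rng.uniform(0.5, 1.5, size=3)   # random per-channel color cast
    shift = rng.uniform(-0.2, 0.2)          # random global brightness shift
    return np.clip(img * scale + shift, 0.0, 1.0)

img = np.full((4, 4, 3), 0.5)                     # dummy synthetic frame
batch = [randomize_style(img) for _ in range(8)]  # 8 styles, same content
```

Training on many such randomized copies forces the network to rely on the shared content, since no single style is predictive.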
48. General Approaches for Domain Adaptation
• Domain adaptation aims to learn f: X_T → Y_T using only (X_S, Y_S) and X_T
• There are two general approaches:
• Source/target feature matching: make the features of X_S and X_T similar
• Target data augmentation: generate target data (X_T', Y_T') using domain transfer
*Source: Ganin et al. “Unsupervised Domain Adaptation by Backpropagation”, ICML 2015
49. SimGAN [Shrivastava et al., 2017]
• Idea: generate target data with a domain transfer model G: X_S → X_T
• Given source data (x_S, y_S) and the transfer model G, we can generate labeled target data (x_T', y_T') = (G(x_S), y_S) and use it to train the target network
• A popular application is refining synthetic images into realistic ones
*Source: Shrivastava et al. “Learning from Simulated and Unsupervised Images through Adversarial Training”, CVPR 2017
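The key point is that labels transfer for free: the label attached to each synthetic image stays valid after the image is translated. A minimal sketch with a stand-in `G` (the real refiner would be a trained network):

```python
import numpy as np

def G(x):
    # stand-in for a trained refiner network X_S -> X_T;
    # here it just darkens the synthetic image slightly
    return 0.9 * x

x_src = [np.full((2, 2), 1.0), np.full((2, 2), 0.5)]  # synthetic images
y_src = ["cat", "dog"]                                 # their labels

# labeled "target-domain" training set: refined images keep source labels
target_set = [(G(x), y) for x, y in zip(x_src, y_src)]
```

The target network is then trained on `target_set` with an ordinary supervised loss, as if real labeled target data were available.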
50. CyCADA [Hoffman et al., 2018]
• Motivation: can we bridge the gap between the two approaches, source/target feature matching and target data augmentation?
• Idea: combine ADDA (feature matching via a GAN) and CycleGAN (domain transfer)
51. Conclusion
• Domain transfer is about generating data that matches a given content and style
• Hence, we should design two losses: a content loss and a style loss
• Domain adaptation is about transferring knowledge across different domains
• To match source/target features, we apply adversarial or randomization schemes
• We can also apply domain transfer algorithms to generate target data
• Research in this area is still ongoing
• Dozens of papers exist
• Many variants are not covered in these slides
• There remain many interesting research directions
52. Introduction
• [Zhang et al., 2016] Colorful Image Colorization. ECCV 2016.
link : https://arxiv.org/abs/1603.08511
• [Ledig et al., 2017] Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial… CVPR 2017.
link : https://arxiv.org/abs/1609.04802
• [Yeh et al., 2017] Semantic Image Inpainting with Deep Generative Models. CVPR 2017.
link : https://arxiv.org/abs/1607.07539
• [Artetxe et al., 2018] Unsupervised Neural Machine Translation. ICLR 2018.
link : https://arxiv.org/abs/1710.11041
• [Mor et al., 2018] A Universal Music Translation Network. arXiv 2018.
link : https://arxiv.org/abs/1805.07848
Network Architecture
• [Ronneberger et al., 2015] U-Net: Convolutional Networks for Biomedical Image Segmentation. MICCAI 2015.
link : https://arxiv.org/abs/1505.04597
• [He et al., 2016] Deep Residual Learning for Image Recognition. CVPR 2016.
link : https://arxiv.org/abs/1512.03385
References
53. Domain Transfer
• [Gatys et al., 2016] Image Style Transfer Using Convolutional Neural Networks. CVPR 2016.
link : https://ieeexplore.ieee.org/document/7780634/
• [Johnson et al., 2016] Perceptual Losses for Real-Time Style Transfer and Super-Resolution. ECCV 2016.
link : https://arxiv.org/abs/1603.08155
• [Li et al., 2017] Demystifying Neural Style Transfer. IJCAI 2017.
link : https://arxiv.org/abs/1701.01036
• [Taigman et al., 2017] Unsupervised Cross-Domain Image Generation. ICLR 2017.
link : https://arxiv.org/abs/1611.02200
• [Dumoulin et al., 2017] A Learned Representation For Artistic Style. ICLR 2017.
link : https://arxiv.org/abs/1610.07629
• [Isola et al., 2017] Image-to-Image Translation with Conditional Adversarial Networks. CVPR 2017.
link : https://arxiv.org/abs/1611.07004
• [Zhu et al., 2017a] Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial... ICCV 2017.
link : https://arxiv.org/abs/1703.10593
• [Liu et al., 2017] Unsupervised Image-to-Image Translation Networks. NIPS 2017.
link : https://arxiv.org/abs/1703.00848
• [Benaim et al., 2017] One-Sided Unsupervised Domain Mapping. NIPS 2017.
link : https://arxiv.org/abs/1706.00826
• [Galanti et al., 2018] The Role of Minimal Complexity Functions in Unsupervised Learning of… ICLR 2018.
link : https://arxiv.org/abs/1709.00074
54. Domain Transfer - Extensions
• [Zhu et al., 2017b] Toward Multimodal Image-to-Image Translation. NIPS 2017.
link : https://arxiv.org/abs/1711.11586
• [Liang et al., 2017] Generative Semantic Manipulation with Contrasting GAN. arXiv 2017.
link : https://arxiv.org/abs/1708.00315
• [Choi et al., 2018] StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image... CVPR 2018.
link : https://arxiv.org/abs/1711.09020
• [Zhang et al., 2018] Translating and Segmenting Multimodal Medical Volumes with Cycle- and... CVPR 2018.
link : https://arxiv.org/abs/1802.09655
• [Almahairi et al., 2018] Augmented CycleGAN: Learning Many-to-Many Mappings from Unpaired… ICML 2018.
link : https://arxiv.org/abs/1802.10151
• [Huang et al., 2018] Multimodal Unsupervised Image-to-Image Translation. ECCV 2018.
link : https://arxiv.org/abs/1804.04732
• [Bansal et al., 2018] Recycle-GAN: Unsupervised Video Retargeting. ECCV 2018.
link : https://arxiv.org/abs/1808.05174
• [Wang et al., 2018] Video-to-Video Synthesis. NIPS 2018.
link : https://arxiv.org/abs/1808.06601
• [Mejjati et al., 2018] Unsupervised Attention-guided Image to Image Translation. NIPS 2018.
link : https://arxiv.org/abs/1806.02311
• [Chan et al., 2018] Everybody Dance Now. arXiv 2018.
link : https://arxiv.org/abs/1808.07371
55. Instance Normalization
• [Ioffe & Szegedy, 2015] Batch Normalization: Accelerating Deep Network Training by Reducing Internal... ICML 2015.
link : https://arxiv.org/abs/1502.03167
• [Ulyanov et al., 2016] Instance Normalization: The Missing Ingredient for Fast Stylization. arXiv 2016.
link : https://arxiv.org/abs/1607.08022
• [Dumoulin et al., 2017] A Learned Representation For Artistic Style. ICLR 2017.
link : https://arxiv.org/abs/1610.07629
• [Huang et al., 2017] Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization. ICCV 2017.
link : https://arxiv.org/abs/1703.06868
• [Wu et al., 2018] Group Normalization. ECCV 2018.
link : https://arxiv.org/abs/1803.08494
• [Nam et al., 2018] Batch-Instance Normalization for Adaptively Style-Invariant Neural Networks. NIPS 2018.
link : https://arxiv.org/abs/1805.07925
56. Domain Adaptation
• [Grandvalet & Bengio, 2004] Semi-supervised Learning by Entropy Minimization. NIPS 2004.
link : https://papers.nips.cc/paper/2740-semi-supervised-learning-by-entropy-minimization
• [Ganin et al., 2015] Unsupervised Domain Adaptation by Backpropagation. ICML 2015.
link : http://proceedings.mlr.press/v37/ganin15.html
• [Bousmalis et al., 2016] Domain Separation Networks. NIPS 2016.
link : https://arxiv.org/abs/1608.06019
• [Long et al., 2016] Unsupervised Domain Adaptation with Residual Transfer Networks. NIPS 2016.
link : https://arxiv.org/abs/1602.04433
• [Tzeng et al., 2017] Adversarial Discriminative Domain Adaptation. CVPR 2017.
link : https://arxiv.org/abs/1702.05464
• [Shrivastava et al., 2017] Learning from Simulated and Unsupervised Images through Adversarial... CVPR 2017.
link : https://arxiv.org/abs/1612.07828
• [Bousmalis et al., 2017] Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial... CVPR 2017.
link : https://arxiv.org/abs/1612.05424
• [Tobin et al., 2017] Domain Randomization for Transferring Deep Neural Networks from Simulation... IROS 2017.
link : https://arxiv.org/abs/1703.06907
• [Hoffman et al., 2018] CyCADA: Cycle-Consistent Adversarial Domain Adaptation. ICML 2018.
link : https://arxiv.org/abs/1711.03213