Kenneth presented a summary of his participation in the Kaggle TGS Salt Identification Challenge. He discussed the top solutions, which used techniques such as ResNeXt and ResNet backbones, feature pyramid attention, Lovász loss, and test-time augmentation. Kenneth's own solution ranked 1482nd, using a U-Net architecture with Lovász loss and test-time flipping. He analyzed which techniques did and did not work well for this salt segmentation task on seismic images.
This will address two recently concluded Kaggle competitions.
1. Google landmark retrieval
2. Google landmark recognition
The talk will focus on large-scale image retrieval and recognition. The tentative plan for the presentation:
Primer on signal analysis (DFT, Wavelets).
Primer on information retrieval.
Tips for parallelizing your data pipeline.
Description of my approach and detailed discussion of bottlenecks, limitations and lessons.
In-depth analysis of winning solutions.
This will be a combination of theoretical rigor and practical implementation.
Compressed Sensing using Generative Models (kenluck2001)
This document summarizes Kenneth Emeka Odoh's presentation on using generative models for compressed sensing. It introduces compressed sensing and describes how generative models like variational autoencoders (VAEs) and generative adversarial networks (GANs) can be used to solve the underdetermined linear systems that arise in compressed sensing. The document also compares VAEs to standard autoencoders and discusses how generative models may require fewer training examples due to regularization imposing structure on the problem.
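The underdetermined recovery problem can be sketched in a few lines of NumPy. This toy uses a linear "generator" in place of a trained VAE/GAN decoder (an assumption for illustration; the presentation's generators are neural networks), and recovers a signal from far fewer measurements than its ambient dimension by optimising over the latent code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ambient signal x in R^n, measurements y in R^m (m << n), latent z in R^k.
# Recovery is possible because the generator confines x to a low-dimensional set.
n, m, k = 20, 8, 3

W = rng.normal(size=(n, k))     # toy linear "generator": G(z) = W @ z
A = rng.normal(size=(m, n))     # random measurement matrix

z_true = rng.normal(size=k)
x_true = W @ z_true             # signal lying on the generator's range
y = A @ x_true                  # underdetermined measurements (m < n)

# Recover by minimising ||A G(z) - y||^2 over the latent code z
# with plain gradient descent.
z = np.zeros(k)
AW = A @ W
step = 1.0 / np.linalg.norm(AW, 2) ** 2
for _ in range(2000):
    z -= step * AW.T @ (AW @ z - y)

x_hat = W @ z
err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(f"relative recovery error: {err:.2e}")
```

With only 8 measurements of a 20-dimensional signal, the generator's 3-dimensional range makes the system effectively overdetermined, so the latent search recovers the signal almost exactly.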
Accelerating Habanero-Java Programs with OpenCL Generation (Akihiro Hayashi)
Accelerating Habanero-Java Programs with OpenCL Generation. Akihiro Hayashi, Max Grossman, Jisheng Zhao, Jun Shirako, Vivek Sarkar. 10th International Conference on the Principles and Practice of Programming in Java (PPPJ), September 2013.
Our fall 12-week Data Science bootcamp starts on Sept 21st, 2015. Apply now to get a spot!
If you are hiring Data Scientists, call us at (1)888-752-7585 or reach info@nycdatascience.com to share your openings and set up interviews with our excellent students.
---------------------------------------------------------------
Come join our meetup and learn how easily you can use R for advanced machine learning. In this meetup, we will demonstrate how to understand and use XGBoost for Kaggle competitions. Tong is in Canada and will join us remotely through Google Hangouts.
---------------------------------------------------------------
Speaker Bio:
Tong is a data scientist at Supstat Inc and a master's student in Data Mining. He has been an active R programmer and developer for 5 years. He is the author of the R package for XGBoost, one of the most popular and contest-winning tools on kaggle.com today.
Prerequisites (if any): R, calculus
Preparation: A laptop with R installed. Windows users might need to have RTools installed as well.
Agenda:
Introduction to XGBoost
Real World Application
Model Specification
Parameter Introduction
Advanced Features
Kaggle Winning Solution
Event arrangement:
6:45pm Doors open. Come early to network, grab a beer and settle in.
7:00-9:00pm XGBoost Demo
Reference:
https://github.com/dmlc/xgboost
Gradient boosting in practice: a deep dive into XGBoost (Jaroslaw Szymczak)
The document discusses tuning parameters for the XGBoost gradient boosting algorithm. It explores different parameters like max_depth, learning_rate, and n_estimators using a news article classification dataset. Experiments are performed to evaluate the effect of these parameters on model accuracy and training time. The learning curves are also plotted to analyze model performance over iterations.
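The trade-off the document explores between n_estimators and learning_rate can be illustrated without XGBoost itself. Below is a minimal NumPy sketch of gradient boosting with regression stumps (a stand-in for XGBoost's trees; the data and helper names are illustrative): each round fits a stump to the current residuals and adds a shrunken copy of it, so more rounds at a small learning rate fit the training data progressively better.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy 1-D regression data.
X = rng.uniform(0, 10, size=200)
y = np.sin(X) + 0.1 * rng.normal(size=200)

def fit_stump(X, residual):
    """Best single-split regression stump under squared error."""
    best = None
    for t in np.quantile(X, np.linspace(0.05, 0.95, 19)):
        left, right = residual[X <= t], residual[X > t]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(X <= t, left.mean(), right.mean())
        sse = ((residual - pred) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, t, lv, rv = best
    return lambda Z, t=t, lv=lv, rv=rv: np.where(Z <= t, lv, rv)

def boost(X, y, n_estimators, learning_rate):
    """Gradient boosting for squared error: repeatedly fit the residuals."""
    pred = np.zeros_like(y)
    for _ in range(n_estimators):
        stump = fit_stump(X, y - pred)
        pred += learning_rate * stump(X)   # shrinkage controls each step's size
    return pred

mse_few = ((y - boost(X, y, 10, 0.1)) ** 2).mean()
mse_many = ((y - boost(X, y, 200, 0.1)) ** 2).mean()
print(mse_few, mse_many)   # more rounds at a small learning rate fit better
```

The same interaction drives XGBoost tuning: a smaller learning_rate generally needs a larger n_estimators, at the cost of longer training time.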
Gradient Boosted Regression Trees in scikit-learn (DataRobot)
Slides of the talk "Gradient Boosted Regression Trees in scikit-learn" by Peter Prettenhofer and Gilles Louppe held at PyData London 2014.
Abstract:
This talk describes Gradient Boosted Regression Trees (GBRT), a powerful statistical learning technique with applications in a variety of areas, ranging from web page ranking to environmental niche modeling. GBRT is a key ingredient of many winning solutions in data-mining competitions such as the Netflix Prize, the GE Flight Quest, or the Heritage Health Prize.
I will give a brief introduction to the GBRT model and regression trees -- focusing on intuition rather than mathematical formulas. The majority of the talk will be dedicated to an in-depth discussion of how to apply GBRT in practice using scikit-learn. We will cover important topics such as regularization, model tuning and model interpretation that should significantly improve your score on Kaggle.
TensorFlow Dev Summit 2018 Extended: TensorFlow Eager Execution (Taegyun Jeon)
TensorFlow's eager execution allows running operations immediately without building graphs. This makes debugging easier and improves the development workflow. Eager execution can be enabled with tf.enable_eager_execution(). Common operations like variables, gradients, control flow work the same in eager and graph modes. Code written with eager execution in mind is compatible with graph-based execution for deployment. Eager execution provides benefits for iteration and is useful alongside TensorFlow's high-level APIs.
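A minimal sketch of the difference, in TF 2.x syntax where eager execution is the default (in the TF 1.x of the talk's era it was switched on with tf.enable_eager_execution()):

```python
import tensorflow as tf  # TF 2.x: eager execution is on by default

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x * x              # runs immediately; no session or graph build step
grad = tape.gradient(y, x)  # gradients work the same way as in graph mode

print(float(y), float(grad))  # 9.0 6.0
```

Because each operation executes as it is called, intermediate values like `y` can be inspected with an ordinary debugger, which is the workflow improvement the talk highlights.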
This document discusses speaker diarization, which is the process of segmenting an audio stream into homogeneous segments according to speaker identity. It covers feature extraction methods like MFCCs, segmentation using Bayesian Information Criteria to compare Gaussian mixture models, and clustering algorithms like k-means and hierarchical agglomerative clustering. Dendrogram visualizations are used to identify natural speaker clusters. The overall goal is to partition audio recordings of discussions or debates into homogeneous segments to attribute speech segments to individual speakers.
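The clustering step can be sketched with toy data: if each homogeneous segment is summarised as a feature vector (e.g. averaged MFCCs), even a simple 2-means pass groups segments by speaker. Everything below is illustrative stand-in data, not a real diarization pipeline:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-ins for per-segment feature vectors (e.g. averaged MFCCs):
# two "speakers" with well-separated means.
speaker_a = rng.normal(loc=0.0, scale=0.3, size=(10, 13))
speaker_b = rng.normal(loc=2.0, scale=0.3, size=(10, 13))
segments = np.vstack([speaker_a, speaker_b])

def two_means(X, iters=10):
    """Minimal 2-means: assign each segment to its nearest centre, re-fit."""
    centers = X[[0, -1]].copy()   # deterministic seeds, one from each end
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d.argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(2)])
    return labels

labels = two_means(segments)
print(labels)   # the first 10 segments land in one cluster, the last 10 in the other
```

Real systems replace the fixed cluster count with hierarchical agglomerative clustering and a BIC-style stopping criterion, as the document describes, but the grouping principle is the same.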
This document provides MATLAB examples of neural networks, including:
1. Calculating the output of a simple neuron and plotting it over a range of inputs.
2. Creating a custom neural network, defining its topology and transfer functions, training it on sample data, and calculating outputs.
3. Classifying linearly separable data with a perceptron network and plotting the decision boundary.
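A rough NumPy analogue of examples 1 and 3 (the originals are in MATLAB; the data, tanh transfer function, and epoch count here are chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# 1. A single neuron: weighted input plus bias through a tanh transfer function.
def neuron(x, w, b):
    return np.tanh(w @ x + b)

out = neuron(np.array([0.5, -0.2]), np.array([1.0, 2.0]), 0.1)

# 3. A perceptron on linearly separable 2-D data, labels in {-1, +1}.
X = rng.normal(size=(60, 2))
X = X[np.abs(X.sum(axis=1)) > 0.3]      # keep a clear margin around x1 + x2 = 0
y = np.where(X.sum(axis=1) > 0, 1, -1)

w, b = np.zeros(2), 0.0
for _ in range(100):                     # perceptron learning rule
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:       # misclassified -> update
            w += yi * xi
            b += yi

accuracy = (np.where(X @ w + b > 0, 1, -1) == y).mean()
print(out, accuracy)
```

The decision boundary is the line w @ x + b = 0; since the data are separable with a margin, the perceptron rule converges to a boundary that classifies every training point correctly.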
This document describes OpenCL buffer management and provides examples of its use. It discusses declaring buffers, copying data between the host and device, and provides simple examples of image rotation and matrix multiplication. The goal is to demonstrate the basic OpenCL host code needed for buffer handling and to serve as templates for more complex kernels.
The document discusses using machine learning techniques to learn vector representations of SQL queries that can then be used for various workload management tasks without requiring manual feature engineering. It shows that representations learned from SQL strings using models like Doc2Vec and LSTM autoencoders can achieve high accuracy for tasks like predicting query errors, auditing users, and summarizing workloads for index recommendation. These learned representations allow workload management to be database agnostic and avoid maintaining database-specific feature extractors.
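As a crude stand-in for the learned representations (the document's models are Doc2Vec and LSTM autoencoders, not shown here), even a bag-of-tokens vectorization makes the intuition concrete: queries with similar text map to nearby vectors. The queries below are hypothetical:

```python
import re
from collections import Counter
from math import sqrt

# A hypothetical mini-workload of SQL strings.
queries = [
    "SELECT name FROM users WHERE age > 30",
    "SELECT name, email FROM users WHERE age > 18",
    "DELETE FROM orders WHERE status = 'cancelled'",
]

def vectorize(sql):
    """Bag-of-tokens vector: token counts from the lowercased SQL string."""
    return Counter(re.findall(r"[A-Za-z_]+", sql.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)   # Counter returns 0 for absent tokens
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

v = [vectorize(q) for q in queries]
print(cosine(v[0], v[1]), cosine(v[0], v[2]))  # the two SELECTs are closer
```

The learned embeddings serve the same role as these vectors but capture structure beyond shared tokens, which is what lets them stay database agnostic without hand-built feature extractors.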
Photo-realistic Single Image Super-resolution using a Generative Adversarial Network (Hansol Kang)
* Ledig, Christian, et al. "Photo-realistic single image super-resolution using a generative adversarial network." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
EdSketch: Execution-Driven Sketching for Java (Lisa Hua)
Sketching is a relatively recent approach to program synthesis, which has shown much promise. The key idea in sketching is to allow users to write partial programs that have "holes" and provide test harnesses or reference implementations, and let synthesis tools create program fragments that fill the holes such that the resulting complete program has the desired functionality. Traditional solutions to the sketching problem perform a translation to SAT and employ CEGIS. While effective for a range of programs, when applied to real applications, such translation-based approaches have a key limitation: they require either translating all relevant libraries that are invoked directly or indirectly by the given sketch – which can lead to impractical SAT problems – or creating models of those libraries – which can require much manual effort.
This paper introduces execution-driven sketching, a novel approach for synthesis of Java programs using a backtracking search that is commonly employed in software model checkers. The key novelty of our work is to introduce effective pruning strategies to efficiently explore the actual program behaviors in the presence of libraries and to provide a practical solution to sketching small parts of real-world applications, which may use complex constructs of modern languages, such as reflection or native calls. Our tool EdSketch embodies our approach in two forms: a stateful search based on the Java PathFinder model checker; and a stateless search based on re-execution inspired by the VeriSoft model checker. Experimental results show that EdSketch's performance compares well with the well-known SAT-based Sketch system for a range of small but complex programs, and moreover, that EdSketch can complete some sketches that require handling complex constructs.
Deep Learning in Python with TensorFlow for Finance (Ben Ball)
Speaker: Ben Ball
Abstract: Python is becoming the de facto standard for many machine learning applications. We have been using Python with deep learning and other ML techniques, with a focus on prediction and exploitation in transactional markets. I am presenting one of our implementations (a dueling double DQN, a class of reinforcement learning algorithm) in Python using TensorFlow, along with information and background on this class of deep learning algorithm and the application to financial markets we have employed. Attendees will learn how to implement a DQN using TensorFlow, and how to design a system for deep learning for solving a wide range of problems. The code will be available on GitHub for attendees.
Bio: Ben is a believer in making a career out of what you love. He is inspired by the joining of excellent technology and research and likes building software that is easy to use, but does amazing things. His work has spanned 15 years, with a dual focus in AI Software Engineering and Algorithmic Trading. He is currently working as the CTO of http://prediction-machines.com
Video of the presentation:
https://engineers.sg/video/deep-learning-with-python-in-finance-singapore-python-user-group--1875
With a large number of web services offering the same functionality, the Quality of Service (QoS) rendered by a web service becomes a key differentiator. WS-BPEL has emerged as the de facto industry standard for composing web services. Thus, determining the QoS of a composite web service expressed in BPEL can be extremely beneficial. While there has been much work on QoS computation of structured workflows, there exists no tool to ascertain QoS for BPEL processes, which are semantically richer than conventional workflows. We propose a model for estimating three key QoS parameters - Response Time, Cost and Reliability - of an executable BPEL process from the QoS information of its partner services and certain control flow parameters. We have built a tool to compute QoS of a WS-BPEL process that accounts for most workflow patterns that may be expressed by standard WS-BPEL. Another feature of our QoS approach and the tool is that it allows a designer to explore the impact on QoS of using different software fault tolerance techniques like Recovery blocks, N-version programming etc., thereby provisioning QoS computation of mission critical applications that may employ these techniques to achieve high reliability and/or performance.
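The composition rules for the two basic workflow patterns can be sketched directly (function names and the sample numbers are illustrative, not the tool's API): in a sequence, response times and costs add while reliabilities multiply; in a parallel split-join, the response time is that of the slowest branch, and all branches must succeed.

```python
def sequence(services):
    """Sequential composition: times and costs add, reliabilities multiply."""
    r = 1.0
    for s in services:
        r *= s["reliability"]
    return {"time": sum(s["time"] for s in services),
            "cost": sum(s["cost"] for s in services),
            "reliability": r}

def parallel(services):
    """Parallel (AND) split-join: time of the slowest branch; all must succeed."""
    r = 1.0
    for s in services:
        r *= s["reliability"]
    return {"time": max(s["time"] for s in services),
            "cost": sum(s["cost"] for s in services),
            "reliability": r}

# Two hypothetical partner services (time in ms, cost per call).
a = {"time": 120, "cost": 0.02, "reliability": 0.99}
b = {"time": 250, "cost": 0.05, "reliability": 0.95}

print(sequence([a, b]))   # time 370; reliability 0.99 * 0.95
print(parallel([a, b]))   # time 250; reliability unchanged from sequence
```

Fault-tolerance constructs like recovery blocks change the reliability formula (a composite succeeds if any alternate succeeds), which is how the tool lets a designer explore those trade-offs.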
http://imatge-upc.github.io/telecombcn-2016-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.
Paper presented at the 6th International Work-Conference on Ambient Assisted Living.
Abstract: Due to the increasing demand for multi-camera setups and long-term monitoring in vision applications, real-time multi-view action recognition has gained great interest in recent years. In this paper, we propose a multiple kernel learning based fusion framework that employs a motion-based person detector for finding regions of interest and local descriptors with bag-of-words quantisation for feature representation. The experimental results on a multi-view action dataset suggest that the proposed framework significantly outperforms simple fusion techniques and state-of-the-art methods.
[PR12] PR-036 Learning to Remember Rare Events (Taegyun Jeon)
This document summarizes a paper on learning to remember rare events using a memory-augmented neural network. The paper proposes a memory module that stores examples from previous tasks to help learn new rare tasks from only a single example. The memory module is trained end-to-end with the neural network on two tasks: one-shot learning on Omniglot characters and machine translation of rare words. The implementation uses a TensorFlow memory module that stores key-value pairs to retrieve examples similar to a query. Experiments show the memory module improves one-shot learning performance and handles rare words better than baselines.
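The key-value retrieval at the heart of the memory module can be sketched in NumPy (names and dimensions are illustrative; the real module is trained end-to-end in TensorFlow):

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Memory contents: normalised embeddings of stored examples (keys)
# paired with their labels (values).
keys = normalize(rng.normal(size=(5, 16)))
values = np.array([0, 1, 2, 3, 4])

def memory_query(keys, values, q):
    """Return the value of the key most similar (cosine) to the query."""
    sims = keys @ normalize(q)
    return values[np.argmax(sims)]

# A query embedding close to stored key 2 retrieves value 2:
# one-shot recall of a previously seen rare example.
q = keys[2] + 0.05 * rng.normal(size=16)
print(memory_query(keys, values, q))
```

In the paper, the network's embedding of a new input plays the role of `q`, so a rare class seen once can still be predicted by retrieving its stored key-value pair.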
- POSTECH EECE695J, "Deep Learning Fundamentals and Applications to Steel Manufacturing Processes", Week 5
- Contents: Restricted Boltzmann Machine (RBM), various activation functions, data preprocessing, regularization methods, training of a neural network
- Video: https://youtu.be/v4rGPl-8wdo
The document discusses topics related to just-in-time (JIT) compilation in the Java Virtual Machine (JVM). It explains that the JIT compiler is triggered when methods are invoked many times based on invocation and back-edge counters, compiling frequently used code to machine code for improved performance. It also discusses why ahead-of-time compilation is not well-suited for Java due to its dynamic nature, and what can cause compiled code to be deoptimized, such as class initialization failures, null pointer exceptions, and unstable control flow.
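The invocation-counter trigger can be caricatured in a few lines (the threshold and bookkeeping here are illustrative; the real JVM uses -XX:CompileThreshold together with separate back-edge counters for loops):

```python
COMPILE_THRESHOLD = 3   # illustrative; the JVM default is far higher

class Method:
    """Toy model of a JVM method: interpreted until it becomes 'hot'."""

    def __init__(self, name, body):
        self.name = name
        self.body = body          # the "interpreted" implementation
        self.invocations = 0
        self.compiled = False

    def __call__(self, *args):
        self.invocations += 1
        if not self.compiled and self.invocations >= COMPILE_THRESHOLD:
            self.compiled = True  # hot method: hand off to the JIT compiler
        return self.body(*args)

square = Method("square", lambda x: x * x)
for i in range(5):
    square(i)
print(square.invocations, square.compiled)   # 5 True
```

Deoptimization, as the document notes, is the reverse transition: when an assumption baked into the compiled code fails, execution falls back to the interpreted path.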
Multimodal Residual Learning for Visual Question-Answering (NAVER D2)
The document summarizes Jin-Hwa Kim's paper on multimodal residual learning for visual question answering (VQA). It describes the VQA task, the vision and question modeling parts of the proposed approach, and how multimodal residual networks are used to combine the vision and question representations. Evaluation results on the VQA test-dev dataset show the proposed approach achieves state-of-the-art performance.
The document discusses Java memory management and garbage collection. It describes the responsibilities of garbage collection as allocating memory, ensuring referenced objects remain in memory, and recovering memory from unreachable objects. It discusses generation collection and the Hotspot memory model with young and old generations. The major garbage collectors are described as serial, parallel, parallel compacting, and concurrent mark-sweep collectors. The document provides examples of VM options and logging output.
https://telecombcn-dl.github.io/2017-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
Deep learning for molecules, introduction to Chainer Chemistry (Kenta Oono)
1) The document introduces machine learning and deep learning techniques for predicting chemical properties, including rule-based approaches versus learning-based approaches using neural message passing algorithms.
2) It discusses several graph neural network models like NFP, GGNN, WeaveNet and SchNet that can be applied to molecular graphs to predict characteristics. These models update atom representations through message passing and graph convolution operations.
3) Chainer Chemistry is introduced as a deep learning framework that can be used with these graph neural network models for chemical property prediction tasks. Examples of tasks include drug discovery and molecular generation.
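One round of the message passing these models share can be sketched in NumPy (a simplified caricature, not the actual NFP/GGNN/Chainer Chemistry implementation; the graph, features, and weights are illustrative):

```python
import numpy as np

# Adjacency matrix of a toy 4-atom chain molecule: 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

H = np.eye(4)               # initial one-hot atom features
W = np.full((4, 4), 0.5)    # toy shared weight matrix

def message_pass(A, H, W):
    """Each atom sums its neighbours' features, then projects through W."""
    return np.tanh((A @ H) @ W)

H1 = message_pass(A, H, W)          # updated per-atom representations
graph_vec = H1.sum(axis=0)          # sum readout: one vector per molecule
print(graph_vec.shape)
```

Stacking several such rounds lets information propagate across bonds, and the readout vector is what a final regressor uses to predict the molecular property.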
This is the DeepStochLog presentation, published at AAAI22 (Association for the Advancement of Artificial Intelligence 2022).
Authors: Thomas Winters*, Giuseppe Marra*, Robin Manhaeve, Luc De Raedt
*equal contribution
Code: https://github.com/ml-kuleuven/deepstochlog
Abstract: Recent advances in neural symbolic learning, such as DeepProbLog, extend probabilistic logic programs with neural predicates. Like graphical models, these probabilistic logic programs define a probability distribution over possible worlds, for which inference is computationally hard. We propose DeepStochLog, an alternative neural symbolic framework based on stochastic definite clause grammars, a type of stochastic logic program, which defines a probability distribution over possible derivations. More specifically, we introduce neural grammar rules into stochastic definite clause grammars to create a framework that can be trained end-to-end. We show that inference and learning in neural stochastic logic programming scale much better than for neural probabilistic logic programs. Furthermore, the experimental evaluation shows that DeepStochLog achieves state-of-the-art results on challenging neural symbolic learning tasks.
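The core idea, that a derivation's probability is the product of its rule probabilities, can be sketched with a toy stochastic grammar (in DeepStochLog the rule probabilities come from neural predicates; here they are fixed numbers for illustration):

```python
# Toy stochastic grammar: S -> A B ; A -> 'a' (0.7) | 'aa' (0.3) ; B -> 'b' (1.0).
# Each rule carries a probability; alternatives for a nonterminal sum to 1.
rules = {
    ("A", ("a",)): 0.7,
    ("A", ("aa",)): 0.3,
    ("B", ("b",)): 1.0,
}

def derivation_probability(steps):
    """Probability of a derivation = product of its rule probabilities."""
    p = 1.0
    for step in steps:
        p *= rules[step]
    return p

# Two derivations from S, producing "ab" and "aab" respectively.
p_ab = derivation_probability([("A", ("a",)), ("B", ("b",))])
p_aab = derivation_probability([("A", ("aa",)), ("B", ("b",))])
print(p_ab, p_aab)
```

Because probabilities attach to derivations rather than to possible worlds, inference reduces to dynamic programming over parse structures, which is the source of the scaling advantage the abstract claims over DeepProbLog.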
This document discusses mutation testing and how it can be used to test test cases. It introduces the concept of mutating code to generate bugs and then running tests against the mutated code to evaluate how effective the tests are at detecting bugs. It provides examples of using the PIT and Descartes mutation testing tools and frameworks to mutate code and evaluate test coverage and effectiveness. Key benefits of mutation testing discussed are that it can reveal weaknesses or missing elements in test suites that regular testing may not uncover.
Robot, Learning from Data
1. Direct Policy Learning in RKHS with learning theory
2. Inverse Reinforcement Learning Methods
Sungjoon Choi (sungjoon.choi@cpslab.snu.ac.kr)
PFN Spring Internship Final Report: Autonomous Drive by Deep RL (Naoto Yoshida)
This is the final report for the spring internship 2016 at Preferred Networks. gym_torcs is released on my GitHub account: https://github.com/ugo-nama-kun/gym_torcs
https://telecombcn-dl.github.io/2017-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.
Deep learning for molecules, introduction to chainer chemistryKenta Oono
1) The document introduces machine learning and deep learning techniques for predicting chemical properties, including rule-based approaches versus learning-based approaches using neural message passing algorithms.
2) It discusses several graph neural network models like NFP, GGNN, WeaveNet and SchNet that can be applied to molecular graphs to predict characteristics. These models update atom representations through message passing and graph convolution operations.
3) Chainer Chemistry is introduced as a deep learning framework that can be used with these graph neural network models for chemical property prediction tasks. Examples of tasks include drug discovery and molecular generation.
This is the DeepStochLog presentation, published at AAAI22 (Association for the Advancement of Artificial Intelligence 2022).
Authors: Thomas Winters*, Giuseppe Marra*, Robin Manhaeve, Luc De Raedt
*equal contribution
Code: https://github.com/ml-kuleuven/deepstochlog
Abstract: Recent advances in neural symbolic learning, such as DeepProbLog, extend probabilistic logic programs with neural predicates. Like graphical models, these probabilistic logic programs define a probability distribution over possible worlds, for which inference is computationally hard. We propose DeepStochLog, an alternative neural symbolic framework based on stochastic definite clause grammars, a type of stochastic logic program, which defines a probability distribution over possible derivations. More specifically, we introduce neural grammar rules into stochastic definite clause grammars to create a framework that can be trained end-to-end. We show that inference and learning in neural stochastic logic programming scale much better than for neural probabilistic logic programs. Furthermore, the experimental evaluation shows that DeepStochLog achieves state-of-the-art results on challenging neural symbolic learning tasks.
This document discusses mutation testing and how it can be used to test test cases. It introduces the concept of mutating code to generate bugs and then running tests against the mutated code to evaluate how effective the tests are at detecting bugs. It provides examples of using the PIT and Descartes mutation testing tools and frameworks to mutate code and evaluate test coverage and effectiveness. Key benefits of mutation testing discussed are that it can reveal weaknesses or missing elements in test suites that regular testing may not uncover.
Robot, Learning from Data
1. Direct Policy Learning in RKHS with learning theory
2. Inverse Reinforcement Learning Methods
Sungjoon Choi (sungjoon.choi@cpslab.snu.ac.kr)
PFN Spring Internship Final Report: Autonomous Drive by Deep RLNaoto Yoshida
This is the final report for the spring internship 2016 at Preferred Networks. gym_torcs is released in mt github account: https://github.com/ugo-nama-kun/gym_torcs
GALE: Geometric active learning for Search-Based Software EngineeringCS, NcState
Multi-objective evolutionary algorithms (MOEAs) help software engineers find novel solutions to complex problems. When automatic tools explore too many options, they are slow to use and hard to comprehend. GALE is a near-linear time MOEA that builds a piecewise approximation to the surface of best solutions along the Pareto frontier. For each piece, GALE mutates solutions towards the better end. In numerous case studies, GALE finds comparable solutions to standard methods (NSGA-II, SPEA2) using far fewer evaluations (e.g. 20 evaluations, not 1,000). GALE is recommended when a model is expensive to evaluate, or when some audience needs to browse and understand how an MOEA has made its conclusions.
This document summarizes a Kaggle competition on ultrasound nerve segmentation. It describes the data provided, which includes over 5000 training images and masks of the Brachial Plexus nerve. Several baselines are presented, with the top method being a U-Net model achieving a score of 0.62. The document then analyzes aspects of the winning solution in detail, which was based on a modified U-Net architecture with techniques like dropout, data augmentation, and an ensemble of models to achieve a final score of 0.70399. Other approaches tried like FCNs and Inception networks are also discussed.
AI optimizing HPC simulations (presentation from 6th EULAG Workshop)byteLAKE
See our presentation from the 6th International EULAG Users Workshop. We talked about taking HPC to the "Industry 4.0" by implementing smart techniques to optimize the codes in terms of performance and energy consumption. It explains how Machine Learning can dynamically optimize HPC simulations and byteLAKE's software autotuning solution.
Find out more about byteLAKE at: www.byteLAKE.com
This document discusses optimizations for high performance and energy efficient implementations of the Smith-Waterman algorithm on FPGAs using OpenCL. It describes an architecture with a systolic array for parallel computation along anti-diagonals and compression techniques to address the memory-bound nature. Experimental results on two FPGA boards show up to 42.5 GCUPS performance with the best performance/power ratio compared to CPUs and other FPGA implementations.
This document summarizes Patrick Kennedy's work on a Kaggle competition for Prudential Life Insurance to predict risk levels from application data. Kennedy tested various models including XGBoost, AdaBoost, and a 4-level stacking ensemble. After optimizing offsets and bins, the best model scored 0.667. Kennedy plans to explore potential structural patterns in the data and additional models like neural networks to improve performance in the remaining 5 days of the competition. The overall goal is to build a model that can assess life insurance risk levels from applications in a fast, on-demand manner.
Kaggle reviewPlanet: Understanding the Amazon from SpaceEduard Tyantov
This document summarizes a Kaggle competition to detect deforestation in the Amazon rainforest using satellite images. It describes:
1. The competition involved classifying over 150,000 image chips into 17 land cover classes to detect deforestation.
2. The baseline model was a ResNet-18 pretrained on ImageNet with fine-tuning, which achieved a score of 90.06%. Several techniques like optimal class thresholds and hyperparameter tuning improved the score to 92.53%.
3. The top models combined RGB satellite images with a near-infrared channel and indexes, training separate branches on JPG and TIF data. The best single model scored 93.071% by ensembling different model
Accelerating stochastic gradient descent using adaptive mini batch size3muayyad alsadi
This document proposes a method called Train-Measure-Adapt-Repeat for accelerating stochastic gradient descent training of deep neural networks using adaptive mini-batch sizes. The method starts with an extremely small mini-batch size, such as 4-8 samples, to allow for faster training initially through more frequent weight updates. Accuracy is evaluated over time rather than by the number of steps, and the mini-batch size is increased adaptively when accuracy improvements stall. Experiments on image classification datasets demonstrate the method reaching higher accuracy levels faster than using fixed large mini-batch sizes.
The document discusses multi-layer perceptrons (MLPs) and the backpropagation algorithm. [1] MLPs can learn nonlinear decision boundaries using multiple layers of nodes and nonlinear activation functions. [2] The backpropagation algorithm is used to train MLPs by calculating error terms that are propagated backward to adjust weights throughout the network. [3] Backpropagation finds a local minimum of the error function through gradient descent and may get stuck but works well in practice.
The document discusses multi-layer perceptrons and the backpropagation algorithm. It provides an overview of MLP architecture with input, output, and internal nodes. It explains that MLPs can learn nonlinear decision boundaries using sigmoid activation functions. The backpropagation algorithm is then described in detail, including forward and backward propagation steps to calculate errors and update weights through gradient descent. Applications of neural networks are also listed.
Scaling Deep Learning Algorithms on Extreme Scale Architecturesinside-BigData.com
This document summarizes a presentation on scaling deep learning algorithms on extreme scale architectures. It discusses challenges in using deep learning, a vision for machine/deep learning R&D including novel algorithms, and the MaTEx toolkit which supports distributed deep learning on GPU and CPU clusters. Sample results show strong and weak scaling of asynchronous gradient descent on Summit. Fault tolerance needs and the impact of deep learning on other domains are also covered.
Understand and Harness the Capabilities of Intel® Xeon Phi™ ProcessorsIntel® Software
The second-generation Intel® Xeon Phi™ processor offers new and enhanced features that provide significant performance gains in modernized code. For this lab, we pair these features with Intel® Software Development Products and methodologies to enable developers to gain insights on application behavior and to find opportunities to optimize parallelism, memory, and vectorization features.
Slide for study session given by Christian Saravia at Arithmer inc.
It is a summary of recent method for object detection, centernet.
Arithmer株式会社は東京大学大学院数理科学研究科発の数学の会社です。私達は現代数学を応用して、様々な分野のソリューションに、新しい高度AIシステムを導入しています。AIをいかに上手に使って仕事を効率化するか、そして人々の役に立つ結果を生み出すのか、それを考えるのが私たちの仕事です。
Arithmer began at the University of Tokyo Graduate School of Mathematical Sciences. Today, our research of modern mathematics and AI systems has the capability of providing solutions when dealing with tough complex issues. At Arithmer we believe it is our job to realize the functions of AI through improving work efficiency and producing more useful results for society.
Chap 8. Optimization for training deep modelsYoung-Geun Choi
연구실 내부 세미나 자료. Goodfellow et al. (2016), Deep Learning, MIT Press의 Chapter 8을 요약/발췌하였습니다. 깊은 신경망(deep neural network) 모형 훈련시 목적함수 최적화 방법으로 흔히 사용되는 방법들을 소개합니다.
Using Bayesian Optimization to Tune Machine Learning ModelsScott Clark
1) Bayesian optimization can be used to efficiently tune the hyperparameters of machine learning models, requiring far fewer evaluations than standard random search or grid search methods to find good hyperparameters.
2) It builds a statistical model called a Gaussian process to model the objective function based on previous evaluations, and uses this to select the most promising hyperparameters to evaluate next in order to optimize an objective metric like accuracy.
3) SigOpt is a service that uses Bayesian optimization to tune machine learning models, outperforming expert humans on tasks like classifying images from CIFAR10 and reducing error rates more than standard methods.
Using Bayesian Optimization to Tune Machine Learning ModelsSigOpt
1. Tuning machine learning models is challenging due to the large number of non-intuitive hyperparameters.
2. Traditional tuning methods like grid search are computationally expensive and can find local optima rather than global optima.
3. Bayesian optimization uses Gaussian processes to build statistical models from prior evaluations to determine the most promising hyperparameters to test next, requiring far fewer evaluations than traditional methods to find better performing models.
The document discusses object detection pipelines. It begins by defining object detection as identifying objects in images and locating them with bounding boxes. The main components of an object detection pipeline are datasets, preprocessing, model selection and training, testing and evaluation. Popular models discussed are Faster R-CNN, R-FCN, and SSD which use deep convolutional neural networks as feature extractors and classifiers. Key evaluation metrics are mean average precision and prediction time/memory usage. Popular datasets mentioned are MSCOCO, Pascal VOC, and LSVRC. The document provides information on preprocessing, training including fine-tuning pre-trained models, and codes/models available on GitHub.
The ability to recreate computational results with minimal effort and actionable metrics provides a solid foundation for scientific research and software development. When people can replicate an analysis at the touch of a button using open-source software, open data, and methods to assess and compare proposals, it significantly eases verification of results, engagement with a diverse range of contributors, and progress. However, we have yet to fully achieve this; there are still many sociotechnical frictions.
Inspired by David Donoho's vision, this talk aims to revisit the three crucial pillars of frictionless reproducibility (data sharing, code sharing, and competitive challenges) with the perspective of deep software variability.
Our observation is that multiple layers — hardware, operating systems, third-party libraries, software versions, input data, compile-time options, and parameters — are subject to variability that exacerbates frictions but is also essential for achieving robust, generalizable results and fostering innovation. I will first review the literature, providing evidence of how the complex variability interactions across these layers affect qualitative and quantitative software properties, thereby complicating the reproduction and replication of scientific studies in various fields.
I will then present some software engineering and AI techniques that can support the strategic exploration of variability spaces. These include the use of abstractions and models (e.g., feature models), sampling strategies (e.g., uniform, random), cost-effective measurements (e.g., incremental build of software configurations), and dimensionality reduction methods (e.g., transfer learning, feature selection, software debloating).
I will finally argue that deep variability is both the problem and solution of frictionless reproducibility, calling the software science community to develop new methods and tools to manage variability and foster reproducibility in software systems.
Exposé invité Journées Nationales du GDR GPL 2024
ESPP presentation to EU Waste Water Network, 4th June 2024 “EU policies driving nutrient removal and recycling
and the revised UWWTD (Urban Waste Water Treatment Directive)”
What is greenhouse gasses and how many gasses are there to affect the Earth.moosaasad1975
What are greenhouse gasses how they affect the earth and its environment what is the future of the environment and earth how the weather and the climate effects.
BREEDING METHODS FOR DISEASE RESISTANCE.pptxRASHMI M G
Plant breeding for disease resistance is a strategy to reduce crop losses caused by disease. Plants have an innate immune system that allows them to recognize pathogens and provide resistance. However, breeding for long-lasting resistance often involves combining multiple resistance genes
Phenomics assisted breeding in crop improvementIshaGoswami9
As the population is increasing and will reach about 9 billion upto 2050. Also due to climate change, it is difficult to meet the food requirement of such a large population. Facing the challenges presented by resource shortages, climate
change, and increasing global population, crop yield and quality need to be improved in a sustainable way over the coming decades. Genetic improvement by breeding is the best way to increase crop productivity. With the rapid progression of functional
genomics, an increasing number of crop genomes have been sequenced and dozens of genes influencing key agronomic traits have been identified. However, current genome sequence information has not been adequately exploited for understanding
the complex characteristics of multiple gene, owing to a lack of crop phenotypic data. Efficient, automatic, and accurate technologies and platforms that can capture phenotypic data that can
be linked to genomics information for crop improvement at all growth stages have become as important as genotyping. Thus,
high-throughput phenotyping has become the major bottleneck restricting crop breeding. Plant phenomics has been defined as the high-throughput, accurate acquisition and analysis of multi-dimensional phenotypes
during crop growing stages at the organism level, including the cell, tissue, organ, individual plant, plot, and field levels. With the rapid development of novel sensors, imaging technology,
and analysis methods, numerous infrastructure platforms have been developed for phenotyping.
Or: Beyond linear.
Abstract: Equivariant neural networks are neural networks that incorporate symmetries. The nonlinear activation functions in these networks result in interesting nonlinear equivariant maps between simple representations, and motivate the key player of this talk: piecewise linear representation theory.
Disclaimer: No one is perfect, so please mind that there might be mistakes and typos.
dtubbenhauer@gmail.com
Corrected slides: dtubbenhauer.com/talks.html
The debris of the ‘last major merger’ is dynamically youngSérgio Sacani
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the
‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor
collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the
MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space,
because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia
DR3 have positive caustic velocities, making them fundamentally different than the phase-mixed chevrons found in simulations
at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based
on a simple phase-mixing model, the observed number of caustics are consistent with a merger that occurred 1–2 Gyr ago.
We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative
measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data
1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’
did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within
the last few Gyr, consistent with the body of work surrounding the VRM.
The binding of cosmological structures by massless topological defectsSérgio Sacani
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field
equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational
field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin
spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling
concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect
light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is
mitigated, at least in part.
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...AbdullaAlAsif1
The pygmy halfbeak Dermogenys colletei, is known for its viviparous nature, this presents an intriguing case of relatively low fecundity, raising questions about potential compensatory reproductive strategies employed by this species. Our study delves into the examination of fecundity and the Gonadosomatic Index (GSI) in the Pygmy Halfbeak, D. colletei (Meisner, 2001), an intriguing viviparous fish indigenous to Sarawak, Borneo. We hypothesize that the Pygmy halfbeak, D. colletei, may exhibit unique reproductive adaptations to offset its low fecundity, thus enhancing its survival and fitness. To address this, we conducted a comprehensive study utilizing 28 mature female specimens of D. colletei, carefully measuring fecundity and GSI to shed light on the reproductive adaptations of this species. Our findings reveal that D. colletei indeed exhibits low fecundity, with a mean of 16.76 ± 2.01, and a mean GSI of 12.83 ± 1.27, providing crucial insights into the reproductive mechanisms at play in this species. These results underscore the existence of unique reproductive strategies in D. colletei, enabling its adaptation and persistence in Borneo's diverse aquatic ecosystems, and call for further ecological research to elucidate these mechanisms. This study lends to a better understanding of viviparous fish in Borneo and contributes to the broader field of aquatic ecology, enhancing our knowledge of species adaptations to unique ecological challenges.
2. Table of Contents
● Bio
● Competition
● Evaluation metric
● Winning solutions
● My solution
● What didn’t work
● Conclusion
Kenneth Emeka Odoh
3. Education
● Masters in Computer Science (University of Regina) 2013 - 2016
● A number of MOOCs in statistics, algorithms, and machine learning
Work
● Research assistant in the field of Visual Analytics 2013 - 2016
○ { https://scholar.google.com/citations?user=3G2ncgwAAAAJ&hl=en }
● Software Engineer in Vancouver, Canada 2017 -
● Regular speaker at a number of Vancouver AI meetups; organizer of a
distributed systems meetup, a Haskell study group, and an Applied
Cryptography study group 2017 -
Bio
5. Competition
● Geology is the earth science that studies the properties of the
Earth's crust.
● This competition is related to geology.
● The competition focuses on structured prediction.
● Salt deposits are encountered when drilling for crude oil or gas in a
number of oil fields around the world.
● Given a radar image, find the patches of salt in the
image. This is a classic image segmentation problem.
6. The prizes consist of
● 1st Place - $50,000
● 2nd Place - $25,000
● 3rd Place - $15,000
● 4th Place - $10,000
Competition rules
● 5 submissions per day.
● External data is not allowed.
10. Data
● Number of training images: 4,000
● Number of test images: 18,000
● Original image size: (101 x 101) pixels
Each image in the training set has a mask.
Only images are given in the test set; you have to predict the mask.
The competition lasted for 3 months.
The submission format is run-length encoding, a compressed representation of the mask.
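To illustrate the submission format, here is a minimal run-length encoder for a flat binary mask, assuming the usual Kaggle convention of 1-based start indices over pixels listed in the competition's prescribed pixel order (an illustrative sketch, not the official encoder):

```python
def rle_encode(pixels):
    """Run-length encode a flat binary pixel sequence.

    Returns a space-separated string of (start, length) pairs with
    1-based start positions, the common Kaggle segmentation format.
    """
    runs = []
    run_start, run_len = None, 0
    for i, p in enumerate(pixels, start=1):  # 1-based positions
        if p:  # inside a run of mask pixels
            if run_start is None:
                run_start, run_len = i, 0
            run_len += 1
        elif run_start is not None:  # a run just ended
            runs.append((run_start, run_len))
            run_start = None
    if run_start is not None:  # mask extends to the last pixel
        runs.append((run_start, run_len))
    return " ".join(f"{s} {l}" for s, l in runs)
```

For example, the mask row [0, 1, 1, 0, 1] encodes as "2 2 5 1".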
11. Image vs mask
{https://www.kaggle.com/skainkaryam/basic-data-visualization-using-pytorch-dataset#}
13. Winning Solution: 1st place
Summary can be found in
{https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/69291 }.
● encoder: ResNeXt50 / ResNet34
● center: Feature Pyramid Attention
● decoder: conv3x3, T conv, scSE + hyper columns
● loss: BCE, Lovász - 2-level pseudo-labeling
● 2 model final ensemble
It is good to note the following:
● Loss: BCE for classification and Lovasz for segmentation
● Created a local CV by stratifying on depth, which correlated with the LB.
● segmentation zoo {https://github.com/qubvel/segmentation_models }
● for more information on hypercolumn
{https://en.wikipedia.org/wiki/Cortical_column }.
● scSE, Spatial-Channel Squeeze & Excitation
{https://arxiv.org/abs/1803.02579 }
14. 1st stage training
Input: 101 -> resize to 192 -> pad to 224
Encoder: ResNeXt50 pretrained on ImageNet
Decoder: conv3x3 + BN, Upsampling, scSE
Optimizer: RMSprop. Batch size: 24
Loss: BCE+Dice. Reduce LR on plateau starting from 0.0001
Loss: Lovasz. Reduce LR on plateau starting from 0.00005
Loss: Lovasz. 4 snapshots with cosine annealing LR, 80 epochs each, LR
starting from 0.0001
Some attempts at ensembling
5-fold ResNeXt50 had 0.864 Public LB (0.878 Private LB)
5-fold ResNet34 had 0.863 (0.880 Private)
Their ensemble scored 0.867 (0.885 Private)
15. 2nd Stage Training
The first stage provides pseudolabels from the predicted probabilities. Confidence
was measured as the percentage of pixel predictions that were confident
(probability < 0.2 or probability > 0.8).
ResNeXt50 was pretrained on the confident pseudolabels, and 5 folds were
trained on top of it: 0.871 (0.890 Private)
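The confidence filter described above can be sketched as follows. The per-pixel thresholds (0.2 / 0.8) are from the write-up; the cutoff on the confident-pixel fraction (`min_fraction`) is a hypothetical parameter added for illustration, since the exact selection rule is not stated:

```python
def pseudolabel_confidence(probs):
    """Fraction of pixels with confident predictions (p < 0.2 or p > 0.8)."""
    confident = sum(1 for p in probs if p < 0.2 or p > 0.8)
    return confident / len(probs)

def select_confident(images_probs, min_fraction=0.9):
    """Keep indices of images whose prediction is mostly confident.

    min_fraction is an illustrative cutoff, not a value from the write-up.
    """
    return [i for i, probs in enumerate(images_probs)
            if pseudolabel_confidence(probs) >= min_fraction]
```

Images passing the filter would then be added, with their predicted masks, to the pretraining set.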
16. 3rd Stage Training
We took all the pseudolabels from the 2nd stage ensemble, and phalanx trained 2 models:
resnet_34_pad_128
Input: 101 -> pad to 128
Encoder: ResNet34 + scSE (conv7x7 -> conv3x3 and remove first max pooling)
Center Block: Feature Pyramid Attention (remove 7x7)
Decoder: conv3x3, transposed convolution, scSE + hyper columns
Loss: Lovasz
resnet_34_resize_128
Input: 101 -> resize to 128
Encoder: ResNet34 + scSE (remove first max pooling)
Center Block: conv3x3, Global Convolutional Network
Decoder: Global Attention Upsample (implemented like senet -> like scSE, conv3x3 -> GCN)
+ deep supervision
Loss: BCE for classification and Lovasz for segmentation
17. Training overview:
Optimizer: SGD. Batch size: 32.
Pretrain on pseudolabels for 150 epochs (50 epochs per cycle with cosine annealing, LR 0.01 ->
0.001)
Finetune on train data. 5 folds, 4 snapshots with cosine annealing LR, 50 epochs each, LR 0.01 ->
0.001
resnet_34_pad_128 had 0.874 (0.895 Private)
resnet_34_resize_128 had 0.872 (0.892 Private)
Final Model
Final model is a blend of ResNeXt50 from the 2nd stage and resnet_34_pad_128 from the 3rd stage with
horizontal flip TTA: 0.876 Public LB (0.896 Private LB).
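Horizontal-flip TTA, as used in the final blend, can be sketched as below. The `predict` argument is a stand-in for any model's forward pass; this is a minimal illustration, not the winners' code:

```python
def hflip(image):
    """Horizontally flip a 2D image given as a list of rows."""
    return [row[::-1] for row in image]

def predict_with_flip_tta(predict, image):
    """Average the prediction on the image and on its horizontal flip.

    `predict` maps an image to a per-pixel probability map of the same
    shape; the flipped prediction is flipped back before averaging.
    """
    p = predict(image)
    p_flip = hflip(predict(hflip(image)))
    return [[(a + b) / 2 for a, b in zip(r1, r2)]
            for r1, r2 in zip(p, p_flip)]
```

A symmetric model is unchanged by this, while an orientation-sensitive model has its left/right bias averaged out.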
Augmentations
The list of augmentations
HorizontalFlip(p=0.5)
RandomBrightness(p=0.2,limit=0.2)
RandomContrast(p=0.1,limit=0.2)
ShiftScaleRotate(shift_limit=0.1625, scale_limit=0.6, rotate_limit=0, p=0.7)
Code: https://www.kaggle.com/youhanlee/1st-solution-reproducing-1-unet-resnet34-se/notebook#
18. Winning Solution: 9th place
The single-fold score of SENet154 was 0.882/0.869; with 10 folds, reflective
padding + resizing gave 0.890/0.875.
● AdamW with the Noam scheduler
● cutout {https://arxiv.org/abs/1708.04552 }
● Stochastic Weight Averaging (SWA) after the training on the best loss, pixel
accuracy (+0.004).
● They used GitLab to manage experiments.
● They used a symmetric version of the Lovasz loss, which gave better results:
def symmetric_lovasz(outputs, targets):
    return (lovasz_hinge(outputs, targets) + lovasz_hinge(-outputs, 1 - targets)) / 2
Summary available
{https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/69053 }
19. Summary can be found
{https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/69093}
Model: UNet-like architecture
Backbone: SE ResNeXt-50, pretrained on ImageNet
Decoder features:
● Spatial and Channel Squeeze Excitation gating
● Hypercolumns
● Deep supervision (zero/nonzero mask)
Tile size adaptation:
● Pad 101 -> 128
Winning Solution: 11th place
20. Augmentations:
● Random invert
● Random cutout
● Random gamma-correction
● Random fixed-size crop
● Random horizontal flip
Optimizer:
● SGD: momentum 0.9, weight decay 0.0001
● Batch size: 16
● LR found using the fast.ai LR finder - 5e-2
● Cosine annealing for training
● used snapshots for ensembling.
21. Loss:
With deep supervision, the loss is a combination of 3 losses: a classification
loss, a pure segmentation loss, and a total loss.
Classification loss - BCE
Segmentation losses - BCE + Lovasz Hinge * 0.5
The segmentation loss was evaluated after cropping masks and predictions back
to 101.
Cross-validation
5-fold random split
Only depth stratification worked
22. My Solution
● A number of neural architectures: dilated U-Net, and ‘vanilla’ U-Nets of
varying depths with batch norm or dropout.
● Using a loss function like Lovasz loss in the second stage of training gives
a boost after binary cross-entropy was used in the first stage.
● Standardizing the image gives better results than just dividing by 255.
● A cyclical learning rate (CLR) policy with restarts gave some boost in
training {https://arxiv.org/abs/1506.01186}.
● Don’t use ReduceLROnPlateau together with CLR; you can confuse
your network.
● Training with reduce-on-plateau, in the spirit of simulated annealing, tends
to restore learning after relatively long stretches of degrading performance
in some cases.
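The cyclical learning rate policy cited above ({https://arxiv.org/abs/1506.01186}) follows a triangular schedule, which can be sketched as below; the bounds and step size in the example are illustrative, not values from the competition:

```python
def triangular_clr(iteration, step_size, base_lr, max_lr):
    """Triangular cyclical learning rate (Smith, 2015, arXiv:1506.01186).

    The LR rises linearly from base_lr to max_lr over step_size
    iterations, falls back over the next step_size, and repeats.
    """
    cycle = iteration // (2 * step_size)
    # x goes 1 -> 0 -> 1 within each cycle
    x = abs(iteration / step_size - 2 * cycle - 1)
    return base_lr + (max_lr - base_lr) * (1 - x)
```

With step_size=100, base_lr=1e-4, max_lr=1e-2, the LR peaks at iteration 100 and returns to the base at iteration 200.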
23. ● Be careful with early stopping. It may be a blessing or a curse.
● Pre-trained networks were, in my experiments, no better than training from
scratch.
● Did test-time augmentation with only horizontal flipping.
● Got the best 5 models, with scores from 0.74 to 0.79.
● Performed CRF if local CV is < 0.75. Under this condition, I got about a 0.2 boost
in LB score.
● Ensembled using a weighted average, with the LB scores as weights. See more
information in the Conclusions section.
Position: 1482 / 3267 teams
Private LB: 0.824
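The weighted-average ensembling mentioned above can be sketched as follows; using the public-LB scores directly as normalised weights over per-model probability maps is the interpretation taken here:

```python
import numpy as np

def weighted_ensemble(prob_maps, lb_scores):
    """Weighted average of per-model probability maps, using each model's
    LB score as its weight (weights normalised to sum to 1)."""
    w = np.asarray(lb_scores, dtype=float)
    w = w / w.sum()
    stacked = np.stack(prob_maps)            # (n_models, H, W)
    return np.tensordot(w, stacked, axes=1)  # (H, W)
```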
24. My System Setup
● I used a GTX 1080 with 32 GB RAM and a 1 TB hard disk. Luckily, I did not have
any need for heaters in my room.
● I create a hypothesis and try to verify it in code, writing out a number of
Python scripts.
● I run the Python scripts with a bash script. Each bash script runs a list of
Python scripts, which I consider one experiment.
○ The benefit of this setup: if one script fails, the other scripts will still run. This reduces
development time.
● Jupyter was better when I had to visualize.
● Ran over 100 experiments, validating a number of hypotheses.
● Extensive diagnostics to figure out when experiments go badly, ending
experiments early to save time.
25. What didn’t work
● Performing an FFT on the image and using only the amplitude gives very bad
performance. This made me suspect that the phase angles were
important.
● Focal loss and Lovász loss seemed to give some participants a boost when used
in first-stage training. However, in many cases the output was meaningless.
Someone on the forum claimed the Lovász loss required a special kind of
gradient flow, applied to the logits rather than the softmax layer of the network.
● Training-time augmentation did not work for me.
● Techniques like DANet, OCNet, pixel shuffling, and transposing did not work
for some people.
● Pseudo-labeling worked for some and not for others.
26. Discussions
● The private LB score was higher for many teams; it has been hypothesized that the organizers
cherry-picked “few-pixel masks (which were hard to predict, incurred high penalty and
probably weren’t important from business standpoint)” from the private set.
● Rule of thumb for pseudo-labeling: use it when the training set is small.
However, it is difficult to determine the sample size and to tune properly, making it hard to use
in production.
● Treat this as a classification + segmentation problem instead of straightforward
segmentation.
● Don’t trust CV, trust stability
{https://www.kaggle.com/c/tgs-salt-identification-challenge/discussion/69051 }
● There was a well-written post about the unethical nature of the competition by a
climate-change activist. This person was against the oil industry and managed to clearly
articulate his point, urging people to abandon the competition.
27. Common Approaches
● Data preprocessing
○ Data augmentation
○ Upsampling
● Build the model
○ Variations of neural architectures (encoder-decoder architecture)
○ Creating more models
■ Perturb the model (change configuration of network)
■ Perturb the data (fft, cumsum, glcm, watershed e.t.c)
● Post-processing
○ Conditional random field
○ Thresholding
○ Downsampling
● Ensembling
○ Averaging of rle blend (weighted average)
28. Data Augmentation
● The goal is to perturb the data to intensify the signal.
● Caveat: using the wrong augmentation can worsen performance.
● During the competition, I came up with rules of thumb (intuition) behind data
augmentation:
● Crop if the majority of the data on a 2D spectrogram is concentrated in one area of the image.
Taking the crop in the concentrated area intensifies the signal.
● Flip if you observe that the patterns in the data tend to be symmetric about either the horizontal or
vertical axis.
● Augmentation requires some domain knowledge of how the data may be distributed in the
wild.
● imgaug library { https://github.com/aleju/imgaug }
● Albumentations {https://github.com/albu/albumentations/releases/tag/v0.1.1}
The best augmentation in this competition was the horizontal flip.
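A train-time version of that best augmentation can be sketched as a random horizontal flip applied identically to the image and its mask (the 50% flip probability is an illustrative default):

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility

def augment(img, mask):
    """Random horizontal flip applied identically to image and mask,
    the augmentation that worked best in this competition."""
    if rng.random() < 0.5:
        img, mask = img[:, ::-1], mask[:, ::-1]
    return img, mask
```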
29. Conditional Random Field (CRF)
● CRF is a Markov network suitable for structured prediction.
● Imagine that the features of the data set are highly correlated. As a
result, there is a lot of overlap and redundancy. For example, pixels
in an image are similar within local neighbourhood regions.
● Naive Bayes would ignore all the correlations between attributes,
which are very informative, due to its strong independence assumption,
leading to a skewed probability distribution.
● Adding edges to capture correlations forms a densely connected
network, making it even harder to figure out the connections.
● Model p(Y|X) instead of p(X,Y). This makes us care less about
correlations between features.
30. ● CRF uses the Gibbs (Boltzmann) distribution. See
{https://en.wikipedia.org/wiki/Boltzmann_distribution } for more information.
[@Daphne Koller, Coursera]
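For reference, the standard Gibbs parametrization of a CRF referred to above, where the phi are factors over cliques c and Z(X) is the partition function:

```latex
P(Y \mid X) = \frac{1}{Z(X)} \prod_{c} \phi_c(Y_c, X),
\qquad
Z(X) = \sum_{Y} \prod_{c} \phi_c(Y_c, X)
```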
31. ● A CRF is parametrized as a Gibbs distribution, but normalized differently.
● A CRF can reduce to logistic regression under some conditions.
● No need to model distributions that we don’t care about.
● Allows for very expressive features, without the limiting independence
assumption. [@Daphne Koller, Coursera]
32. Applications of CRF include, but are not limited to:
● Image segmentation
○ input: a set of pixels with values
○ output: a class for every pixel
● Text processing
○ input: words in a sentence
○ output: a label for each word (named entity recognition)
CRF can exploit neighbourhood information when classifying, thereby
making it resemble sequence modeling of a sort.
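To make the p(Y|X)-with-neighbourhood idea concrete, here is a toy mean-field update for binary segmentation in plain NumPy. This is only an illustration of the CRF smoothing intuition: the pairwise term is a naive 4-neighbour box filter with wrap-around edges, not the dense CRF that participants actually used:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def box_smooth(q):
    """Average each pixel with its 4-neighbourhood (toy pairwise potential).
    np.roll wraps around the borders, which is acceptable for a toy example."""
    out = q.copy()
    for shift, axis in [(1, 0), (-1, 0), (1, 1), (-1, 1)]:
        out += np.roll(q, shift, axis=axis)
    return out / 5.0

def mean_field_binary(unary_logits, compat=2.0, iters=5):
    """Toy mean-field inference: per-pixel beliefs are repeatedly pulled
    towards the smoothed beliefs of their neighbours, so isolated
    mislabelled pixels get corrected."""
    q = sigmoid(unary_logits)
    for _ in range(iters):
        q = sigmoid(unary_logits + compat * (2 * box_smooth(q) - 1))
    return q
```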
33. For more information on using CRF for image segmentation, see
{https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/classes/cs294
_f99/notes/lec7/lec7.htm }.
The version of CRF used by a number of participants is
{https://github.com/lucasb-eyer/pydensecrf}, based on the paper
{https://www.philkr.net/papers/2013-06-01-icml/2013-06-01-icml.pdf }.
34. Handling large data sets
● Reduce the image size.
● Use a smaller batch size.
● Use Python generators. A number of deep learning packages
provide a fit-generator method, which allows for careful batch
processing. See tutorials on generators and coroutines:
{https://www.dabeaz.com/usenix2009/generators/GeneratorUSENIX.pdf },
{https://www.dabeaz.com/coroutines/Coroutines.pdf }
● Take a sliding window to crop the image into manageable sizes,
pass the crops through the network, and aggregate at test time.
● Empirically, I found batchnorm to be more computationally expensive
than dropout in Keras.
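The generator-based batching mentioned above can be sketched as follows; `load_fn` is a placeholder for whatever per-image loading and standardisation you use:

```python
import numpy as np

def batch_generator(paths, load_fn, batch_size=16):
    """Yield batches lazily so the full data set never sits in memory.

    Runs forever, as expected by Keras-style fit-generator training;
    `load_fn(path)` should return one preprocessed image array.
    """
    while True:
        np.random.shuffle(paths)  # shuffles in place; pass a list you own
        for i in range(0, len(paths) - batch_size + 1, batch_size):
            yield np.stack([load_fn(p) for p in paths[i:i + batch_size]])
```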
35. Conclusions
● If you are using a save-best model, then it is better to train for more epochs.
● The trick to improving results is identifying images without salt patches.
○ Participants tried to create a network to identify empty masks. This helped to intensify the
signal.
○ I tried to do so at the point of ensembling. Note, this will only work if your models are as
diverse as possible.
● Avoid using batchnorm with dropout in the same network; this leads to bad
performance {https://arxiv.org/pdf/1801.05134.pdf }.
● Systematically reducing the dropout rate as information flows through the
encoder-decoder network leads to improved performance.
● Upsampling during training, especially for the dilated U-Nets, tends to improve
performance.