Slides from Portland Machine Learning meetup, April 13th.
Abstract: You've heard all the cool tech companies are using them, but what are Convolutional Neural Networks (CNNs) good for, and what is convolution anyway? For that matter, what is a Neural Network? This talk will include a look at some applications of CNNs, an explanation of how CNNs work, and what the different layers in a CNN do. There's no explicit background required, so if you have no idea what a neural network is, that's OK.
This document provides an agenda for a presentation on deep learning, neural networks, convolutional neural networks, and interesting applications. The presentation will include introductions to deep learning and how it differs from traditional machine learning by learning feature representations from data. It will cover the history of neural networks and breakthroughs that enabled training of deeper models. Convolutional neural network architectures will be overviewed, including convolutional, pooling, and dense layers. Applications like recommendation systems, natural language processing, and computer vision will also be discussed. There will be a question and answer section.
The Transformer is an established architecture in natural language processing that applies a self-attention framework within a deep learning approach.
This presentation was delivered under the mentorship of Mr. Mukunthan Tharmakulasingam (University of Surrey, UK), as a part of the ScholarX program from Sustainable Education Foundation.
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train.
Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.0 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
Deep Learning - Convolutional Neural Networks - Architectural Zoo (Christian Perone)
This document discusses different convolutional neural network architectures including traditional architectures using convolutional, pooling, and fully connected layers, siamese networks for learning visual similarity, dense prediction networks for tasks like semantic segmentation and image colorization, video classification networks, music recommendation networks, and networks for tasks like object localization, detection, and alignment. It provides examples of specific networks that have been applied to each type of architecture.
Recurrent Neural Networks. Part 1: Theory (Andrii Gakhov)
The document provides an overview of recurrent neural networks (RNNs) and their advantages over feedforward neural networks. It describes the basic structure and training of RNNs using backpropagation through time. RNNs can process sequential data of variable lengths, unlike feedforward networks. However, RNNs are difficult to train due to vanishing and exploding gradients. More advanced RNN architectures like LSTMs and GRUs address this by introducing gating mechanisms that allow the network to better control the flow of information.
Machine Learning - Convolutional Neural Network (Richard Kuo)
The document provides an overview of convolutional neural networks (CNNs) for visual recognition. It discusses the basic concepts of CNNs such as convolutional layers, activation functions, pooling layers, and network architectures. Examples of classic CNN architectures like LeNet-5 and AlexNet are presented. Modern architectures such as Inception and ResNet are also discussed. Code examples for image classification using TensorFlow, Keras, and Fastai are provided.
Deep learning and neural networks are inspired by biological neurons. Artificial neural networks (ANN) can have multiple layers and learn through backpropagation. Deep neural networks with multiple hidden layers did not work well until recent developments in unsupervised pre-training of layers. Experiments on MNIST digit recognition and NORB object recognition datasets showed deep belief networks and deep Boltzmann machines outperform other models. Deep learning is now widely used for applications like computer vision, natural language processing, and information retrieval.
Tijmen Blankenvoort, co-founder of Scyfer BV, presentation at the Artificial Intelligence Meetup, 15-1-2014. Introduction to Neural Networks and Deep Learning.
Sogang University Machine Learning and Data Mining lab seminar, Neural Networks for newbies and Convolutional Neural Networks. This is prerequisite material to understand deep convolutional architecture.
A comprehensive tutorial on Convolutional Neural Networks (CNN) which talks about the motivation behind CNNs and Deep Learning in general, followed by a description of the various components involved in a typical CNN layer. It explains the theory involved with the different variants used in practice and also, gives a big picture of the whole network by putting everything together.
Next, there's a discussion of the various state-of-the-art frameworks being used to implement CNNs to tackle real-world classification and regression problems.
Finally, the implementation of CNNs is demonstrated by implementing the paper 'Age and Gender Classification Using Convolutional Neural Networks' by Levi and Hassner (2015).
This document provides an overview of deep learning, including definitions of AI, machine learning, and deep learning. It discusses neural network models like artificial neural networks, convolutional neural networks, and recurrent neural networks. The document explains key concepts in deep learning like activation functions, pooling techniques, and the inception model. It provides steps for fitting a deep learning model, including loading data, defining the model architecture, adding layers and functions, compiling, and fitting the model. Examples and visualizations are included to demonstrate how neural networks work.
The document summarizes the Batch Normalization technique presented in the paper "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift". Batch Normalization aims to address the issue of internal covariate shift in deep neural networks by normalizing layer inputs to have zero mean and unit variance. It works by computing normalization statistics for each mini-batch and applying them to the inputs. This helps in faster and more stable training of deep networks by reducing the distribution shift across layers. The paper presented ablation studies on MNIST and ImageNet datasets showing Batch Normalization improves training speed and accuracy compared to prior techniques.
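To make the normalization step concrete, here is a minimal NumPy sketch of the batch-norm forward pass described above; the names (gamma, beta, eps) and the toy data are illustrative, not taken from the paper.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize a mini-batch to zero mean / unit variance, then scale and shift.

    x: (batch_size, features) activations for one layer
    gamma, beta: learned scale and shift parameters, shape (features,)
    """
    mu = x.mean(axis=0)                     # per-feature mini-batch mean
    var = x.var(axis=0)                     # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # normalized activations
    return gamma * x_hat + beta             # scale and shift restore expressiveness

# toy usage
x = np.random.randn(32, 4) * 3.0 + 5.0
y = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0), y.std(axis=0))        # roughly 0 mean, unit std per feature
```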
This document provides an introduction to deep learning. It discusses the history of machine learning and how neural networks work. Specifically, it describes different types of neural networks like deep belief networks, convolutional neural networks, and recurrent neural networks. It also covers applications of deep learning, as well as popular platforms, frameworks and libraries used for deep learning development. Finally, it demonstrates an example of using the Nvidia DIGITS tool to train a convolutional neural network for image classification of car park images.
Artificial Intelligence, Machine Learning, Deep Learning
The 5 myths of AI
Deep Learning in action
Basics of Deep Learning
NVIDIA Volta V100 and AWS P3
This is material created for a lab study session about the "Transformer", which forms the basis of recent NLP x Deep Learning research. Care was taken to cite the reference material accurately, but please point out any errors.
This document provides an overview of artificial neural networks and their application as a model of the human brain. It discusses the biological neuron, different types of neural networks including feedforward, feedback, time delay, and recurrent networks. It also covers topics like learning in perceptrons, training algorithms, applications of neural networks, and references key concepts like connectionism, associative memory, and massive parallelism in the brain.
This document provides an overview of convolutional neural networks and summarizes four popular CNN architectures: AlexNet, VGG, GoogLeNet, and ResNet. It explains that CNNs are made up of convolutional and subsampling layers for feature extraction followed by dense layers for classification. It then briefly describes key aspects of each architecture like ReLU activation, inception modules, residual learning blocks, and their performance on image classification tasks.
It was roughly 30 years ago that AI was not only a topic for science-fiction writers but also a major research field surrounded by huge hopes and investments. But the over-inflated expectations ended in a crash, followed by a period of absent funding and interest – the so-called AI winter. However, the last 3 years changed everything – again. Deep learning, a machine learning technique inspired by the human brain, successfully crushed one benchmark after another, and tech companies like Google, Facebook and Microsoft started to invest billions in AI research. "The pace of progress in artificial general intelligence is incredible fast" (Elon Musk – CEO Tesla & SpaceX), leading to an AI that "would be either the best or the worst thing ever to happen to humanity" (Stephen Hawking – Physicist).
What sparked this new hype? How is deep learning different from previous approaches? Are the advancing AI technologies really a threat to humanity? Let's look behind the curtain and unravel the reality. This talk will explore why Sundar Pichai (CEO Google) recently announced that "machine learning is a core transformative way by which Google is rethinking everything they are doing" and explain why "Deep Learning is probably one of the most exciting things that is happening in the computer industry" (Jen-Hsun Huang – CEO NVIDIA).
Either a new AI "winter is coming" (Ned Stark – House Stark), or this new wave of innovation might turn out to be the "last invention humans ever need to make" (Nick Bostrom – AI philosopher). Or maybe it's just another great technology helping humans to achieve more.
Transformer Architectures in Vision
[2018 ICML] Image Transformer
[2019 CVPR] Video Action Transformer Network
[2020 ECCV] End-to-End Object Detection with Transformers
[2021 ICLR] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
This project classifies images from the CIFAR-10 dataset, which consists of airplanes, dogs, cats, and other objects. We'll preprocess the images, then train a convolutional neural network on all the samples. The images need to be normalized and the labels need to be one-hot encoded.
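As a rough illustration of that preprocessing step, here is a hedged sketch assuming Keras is used to load CIFAR-10; the calls shown (cifar10.load_data, to_categorical) are standard Keras utilities, and the rest is ordinary NumPy.

```python
import numpy as np
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# normalize pixel values from [0, 255] to [0, 1]
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# one-hot encode the 10 class labels
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)

print(x_train.shape, y_train.shape)  # (50000, 32, 32, 3) (50000, 10)
```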
The document summarizes the Transformer neural network model proposed in the paper "Attention is All You Need". The Transformer uses self-attention mechanisms rather than recurrent or convolutional layers. It achieves state-of-the-art results in machine translation by allowing the model to jointly attend to information from different representation subspaces. The key components of the Transformer include multi-head self-attention layers in the encoder and masked multi-head self-attention layers in the decoder. Self-attention allows the model to learn long-range dependencies in sequence data more effectively than RNNs.
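For readers unfamiliar with the mechanism, here is a minimal NumPy sketch of scaled dot-product self-attention, the core operation the summary refers to (single head, no masking; all variable names are illustrative):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (seq_len, d_k) query, key, and value matrices."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                     # pairwise similarity, scaled
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over the keys
    return weights @ V                                  # weighted sum of values

# toy sequence of 5 tokens with 8-dimensional embeddings
x = np.random.randn(5, 8)
out = scaled_dot_product_attention(x, x, x)             # self-attention: Q = K = V = x
print(out.shape)                                        # (5, 8)
```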
Introduction to Recurrent Neural Networks (Knoldus Inc.)
The document provides an introduction to recurrent neural networks (RNNs). It discusses how RNNs differ from feedforward neural networks in that they have internal memory and can use their output from the previous time step as input. This allows RNNs to process sequential data like time series. The document outlines some common RNN types and explains the vanishing gradient problem that can occur in RNNs due to multiplication of small gradient values over many time steps. It discusses solutions to this problem like LSTMs and techniques like weight initialization and gradient clipping.
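As a small illustration of one of the remedies mentioned above, here is a hedged NumPy sketch of gradient clipping by global norm; the threshold value is arbitrary.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm=5.0):
    """Rescale a list of gradient arrays so their combined L2 norm is at most max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads

# toy gradients that have "exploded"
grads = [np.random.randn(10, 10) * 100, np.random.randn(10) * 100]
clipped = clip_by_global_norm(grads)
print(np.sqrt(sum(np.sum(g ** 2) for g in clipped)))  # <= 5.0
```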
This document provides an overview of convolutional neural networks (CNNs). It defines CNNs as multi-layer feedforward neural networks used to analyze visual images by processing grid-like data. CNNs recognize images through a series of layers, including convolutional layers that apply filters to detect patterns, ReLU layers that apply an activation function, pooling layers that downsample the feature maps, and fully connected layers that produce the final classification. CNNs are commonly used for applications like image classification, self-driving cars, activity prediction, video detection, and image conversion applications.
Teaching machines to see: the process of designing (datasets) with AI (DevFest DC)
This document summarizes a presentation about teaching machines to see and recognize images using artificial intelligence. It discusses Clarifai, a company that provides image and video recognition services using convolutional neural networks (CNNs). The presentation explains how CNNs can learn representations of images without needing manual feature engineering. It demonstrates how CNNs can be trained to detect safe versus unsafe image content. Finally, it discusses how CNNs have progressed over time to surpass traditional computer vision techniques for tasks like object detection.
Evolution of Deep Learning and new advancements (Chitta Ranjan)
Earlier known as neural networks, deep learning saw a remarkable resurgence in the past decade. Neural networks did not find enough adopters in the past century due to their limited accuracy in real-world applications (for various reasons) and difficult interpretation. Many of these limitations were resolved in recent years, and the field was re-branded as deep learning. Deep learning is now widely used in industry and has become a popular research topic in academia. Tracing its evolution and development is intriguing. In this presentation, we will learn how the issues in the last generation of neural networks were resolved, how the recent advanced methods grew out of the earlier works, and what the different components of deep learning models are.
Deep Learning with Python: Getting started and getting from ideas to insights in minutes.
PyData Seattle 2015
Alex Korbonits (@korbonits)
This presentation was given July 25, 2015 at the PyData Seattle conference hosted by PyData and NumFocus.
The document summarizes Junho Cho's presentation on image translation using generative adversarial networks (GANs). It discusses several papers on this topic, including pix2pix, which uses conditional GANs to perform supervised image-to-image translation on paired datasets; Domain Transfer Network (DTN), which uses an unsupervised method to perform cross-domain image generation; and CycleGAN and DiscoGAN, which can perform unpaired image-to-image translation using cycle-consistent adversarial networks. The presentation provides an overview of each method and shows examples of their applications to tasks such as semantic segmentation, style transfer, and domain adaptation.
- Researchers used a hierarchical convolutional neural network (CNN) optimized for object categorization performance to predict neural responses in higher visual cortex.
- The top layer of the CNN accurately predicted responses in inferior temporal (IT) cortex, and intermediate layers predicted responses in V4 cortex.
- This suggests that biological performance optimization directly shaped neural mechanisms in visual processing areas, as the CNN was not explicitly trained on neural data but emerged as predictive of responses in IT and V4.
The document discusses deep learning and learning hierarchical representations. It makes three key points:
1. Deep learning involves learning multiple levels of representations or features from raw input in a hierarchical manner, unlike traditional machine learning which uses engineered features.
2. Learning hierarchical representations is important because natural data lies on low-dimensional manifolds and disentangling the factors of variation can lead to more robust features.
3. Architectures for deep learning involve multiple levels of non-linear feature transformations followed by pooling to build increasingly abstract representations at each level. This allows the representations to become more invariant and disentangled.
[MMLab seminar 2016] Deep learning for human pose estimation (Wei Yang)
This document summarizes recent advances in deep learning approaches for human pose estimation. It describes early methods like DeepPose that used cascades of regressors. Later works introduced heatmap regression to capture spatial information. Convolutional Pose Machine and Stacked Hourglass networks further improved accuracy by incorporating stronger context modeling through deeper networks with larger receptive fields and intermediate supervision. These approaches demonstrate that both local appearance cues and modeling of global context and structure are important for accurate human pose estimation.
The document discusses sparse coding and its applications in visual recognition tasks. It introduces sparse coding as an unsupervised learning technique that learns bases to represent image patches. Sparse coding has been shown to outperform bag-of-words models with vector quantization on datasets like Caltech-101 and PASCAL VOC. The document also discusses extensions of sparse coding, including hierarchical sparse coding and supervised methods, that have achieved further improvements on image classification benchmarks.
Scalable image recognition model with deep embedding (捷恩 蔡)
This document proposes a method called deep embedding to perform scalable image recognition on mobile and IoT devices. Deep neural networks achieve high performance but require too many parameters to run on such limited devices. The method uses kernel-preserving projection to project features from a pretrained DNN into a lower-dimensional space, reducing parameters by 86% while dropping accuracy by only 1.12%. This allows image classification to be done directly on mobile and IoT devices using a small, efficient model encoded with high-level semantic information from DNNs.
Deep Learning: concepts and use cases, October 2018 (Julien SIMON)
An introduction to Deep Learning theory
Neurons & Neural Networks
The Training Process
Backpropagation
Optimizers
Common network architectures and use cases
Convolutional Neural Networks
Recurrent Neural Networks
Long Short Term Memory Networks
Generative Adversarial Networks
Getting started
1. The document discusses Convolutional Neural Networks (CNNs) for object recognition and scene understanding. It covers the biological inspiration from the human visual cortex, classical computer vision techniques, and the foundations of CNNs including LeNet and learning visual features.
2. CNNs apply successive layers of convolutions, nonlinear activations, and pooling to learn hierarchical representations of images. Modern CNN architectures have millions of parameters and dozens of layers to learn increasingly complex features.
3. CNNs have countless applications in areas like image classification, segmentation, detection, generation, and more due to their general architecture for learning spatial hierarchies of features from data.
Convolutional neural networks (CNNs) have traditionally been used for computer vision tasks but recent work has applied them to language modeling as well. CNNs treat sequences of words as signals over time rather than independent units. They use convolution and pooling layers to identify important n-gram features. Results show CNNs can be effective for classification tasks like sentiment analysis but have had less success with sequence modeling tasks. Overall, CNNs provide an alternative to recurrent neural networks for certain natural language processing problems and help understand each model's strengths and weaknesses.
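A minimal sketch of this idea, assuming Keras and arbitrary layer sizes: a sentence is treated as a sequence of word embeddings, 1-D filters slide over it to pick up n-gram-like features, and pooling keeps the strongest responses.

```python
from tensorflow.keras import layers, models

# toy text classifier: vocabulary of 10k words, sentences padded to 100 tokens
model = models.Sequential([
    layers.Input(shape=(100,)),
    layers.Embedding(input_dim=10_000, output_dim=128),
    layers.Conv1D(filters=64, kernel_size=3, activation="relu"),  # trigram-like features
    layers.GlobalMaxPooling1D(),           # keep the strongest response per filter
    layers.Dense(1, activation="sigmoid")  # e.g. positive vs. negative sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```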
Artificial Intelligence, Machine Learning and Deep Learning (Sujit Pal)
Slides for a talk Abhishek Sharma and I gave at the Gennovation tech talks (https://gennovationtalks.com/) at Genesis. The talk was part of outreach for the Deep Learning Enthusiasts meetup group in San Francisco. My part of the talk is covered in slides 19-34.
Modeling perceptual similarity and shift invariance in deep networks (NAVER Engineering)
Abstract: While it is nearly effortless for humans to quickly assess the perceptual similarity between two images, the underlying processes are thought to be quite complex. Despite this, the most widely used perceptual metrics today, such as PSNR and SSIM, are simple, shallow functions, and fail to account for many nuances of human perception. Recently, the deep learning community has found that features of the VGG network trained on ImageNet classification have been remarkably useful as a training loss for image synthesis. But how perceptual are these so-called "perceptual losses"? What elements are critical for their success? To answer these questions, we introduce a new dataset of human perceptual similarity judgments. We systematically evaluate deep features across different architectures and tasks and compare them with classic metrics. We find that deep features outperform all previous metrics by large margins on our dataset. More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised). Our results suggest that perceptual similarity is an emergent property shared across deep visual representations.
Despite their strong transfer performance, deep convolutional representations surprisingly lack a basic low-level property -- shift-invariance, as small input shifts or translations can cause drastic changes in the output. Commonly used downsampling methods, such as max-pooling, strided-convolution, and average-pooling, ignore the sampling theorem. The well-known signal processing fix is anti-aliasing by low-pass filtering before downsampling. However, simply inserting this module into deep networks degrades performance; as a result, it is seldom used today. We show that when integrated correctly, it is compatible with existing architectural components, such as max-pooling and strided-convolution. We observe increased accuracy in ImageNet classification, across several commonly-used architectures, such as ResNet, DenseNet, and MobileNet, indicating effective regularization. Furthermore, we observe better generalization, in terms of stability and robustness to input corruptions. Our results demonstrate that this classical signal processing technique has been undeservingly overlooked in modern deep networks.
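As a rough sketch of the signal-processing fix described above (low-pass filter, then subsample), assuming SciPy is available; the 3x3 binomial kernel is just one common choice of blur filter, not necessarily the one used in the paper.

```python
import numpy as np
from scipy.ndimage import convolve

def blur_then_downsample(feature_map, stride=2):
    """Anti-aliased downsampling: low-pass filter with a small blur kernel,
    then subsample with the given stride."""
    kernel = np.array([[1, 2, 1],
                       [2, 4, 2],
                       [1, 2, 1]], dtype=float)
    kernel /= kernel.sum()                      # normalized binomial (blur) filter
    blurred = convolve(feature_map, kernel, mode="reflect")
    return blurred[::stride, ::stride]          # strided subsampling

x = np.random.rand(8, 8)
print(blur_then_downsample(x).shape)            # (4, 4)
```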
This document provides an introduction to neural networks. It discusses how neural networks have recently achieved state-of-the-art results in areas like image and speech recognition and how they were able to beat a human player at the game of Go. It then provides a brief history of neural networks, from the early perceptron model to today's deep learning approaches. It notes how neural networks can automatically learn features from data rather than requiring handcrafted features. The document concludes with an overview of commonly used neural network components and libraries for building neural networks today.
Big Data Intelligence: from Correlation Discovery to Causal Reasoning (Wanjin Yu)
The document discusses using sequence-to-sequence learning models for tasks like machine translation, question answering, and image captioning. It describes how recurrent neural networks like LSTMs can be used in seq2seq models to incorporate memory. Finally, it proposes that seq2seq models can be enhanced by incorporating external memory structures like knowledge bases to enable capabilities like causal reasoning for question answering.
The document summarizes research on hierarchical models for object recognition, including:
- Hierarchies are inspired by the primate visual system which uses hierarchies of features of increasing complexity.
- Convolutional neural networks and the Neocognitron model use hierarchical architectures with layers of feature extraction.
- Learning hierarchical compositional representations allows constructing objects from reusable parts.
- Identifying images from brain activity showed it is possible to predict fMRI activity and identify images based on a voxel activity model.
13. ImageNet Challenge - ILSVRC
ImageNet classification error throughout the years and groups (chart). Source: Li Fei-Fei, ImageNet Large Scale Visual Recognition Challenge, 2014
14. Alexnet Architecture - 2012
Input → Conv → Relu → Pool → Conv → Relu → Pool → Conv → Relu → Conv → Relu → Conv → Relu → Pool → FC → Dropout → FC → Dropout → FC 1000
ImageNet Classification with Deep Convolutional Neural Networks. Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton. Advances in Neural Information Processing Systems 25, eds. F. Pereira, C.J.C. Burges, L. Bottou and K.Q. Weinberger, pp. 1097-1105, 2012.
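To make that layer ordering concrete, here is a hedged Keras sketch mirroring the Input → Conv/Relu/Pool blocks → FC/Dropout → FC 1000 sequence above. Filter counts and sizes roughly follow the Krizhevsky et al. paper, but details such as local response normalization and the original two-GPU split are omitted.

```python
from tensorflow.keras import layers, models

# AlexNet-like stack following the slide: Input -> (Conv/Relu/Pool) blocks
# -> FC -> Dropout -> FC -> Dropout -> FC 1000.
model = models.Sequential([
    layers.Input(shape=(227, 227, 3)),
    layers.Conv2D(96, 11, strides=4, activation="relu"),
    layers.MaxPooling2D(3, strides=2),
    layers.Conv2D(256, 5, padding="same", activation="relu"),
    layers.MaxPooling2D(3, strides=2),
    layers.Conv2D(384, 3, padding="same", activation="relu"),
    layers.Conv2D(384, 3, padding="same", activation="relu"),
    layers.Conv2D(256, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(3, strides=2),
    layers.Flatten(),
    layers.Dense(4096, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(4096, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1000, activation="softmax"),  # FC 1000: one output per ImageNet class
])
model.summary()
```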
15. Alexnet Architecture - 2012 (ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, Sutskever and Hinton, NIPS 25, 2012)
17. Traditional Approach To Image Classification
Input Image → Hand Extracted Features → Classifier → Object Label
18. Issues
• Who makes the features?
– Need an expert for each problem domain
• Which features?
– Are they the same for every problem type?
• How robust are these features to real images?
– Translation, rotation, contrast changes, etc.
20. Features Are Hierarchical
• A squirrel is a combination of fur, arms, legs, & a tail in specific proportions.
• A tail is made of texture, color, and spatial relationships
• A texture is made of oriented edges, gradients, and colors
21. Image Features
• A feature is something in the image or derived from it that's relevant to the task
• Edges
• Lines at different angles, curves, etc.
• Colors, or patterns of colors
• SIFT, SURF, HOG, GIST, ORB, etc.
32. Backpropagation
• Error propagates backward and it all works via (normally stochastic) gradient descent.
• (wave hands)
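A minimal sketch of what "error propagates backward via (stochastic) gradient descent" looks like for a single linear neuron with squared error; pure NumPy, with illustrative names and toy data.

```python
import numpy as np

rng = np.random.default_rng(0)
w, b = rng.normal(size=3), 0.0               # weights and bias of one linear neuron
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.3     # target function to recover

lr = 0.1
for step in range(200):
    i = rng.integers(len(X))                 # stochastic: one random sample per step
    pred = X[i] @ w + b                      # forward pass
    err = pred - y[i]                        # error at the output
    w -= lr * err * X[i]                     # gradient of 0.5 * err^2 w.r.t. w
    b -= lr * err                            # gradient w.r.t. b
print(w, b)                                  # approaches [2, -1, 0.5] and 0.3
```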
34. Alexnet Architecture - 2012 (ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, Sutskever and Hinton, NIPS 25, 2012)
37. Input: Pixels Are Just Numbers
https://medium.com/@ageitgey/machine-learning-is-fun-part-3-deep-learning-and-convolutional-neural-networks-f40359318721
39. Goals
• Need to detect the same feature anywhere in an image
• Reuse the same weights over and over
• What we really want is one neuron that detects a feature that we slide over the image
40. Neuron = Filter
• Act as detectors for some specific image feature
• Take images as inputs and produce image-like feature maps as outputs
41. Convolution
• Like sliding a matrix over the input and performing dot products
• It's all just matrix multiplication
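A hedged NumPy sketch of that "slide a matrix and take dot products" view of convolution (valid padding, stride 1, no kernel flipping, as is conventional in CNNs):

```python
import numpy as np

def conv2d(image, kernel):
    """Slide `kernel` over `image`, taking a dot product at each position."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            patch = image[r:r + kh, c:c + kw]
            out[r, c] = np.sum(patch * kernel)   # dot product of patch and filter
    return out

image = np.random.rand(6, 6)
edge_filter = np.array([[-1.0, 0.0, 1.0]] * 3)   # crude vertical-edge detector
print(conv2d(image, edge_filter).shape)          # (4, 4) feature map
```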
55. Alexnet Architecture - 2012
Input → Conv → Relu → Pool → Conv → Relu → Pool → Conv → Relu → Conv → Relu → Conv → Relu → Pool → FC → Dropout → FC → Dropout → FC 1000 (highlighted: Dropout)
(ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, Sutskever and Hinton, NIPS 25, 2012)
57. Let's Predict Something!
• We have all these features, how do we learn to label something based on them?
58. Alexnet Architecture - 2012
Input → Conv → Relu → Pool → Conv → Relu → Pool → Conv → Relu → Conv → Relu → Conv → Relu → Pool → FC → Dropout → FC → Dropout → FC 1000 (highlighted: Fully Connected)
(ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, Sutskever and Hinton, NIPS 25, 2012)
59. Fully Connected Layers
• Each neuron is connected to all inputs
• Standard multilayer neural net
• Learns non-linear combinations of the feature maps to make predictions
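In code, a fully connected layer is just a matrix multiply plus a bias followed by a non-linearity; a minimal NumPy sketch with illustrative shapes:

```python
import numpy as np

def fully_connected(x, W, b):
    """x: (n_inputs,) flattened feature maps; W: (n_outputs, n_inputs); b: (n_outputs,)."""
    return np.maximum(0.0, W @ x + b)       # every output sees every input, then ReLU

x = np.random.rand(256)                     # e.g. flattened feature maps from the conv stack
W = np.random.randn(64, 256) * 0.01
b = np.zeros(64)
print(fully_connected(x, W, b).shape)       # (64,)
```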
61. Alexnet Architecture - 2012
Input → Conv → Relu → Pool → Conv → Relu → Pool → Conv → Relu → Conv → Relu → Conv → Relu → Pool → FC → Dropout → FC → Dropout → FC 1000 (highlighted: FC 1000)
(ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, Sutskever and Hinton, NIPS 25, 2012)
62. Which Class Is It Again?
• FC-1000 gives us 1000 numbers, one per class, how do we compare them?
63. Softmax
• Multi-class version of the logistic function
• Outputs normalized class "probabilities"
• Takes m inputs and produces m outputs between zero and one, that sum to one
• Cross-entropy loss
• Differentiable
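A minimal NumPy sketch of the softmax and cross-entropy loss described on this slide, numerically stabilized by subtracting the maximum logit:

```python
import numpy as np

def softmax(logits):
    """Map m scores to m 'probabilities' in (0, 1) that sum to one."""
    z = logits - logits.max()                # for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the correct class."""
    return -np.log(probs[true_class])

logits = np.array([2.0, 1.0, 0.1])           # e.g. 3 of the FC-1000 outputs
p = softmax(logits)
print(p, p.sum())                            # probabilities summing to 1.0
print(cross_entropy(p, true_class=0))        # small loss when class 0 has high probability
```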
65. Alexnet Architecture - 2012 - Layer 1 (ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, Sutskever and Hinton, NIPS 25, 2012)
67. Alexnet Architecture - 2012 - Layer 2 (ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, Sutskever and Hinton, NIPS 25, 2012)
69. Alexnet Architecture - 2012 - Layer 3 (ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, Sutskever and Hinton, NIPS 25, 2012)
71. Alexnet Architecture - 2012 - Layers 4 and 5 (ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, Sutskever and Hinton, NIPS 25, 2012)
73. Alexnet Architecture - 2012 (ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, Sutskever and Hinton, NIPS 25, 2012)
74. Alexnet Architecture - 2012
Input → Conv → Relu → Pool → Conv → Relu → Pool → Conv → Relu → Conv → Relu → Conv → Relu → Pool → FC → Dropout → FC → Dropout → FC 1000
(ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, Sutskever and Hinton, NIPS 25, 2012)