These slides were used for a tutorial at the Deep Learning Summer School held at IBS, Daejeon. Based on recent work on detecting misleading headlines with deep neural networks (Yoon et al., AAAI 2019), they explain how RNNs and the attention mechanism work for text. Implementations based on TensorFlow 1.x are also introduced.
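As a rough illustration of the mechanism the slides cover, here is a minimal dot-product attention over RNN hidden states in plain NumPy. This is a sketch only; the tutorial's actual code is TensorFlow 1.x, and all names here are illustrative:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def dot_product_attention(hidden_states, query):
    """hidden_states: (T, d) RNN outputs over T time steps; query: (d,) vector,
    e.g. the final hidden state. Returns a context vector and the weights."""
    scores = hidden_states @ query      # (T,) relevance of each time step
    weights = softmax(scores)           # attention distribution over time steps
    context = weights @ hidden_states   # (d,) weighted sum of hidden states
    return context, weights

T, d = 5, 4
rng = np.random.default_rng(0)
H = rng.normal(size=(T, d))
context, w = dot_product_attention(H, H[-1])
print(w.sum())   # weights sum to 1
```

The context vector summarizes the whole sequence, weighted toward the steps most similar to the query, which is what lets the model focus on the relevant words of a headline.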
The document summarizes a deep learning programming course for artificial intelligence. The course covers topics like machine learning, deep learning, convolutional neural networks, recurrent neural networks, and applications of deep learning in medicine. It provides an overview of each week's topics, including an introduction to AI and machine learning in week 3, deep learning in week 4, and applications of AI in medicine in week 5.
Hands-on Tutorial of Machine Learning in Python (Chun-Ming Chang)
This document provides an overview of a hands-on tutorial on machine learning in Python. It discusses various machine learning algorithms including linear regression, logistic regression, and regularization. It explains key concepts such as model selection, cross-validation, preprocessing, and evaluation metrics. Examples are provided to illustrate linear regression, regularization techniques like Ridge and Lasso regression, and logistic regression. The document encourages participants to practice these techniques on exercises.
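To make the regularization contrast concrete, here is a minimal sketch using scikit-learn (assumed available; the data and alpha values are illustrative, not the tutorial's own exercises). Ridge (L2) shrinks all coefficients, while Lasso (L1) drives some exactly to zero:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
true_w = np.zeros(10)
true_w[:3] = [2.0, -1.0, 0.5]             # only 3 of 10 features are informative
y = X @ true_w + 0.1 * rng.normal(size=100)

ridge = Ridge(alpha=1.0).fit(X, y)        # L2 penalty: shrinks all coefficients
lasso = Lasso(alpha=0.1).fit(X, y)        # L1 penalty: zeros out weak coefficients

print("ridge nonzero coefs:", int((ridge.coef_ != 0).sum()))
print("lasso nonzero coefs:", int((lasso.coef_ != 0).sum()))
```

The sparsity of the Lasso solution is what makes it useful for feature selection, one of the model-selection themes the tutorial covers.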
[251] Implementing Deep Learning Using cuDNN (NAVER D2)
This document provides an overview of deep learning and implementation on GPU using cuDNN. It begins with a brief history of neural networks and an introduction to common deep learning models like convolutional neural networks. It then discusses implementing deep learning models using cuDNN, including initialization, forward and backward passes for layers like convolution, pooling and fully connected. It covers optimization issues like initialization and speeding up training. Finally, it introduces VUNO-Net, the company's deep learning framework, and discusses its performance, applications and visualization.
1) The document discusses recent advances in deep reinforcement learning algorithms for continuous control tasks. It examines factors like network architecture, reward scaling, random seeds, environments and codebases that impact reproducibility of deep RL results.
2) It analyzes the performance of algorithms like ACKTR, PPO, DDPG and TRPO on benchmarks like Hopper, HalfCheetah and identifies unstable behaviors and unfair comparisons.
3) Simpler approaches like nearest neighbor policies are explored as alternatives to deep networks for solving continuous control tasks, especially in sparse reward settings.
Final project, Machine Learning Having it Deep and Structured, NTU
- Rank 1/25 in peer review, original score: 16.2/17
- 2nd presentation prize (voted by audience)
This document provides information about Olivier Duchenne and his experience and qualifications. It summarizes his educational background which includes a Ph.D in Computer Science from ENS Paris/INRIA and a postdoctoral fellowship at Carnegie Mellon University. It also lists his professional experience which includes positions at NEC Labs, Intel, and as a co-founder of Solidware. The document then provides guidelines for machine learning and discusses challenges such as having enough and changing data. It explores the history and reasons for increased use of machine learning in computer vision.
The document outlines, in Korean, the steps for conducting a deep learning experiment. It introduces the speaker and their background in artificial intelligence and natural language processing. It then lists the steps, which include understanding neural networks; deep neural networks with techniques like pretraining, rectified linear units, and dropout; using the Theano library; writing deep learning code with Theano; and applying deep learning to natural language processing with libraries like Gensim. It also discusses the recent surge of interest in deep learning and example applications.
DRAW is a recurrent neural network proposed by Google DeepMind for image generation. It works by reconstructing images "step-by-step" through iterative applications of selective attention. At each step, DRAW samples from a latent space to generate values for its canvas. It uses an encoder-decoder RNN architecture with selective attention to focus on different regions of the image. This allows it to capture fine-grained details across the entire image.
Machine Learning, Deep Learning and Data Analysis Introduction (Te-Yen Liu)
The document provides an introduction and overview of machine learning, deep learning, and data analysis. It discusses key concepts like supervised and unsupervised learning. It also summarizes the speaker's experience taking online courses and studying resources to learn machine learning techniques. Examples of commonly used machine learning algorithms and neural network architectures are briefly outlined.
This document provides an overview of deep learning concepts including neural networks, activation functions, loss functions, training, optimization techniques like stochastic gradient descent, convolutional neural networks, recurrent neural networks, generative adversarial networks, and scaling deep learning with platforms like Amazon SageMaker. It also demonstrates deep learning models for tasks like image classification, machine translation, and image generation.
Deep Learning: Introduction & Chapter 5 Machine Learning Basics (Jason Tsai)
Lecture given for the Deep Learning 101 study group, with Frank Wu, on Dec. 9, 2016.
Reference: https://www.deeplearningbook.org/
Initiated by Taiwan AI Group (https://www.facebook.com/groups/Taiwan.AI.Group/)
Introduction to Deep Learning with Python (indico data)
A presentation by Alec Radford, Head of Research at indico Data Solutions, on deep learning with Python's Theano library.
The emphasis of the presentation is high performance computing, natural language processing (using recurrent neural nets), and large scale learning with GPUs.
Video of the talk available here: https://www.youtube.com/watch?v=S75EdAcXHKk
This document summarizes an introduction to deep learning with MXNet and R. It discusses MXNet, an open source deep learning framework, and how to use it with R. It then provides an example of using MXNet and R to build a deep learning model to predict heart disease by analyzing MRI images. Specifically, it discusses loading MRI data, architecting a convolutional neural network model, training the model, and evaluating predictions against actual heart volume measurements. The document concludes by discussing additional ways the model could be explored and improved.
Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da... (Cloudera, Inc.)
Processing large data requires new approaches to data mining: low (close to linear) complexity and stream processing. While in traditional data mining the practitioner is usually presented with a static dataset, which might have just a timestamp attached to it, from which to infer a model for predicting future/held-out observations, in stream processing the problem is often posed as extracting as much information as possible from the current data and converting it into an actionable model within a limited time window. In this talk I present an approach based on HBase counters for mining over streams of data, which allows for massively distributed processing and data mining. I consider overall design goals as well as HBase schema design dilemmas for speeding up the knowledge-extraction process. I also demo efficient implementations of Naive Bayes, Nearest Neighbor, and Bayesian learning on top of Bayesian counters.
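The counter-based formulation can be sketched compactly: training a Naive Bayes model is nothing but incrementing counters, which is exactly the operation HBase atomic counters provide at scale. Below is a minimal in-memory Python stand-in (illustrative only, not the talk's HBase implementation):

```python
import math
from collections import Counter, defaultdict

class CounterNaiveBayes:
    """Naive Bayes where training is pure counter increments."""
    def __init__(self):
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)   # label -> feature -> count
        self.vocab = set()

    def observe(self, features, label):
        """Consume one stream event: nothing but increments."""
        self.class_counts[label] += 1
        for f in features:
            self.feature_counts[label][f] += 1
            self.vocab.add(f)

    def predict(self, features):
        total = sum(self.class_counts.values())
        best, best_lp = None, float("-inf")
        for label, n in self.class_counts.items():
            lp = math.log(n / total)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for f in features:
                # Laplace smoothing so unseen features don't zero the product
                lp += math.log((self.feature_counts[label][f] + 1) / denom)
            if lp > best_lp:
                best, best_lp = label, lp
        return best

nb = CounterNaiveBayes()
nb.observe(["cheap", "pills"], "spam")
nb.observe(["cheap", "offer"], "spam")
nb.observe(["meeting", "notes"], "ham")
print(nb.predict(["cheap", "offer"]))   # spam
```

Because observation is a commutative increment, the same model can be trained by many distributed writers with no coordination, which is the core of the counters approach.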
[PR12] PR-036 Learning to Remember Rare Events (Taegyun Jeon)
This document summarizes a paper on learning to remember rare events using a memory-augmented neural network. The paper proposes a memory module that stores examples from previous tasks to help learn new rare tasks from only a single example. The memory module is trained end-to-end with the neural network on two tasks: one-shot learning on Omniglot characters and machine translation of rare words. The implementation uses a TensorFlow memory module that stores key-value pairs to retrieve examples similar to a query. Experiments show the memory module improves one-shot learning performance and handles rare words better than baselines.
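The retrieval step can be sketched as a nearest-neighbor lookup over stored keys. This is a simplified NumPy illustration in the spirit of the paper's memory module, not the actual TensorFlow implementation:

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

class KeyValueMemory:
    """Keys are normalized query embeddings; values are class labels.
    Reading returns the value of the most cosine-similar key."""
    def __init__(self, dim):
        self.keys = np.empty((0, dim))
        self.values = []

    def write(self, key, value):
        """Remember a single example -- enough for one-shot learning."""
        self.keys = np.vstack([self.keys, normalize(key)])
        self.values.append(value)

    def read(self, query):
        sims = self.keys @ normalize(query)       # cosine similarities
        return self.values[int(sims.argmax())]

mem = KeyValueMemory(3)
mem.write(np.array([1.0, 0.0, 0.0]), "rare-class-A")    # seen only once
mem.write(np.array([0.0, 1.0, 0.0]), "common-class-B")
print(mem.read(np.array([0.9, 0.1, 0.0])))   # rare-class-A
```

A query embedding near the single stored example of the rare class retrieves its label, which is how the module handles classes and words seen only once.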
Personalized News Recommendation System: (https://github.com/VishrutMehta/PersonalizedNewsRecommendationEngine)
A news recommendation approach that considers the distinctive characteristics of news items (e.g., news content, access patterns, named entities, popularity, and recency) when making recommendations. It also provides a principled framework for news selection based on the intrinsic properties of user interest, striking a good balance between global interest and personal interest. The system includes:
- News articles clustering
- Named entity extraction
- User profile construction (history, interests etc)
- Ranking news based on global/personal interests.
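The ranking step above might be sketched as a weighted blend of personal and global interest with a recency decay. All field names and the weighting scheme below are hypothetical, for illustration only:

```python
def score(article, user_interest, now, half_life_hours=6.0, alpha=0.7):
    """Blend personal and global interest, discounted by article age.
    `article` fields and `user_interest` (topic -> affinity) are hypothetical."""
    age_hours = (now - article["published"]) / 3600
    recency = 0.5 ** (age_hours / half_life_hours)        # exponential decay
    personal = user_interest.get(article["topic"], 0.0)   # from the user profile
    global_pop = article["popularity"]                    # e.g. normalized clicks
    return (alpha * personal + (1 - alpha) * global_pop) * recency

now = 1_000_000
fresh = {"topic": "sports", "published": now - 1800,  "popularity": 0.9}
stale = {"topic": "sports", "published": now - 86400, "popularity": 0.9}
profile = {"sports": 0.8}
print(score(fresh, profile, now) > score(stale, profile, now))   # True
```

The `alpha` knob is one simple way to express the global/personal balance the framework aims for; a half-day-old article is heavily discounted relative to a fresh one.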
Tensorflow, deep learning and recurrent neural networks without a PhD (DanielGinot)
This document discusses recurrent neural networks and batch normalization. It begins by introducing RNN cells and how they can be stacked into deep RNNs. It then discusses the LSTM cell and GRU cell variations of RNNs that are better able to learn long-term dependencies. The document next explains how batch normalization works, including its use in convolutional networks. It provides TensorFlow code examples for implementing batch normalization and language models using RNNs.
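The core of batch normalization is short enough to state directly: normalize each feature over the batch, then rescale and shift with learnable parameters. A minimal NumPy sketch of the forward pass (the document's own examples use TensorFlow):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """x: (batch, features); gamma, beta: learnable per-feature parameters."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)   # zero mean, unit variance per feature
    return gamma * x_hat + beta             # restore expressive power

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(64, 8))   # shifted, scaled activations
y = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))
print(y.mean(axis=0).round(6))   # approximately 0 for every feature
```

At inference time, running averages of `mu` and `var` replace the batch statistics; in convolutional networks the statistics are taken per channel rather than per feature.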
Part 2 of the Deep Learning Fundamentals Series, this session discusses Tuning Training (including hyperparameters, overfitting/underfitting), Training Algorithms (including different learning rates, backpropagation), Optimization (including stochastic gradient descent, momentum, Nesterov Accelerated Gradient, RMSprop, Adaptive algorithms - Adam, Adadelta, etc.), and a primer on Convolutional Neural Networks. The demos included in these slides are running on Keras with TensorFlow backend on Databricks.
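As an illustration of how the adaptive algorithms in this session work, here is a minimal NumPy sketch of a single Adam update (bias-corrected first and second moments) applied to a toy quadratic. It is a sketch only, not the Keras/TensorFlow implementation used in the demos:

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update. m, v: running first/second moment estimates; t: step index."""
    m = b1 * m + (1 - b1) * g            # momentum-like first moment
    v = b2 * v + (1 - b2) * g * g        # per-parameter second moment
    m_hat = m / (1 - b1 ** t)            # bias correction (moments start at 0)
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize f(w) = (w - 3)^2; the gradient is 2 * (w - 3).
w, m, v = np.array([0.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    g = 2 * (w - 3)
    w, m, v = adam_step(w, g, m, v, t)
print(w)   # close to 3
```

Note how the effective step size is roughly `lr` regardless of the gradient's magnitude, because the update is normalized by the second-moment estimate; that is the "adaptive" part.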
The document discusses fingerprint representations of chemical structures that can be used for tasks like searching, prediction, and clustering. It provides examples of generating, reading, manipulating, and comparing fingerprints in R using the fingerprint package. Fingerprints allow efficient comparison of large collections of molecules through bit vector representations. The document also discusses using fingerprints for predictive modeling of compound properties from high-throughput screening data and analyzing results.
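Fingerprint comparison typically reduces to the Tanimoto (Jaccard) coefficient over the set bits. A minimal Python sketch (the bit positions here are made up for illustration; the document's examples use the R fingerprint package):

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit positions."""
    intersection = len(fp_a & fp_b)
    union = len(fp_a | fp_b)
    return intersection / union if union else 0.0

# Hypothetical on-bit positions for three molecules
mol_a  = {12, 45, 101, 388, 512}
analog = {12, 45, 101, 388, 730}   # shares 4 of 6 distinct bits with mol_a
other  = {7, 99, 640}

print(tanimoto(mol_a, analog))   # 4/6 = 0.666...
print(tanimoto(mol_a, other))    # 0.0
```

Because the comparison is just set (bit-vector) intersection and union, screening millions of molecules against a query is cheap, which is what makes fingerprints suitable for large-collection search and clustering.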
This document describes a fast single-pass k-means clustering algorithm. It begins with an overview and rationale for using k-means clustering to enable fast search through large datasets. It then covers the theory behind clusterable data and k-means failure modes. The document outlines ball k-means and surrogate clustering algorithms. It discusses how to implement fast vector search methods like locality sensitive hashing. The document presents results on synthetic datasets and discusses applications like customer segmentation for a company with 100 million customers.
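For reference, plain Lloyd's k-means fits in a few lines; the sketch below uses a deterministic farthest-first seeding and is only illustrative. The talk's ball k-means and surrogate variants differ from it:

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Minimal Lloyd's algorithm with farthest-first seeding (a sketch)."""
    centers = [X[0]]
    for _ in range(k - 1):
        # next seed: the point farthest from all centers chosen so far
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(iters):
        # assignment step: nearest center for every point
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: each center becomes the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)),    # blob near the origin
               rng.normal(5, 0.3, (50, 2))])   # blob near (5, 5)
centers, labels = kmeans(X, 2)
```

The pairwise-distance assignment step is the O(nk) bottleneck the talk attacks with ball k-means, surrogates, and locality-sensitive hashing.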
The presentation is an introduction to AI (deep learning). The key to success with AI is “asking good questions.” The talk was given in the "Seminar in Information Systems and Applications" at National Tsing Hua University in Taiwan. During the talk, we discussed what a good question is, how we use the design-thinking process to improve our question, and how we can “answer” the question with deep learning.
The document discusses deep learning and artificial neural networks. It provides an agenda for topics covered, including gradient descent, backpropagation, activation functions, and examples of neural network architectures like convolutional neural networks. It explains concepts like how neural networks learn patterns from data using techniques like stochastic gradient descent to minimize loss functions. Deep learning requires large amounts of processing power and labeled training data. Common deep learning networks are used for tasks like image recognition, object detection, and time series analysis.
Human-in-a-loop: a design pattern for managing teams which leverage ML (Paco Nathan)
Big Data Spain, 2017-11-16
https://www.bigdataspain.org/2017/talk/human-in-the-loop-a-design-pattern-for-managing-teams-which-leverage-ml
Human-in-the-loop is an approach which has been used for simulation, training, UX mockups, etc. A more recent design pattern is emerging for human-in-the-loop (HITL) as a way to manage teams working with machine learning (ML). A variant of semi-supervised learning called _active learning_ allows for mostly automated processes based on ML, where exceptions get referred to human experts. Those human judgements in turn help improve new iterations of the ML models.
This talk reviews key case studies about active learning, plus other approaches for human-in-the-loop which are emerging among AI applications. We'll consider some of the technical aspects -- including available open source projects -- as well as management perspectives for how to apply HITL:
* When is HITL indicated vs. when isn't it applicable?
* How do HITL approaches compare/contrast with more "typical" use of Big Data?
* What's the relationship between use of HITL and preparing an organization to leverage Deep Learning?
* Experiences training and managing a team which uses HITL at scale
* Caveats to know ahead of time
* In what ways do the humans involved learn from the machines?
In particular, we'll examine use cases at O'Reilly Media where ML pipelines for categorizing content are trained by subject matter experts providing examples, based on HITL and leveraging open source [Project Jupyter](https://jupyter.org/) for implementation.
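The active-learning hand-off described above can be sketched as a confidence-based router: predictions the model is sure about flow through automatically, while uncertain ones are queued for a human expert whose judgements feed the next model iteration. The thresholds and names below are illustrative:

```python
def route(prob, low=0.2, high=0.8):
    """Confident predictions are automated; uncertain ones go to a human (HITL).
    `prob` is the model's probability for the positive class."""
    if prob >= high or prob <= low:
        return "auto"
    return "human"

# A batch of (model probability, item) pairs from a hypothetical classifier
batch = [(0.95, "a"), (0.50, "b"), (0.05, "c"), (0.65, "d")]

labeled_by_humans = []
for prob, item in batch:
    if route(prob) == "human":
        labeled_by_humans.append(item)   # expert labels retrain the model later

print(labeled_by_humans)   # ['b', 'd']
```

This is uncertainty sampling in its simplest form: human effort is spent exactly where the model is least sure, which is where a new label is most informative.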
Human-in-the-loop: a design pattern for managing teams that leverage ML (Paco Nathan)
Strata Singapore 2017 session talk 2017-12-06
https://conferences.oreilly.com/strata/strata-sg/public/schedule/detail/65611
Human-in-the-loop is an approach which has been used for simulation, training, UX mockups, etc. A more recent design pattern is emerging for human-in-the-loop (HITL) as a way to manage teams working with machine learning (ML). A variant of semi-supervised learning called active learning allows for mostly automated processes based on ML, where exceptions get referred to human experts. Those human judgements in turn help improve new iterations of the ML models.
This talk reviews key case studies about active learning, plus other approaches for human-in-the-loop which are emerging among AI applications. We’ll consider some of the technical aspects — including available open source projects — as well as management perspectives for how to apply HITL:
* When is HITL indicated vs. when isn’t it applicable?
* How do HITL approaches compare/contrast with more “typical” use of Big Data?
* What’s the relationship between use of HITL and preparing an organization to leverage Deep Learning?
* Experiences training and managing a team which uses HITL at scale
* Caveats to know ahead of time
* In what ways do the humans involved learn from the machines?
In particular, we’ll examine use cases at O’Reilly Media where ML pipelines for categorizing content are trained by subject matter experts providing examples, based on HITL and leveraging open source [Project Jupyter](https://jupyter.org/) for implementation.
Artificial Intelligence, Machine Learning and Deep Learning (Sujit Pal)
Slides for talk Abhishek Sharma and I gave at the Gennovation tech talks (https://gennovationtalks.com/) at Genesis. The talk was part of outreach for the Deep Learning Enthusiasts meetup group at San Francisco. My part of the talk is covered from slides 19-34.
This document discusses speaker diarization, which is the process of segmenting an audio stream into homogeneous segments according to speaker identity. It covers feature extraction methods like MFCCs, segmentation using Bayesian Information Criteria to compare Gaussian mixture models, and clustering algorithms like k-means and hierarchical agglomerative clustering. Dendrogram visualizations are used to identify natural speaker clusters. The overall goal is to partition audio recordings of discussions or debates into homogeneous segments to attribute speech segments to individual speakers.
Deep Dive on Deep Learning, June 2018 (Julien SIMON)
This document provides a summary of a presentation on deep learning concepts, common architectures, Apache MXNet, and infrastructure for deep learning. The agenda includes an overview of deep learning concepts like neural networks and training, common architectures like convolutional neural networks and LSTMs, a demonstration of Apache MXNet's symbolic and imperative APIs, and a discussion of infrastructure for deep learning on AWS like optimized EC2 instances and Amazon SageMaker.
TDC2017 | São Paulo - Trilha Java EE: How we figured out we had a SRE team at ... (tdc-globalcode)
This document discusses various techniques for feature engineering raw data to improve machine learning model performance. It describes transforming data through techniques like handling missing values, aggregation, binning, encoding categorical features, and feature selection. The goal of feature engineering is to represent the underlying problem to models in a way that results in better accuracy on new data.
The document proposes a formalized approach to software version numbering to address current problems. It analyzes why automatic document section numbering cannot be directly applied to software history. The proposed solution extends the section numbering approach formally, allowing version numbers to include arbitrary sets in addition to natural numbers. This addresses issues like inconsistent practices and ambiguous versioning that undermine automation and software quality.
Language translation with Deep Learning (RNN) with TensorFlow (S N)
This document provides an overview of a meetup on language translation with deep learning using TensorFlow on FloydHub. It will cover the language translation challenge, introducing key concepts like deep learning, RNNs, NLP, TensorFlow and FloydHub. It will then describe the solution approach to the translation task, including a demo and code walkthrough. Potential next steps and references for further learning are also mentioned.
Machine Learning: why we should know and how it works (Kevin Lee)
This document provides an overview of machine learning, including:
- An introduction to machine learning and why it is important.
- The main types of machine learning algorithms: supervised learning, unsupervised learning, and deep neural networks.
- Examples of how machine learning algorithms work, such as logistic regression, support vector machines, and k-means clustering.
- How machine learning is being applied in various industries like healthcare, commerce, and more.
LSA works by first separating text into sentences, then building a matrix of word counts in each sentence. It normalizes the matrix using tf-idf to weigh common words lower. SVD transforms the matrix into a conceptual space, where each sentence is represented as a vector. The top sentences are picked based on the absolute values of their vectors in this space.
This document provides a tutorial on recurrent neural networks (RNNs) and long short-term memory (LSTM) networks. It begins with introductions to the speaker and an overview of the content. It then explains RNNs and how they work sequentially through hidden layers. Issues like vanishing gradients are discussed. LSTMs are introduced as an advanced RNN that can retain information over longer periods of time using gates. Pre-trained word embeddings like Word2Vec, GloVe, and FastText are briefly explained. Finally, homework is assigned to build a sentiment analysis model using an LSTM and pre-trained word embeddings on a Chinese text dataset.
1) The document provides an overview of deep learning concepts including neural networks, convolutional neural networks, LSTM networks, and training processes. It discusses key algorithms like backpropagation and stochastic gradient descent.
2) It also introduces the MXNet deep learning framework and provides demos of using it for tasks like image classification, machine translation, and integrating AI with IoT devices.
3) Resources are listed for learning more about MXNet, deep learning, and Amazon Web Services AI services.
This document discusses the process of backpropagation in neural networks. It begins with an example of forward propagation through a neural network with an input, hidden and output layer. It then introduces backpropagation, which uses the calculation of errors at the output to calculate gradients and update weights in order to minimize the overall error. The key steps are outlined, including calculating the error derivatives, weight updates proportional to the local gradient, and backpropagating error signals from the output through the hidden layers. Formulas for calculating each step of backpropagation are provided.
Feature Engineering - Getting most out of data for predictive models - TDC 2017 (Gabriel Moreira)
How should data be preprocessed for use in machine learning algorithms? How can the most predictive attributes of a dataset be identified? What features can be generated to improve the accuracy of a model?
Feature Engineering is the process of extracting and selecting, from raw data, features that can be used effectively in predictive models. As the quality of the features greatly influences the quality of the results, knowing the main techniques and pitfalls will help you to succeed in the use of machine learning in your projects.
In this talk, we will present methods and techniques that allow us to extract the maximum potential from the features of a dataset, increasing the flexibility, simplicity, and accuracy of models. Topics include the analysis of feature distributions and correlations, and the transformation of numeric attributes (scaling, normalization, log-based transformation, binning), categorical attributes (one-hot encoding, feature hashing), temporal attributes (date/time), and free-text attributes (text vectorization, topic modeling).
Python, Scikit-learn, and Spark SQL examples will be presented, along with how to use domain knowledge and intuition to select and generate features relevant to predictive models.
Synthetic dialogue generation with Deep Learning (S N)
A walkthrough of a deep learning technique that generates TV scripts using a recurrent neural network. After being trained on a dataset, the model generates a completely new TV script for a scene. One will learn concepts around RNNs, NLP, and various deep learning techniques.
Technologies to be used:
Python 3, Jupyter, TensorFlow
Source code: https://github.com/syednasar/talks/tree/master/synthetic-dialog
What are algorithms? How can I build a machine learning model? In machine learning, training large models on a massive amount of data usually improves results. Our customers report, however, that training such models and deploying them is either operationally prohibitive or outright impossible for them. At Amazon, we created a collection of machine learning algorithms that scale to any amount of data, including k-means clustering for data segmentation, factorisation machines for recommendations, and time-series forecasting. This talk will discuss those algorithms, understand where and how they can be used, and our design choices.
This document provides an overview of deep learning and convolutional neural networks (CNNs). It discusses topics like artificial neural networks, CNN architecture including convolution, ReLU, pooling and fully connected layers. It also explains how CNNs work by scanning images through these layers and detecting patterns. Code examples in Python are given to demonstrate preprocessing data, building a CNN model, training it and making predictions. Key concepts like softmax and cross-entropy functions used for classification are also overviewed.
This is a single-day course that gives the learner hands-on experience with the basics of deep learning: the first half builds a network using Python/NumPy only, and the second half builds a more advanced network using TensorFlow/Keras.
At the end you will find a list of useful pointers to continue.
course git: https://gitlab.com/eshlomo/EazyDnn
This lecture provides an introduction to recurrent neural networks, which include a layer whose hidden state is aware of its values in a previous time-step.
These slides were used in the Master in Computer Vision Barcelona 2019/2020, in the Module 6 dedicated to Video Analysis.
http://pagines.uab.cat/mcv/
Similar to Detecting Misleading Headlines in Online News: Hands-on Experiences on Attention-based RNN
Positivity Bias in Customer Satisfaction Ratings (Kunwoo Park)
This slide is for my presentation at The Web Conference 2018 (also known as WWW). You can check the paper at the following link: https://dl.acm.org/authorize.cfm?key=N655133
Persistent Sharing of Fitness App Status on Twitter (Kunwoo Park)
Slides presented at Naver Labs on July 25, 2016, introducing in Korean the paper below, which was presented at CSCW '16.
Title: Persistent Sharing of Fitness App Status on Twitter
Author: Kunwoo Park, Ingmar Weber, Meeyoung Cha, Chul Lee
Introduction to research using social data: a study on the persistent use of fitness apps (Kunwoo Park)
Presented on December 18, 2015 at a meetup on everyday-life data hosted by Hanbit Media. As a case study of research using social data, it introduces a study on the persistent use of fitness apps. The paper is available at the following link: http://kunwpark.kr/wp-content/uploads/2015/12/cscw16_park.pdf
MS thesis defense - Gender swapping and its effects in MMORPGs (Kunwoo Park)
- The document discusses a study on the phenomenon of gender swapping in MMORPG games and its effects. It analyzes player demographic data from the Fairyland Online game.
- Females are found to participate in gender swapping more than males. Older and more experienced players also swap genders more. Gender swapping is found to affect in-game behaviors and social networks.
- Players' levels increase faster when their avatar gender matches their real gender, following real-world gender roles. Females profit more from trades, also following online gender roles. Social networks are affected by both real and virtual gender.
[DISC2013] Mood and Weather: Feeling the Heat? (Kunwoo Park)
The document discusses a study that analyzed the relationship between mood expressed on Twitter and weather conditions using a dataset of 38.1 million tweets from the United States in April 2009 along with corresponding weather data. The researchers found a weak positive correlation between temperature and positive sentiment across states on average, but also found some states showed negative correlations. The study concluded that how weather affects mood varies significantly by region due to cultural and economic factors.
[CS570] Machine Learning Team Project (I know what items really are) (Kunwoo Park)
This document summarizes a team's approach to predicting which items users might be interested in using a recommendation system. It describes extracting features from user and item metadata to train an SVM model, but this was too computationally expensive. Instead, the team used logistic regression with stochastic gradient descent. They tested features like age, gender and network similarities. Their combined model outperformed random prediction baselines on the KDD Cup 2012 Track 1 dataset.
Social Network Analysis: Methods and Applications, Chapter 9 (Kunwoo Park)
This document discusses structural equivalence and positional analysis in networks. It defines structural equivalence as two actors having identical ties to and from all other actors. It describes methods for measuring approximate structural equivalence using metrics like Euclidean distance and correlation. It also outlines techniques for partitioning actors into positions based on their structural equivalence, including CONCOR and hierarchical clustering algorithms. The document emphasizes that positional analysis aims to simplify network data by grouping similarly positioned actors.
Social Network Analysis: Methods and Applications, Chapters 6 and 7 (Kunwoo Park)
1) The chapter discusses methods for identifying cohesive subgroups within networks, including cliques, n-clans, and lambda sets.
2) Cohesive subgroups are defined as subsets of nodes that are relatively more strongly connected to each other than to nodes outside the subgroup.
3) Different methods take into account factors like reachability between nodes, nodal degree, and comparing the frequency of ties within versus outside the subgroup.
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac... (PriyankaKilaniya)
Energy efficiency has been important since the latter part of the last century. The main objective of this survey is to determine the level of energy efficiency knowledge among consumers. Two districts in Bangladesh were selected for a survey of households and showrooms, covering energy use and sellers as well. The survey data are used to derive regression equations from which energy efficiency knowledge can be predicted. The data are analyzed against five key criteria. The initial target was to find factors that help predict a person's energy efficiency knowledge. The survey finds that energy efficiency awareness among the people of the country is very low. Relationships between household energy use behaviors are estimated using a unique dataset of about 40 households and 20 showrooms in Bangladesh's Chapainawabganj and Bagerhat districts. Knowledge of energy consumption and energy efficiency technology options is found to be associated with household use of energy conservation practices. Household characteristics also influence household energy use behavior: younger household cohorts are more likely to adopt energy-efficient technologies and energy conservation practices, and place primary importance on energy saving for environmental reasons. Education also influences attitudes toward energy conservation in Bangladesh. Low-education households indicate they primarily save electricity for the environment, while high-education households indicate they are motivated by environmental concerns.
Digital Twins Computer Networking Paper Presentation.pptx (aryanpankaj78)
A Digital Twin in computer networking is a virtual representation of a physical network, used to simulate, analyze, and optimize network performance and reliability. It leverages real-time data to enhance network management, predict issues, and improve decision-making processes.
Applications of artificial Intelligence in Mechanical Engineering.pdf (Atif Razi)
Historically, mechanical engineering has relied heavily on human expertise and empirical methods to solve complex problems. With the introduction of computer-aided design (CAD) and finite element analysis (FEA), the field took its first steps towards digitization. These tools allowed engineers to simulate and analyze mechanical systems with greater accuracy and efficiency. However, the sheer volume of data generated by modern engineering systems and the increasing complexity of these systems have necessitated more advanced analytical tools, paving the way for AI.
AI offers the capability to process vast amounts of data, identify patterns, and make predictions with a level of speed and accuracy unattainable by traditional methods. This has profound implications for mechanical engineering, enabling more efficient design processes, predictive maintenance strategies, and optimized manufacturing operations. AI-driven tools can learn from historical data, adapt to new information, and continuously improve their performance, making them invaluable in tackling the multifaceted challenges of modern mechanical engineering.
Open Channel Flow: fluid flow with a free surfaceIndrajeet sahu
Open Channel Flow: This topic focuses on fluid flow with a free surface, such as in rivers, canals, and drainage ditches. Key concepts include the classification of flow types (steady vs. unsteady, uniform vs. non-uniform), hydraulic radius, flow resistance, Manning's equation, critical flow conditions, and energy and momentum principles. It also covers flow measurement techniques, gradually varied flow analysis, and the design of open channels. Understanding these principles is vital for effective water resource management and engineering applications.
Build the Next Generation of Apps with the Einstein 1 Platform.
Join Philippe Ozil for a workshop session that walks you through the details of the Einstein 1 platform, the importance of data for building artificial intelligence applications, and the various tools and technologies Salesforce offers to bring you the full benefits of AI.
Height and depth gauge linear metrology.pdf (q30122000)
Height gauges may also be used to measure the height of an object by using the underside of the scriber as the datum. The datum may be permanently fixed or the height gauge may have provision to adjust the scale, this is done by sliding the scale vertically along the body of the height gauge by turning a fine feed screw at the top of the gauge; then with the scriber set to the same level as the base, the scale can be matched to it. This adjustment allows different scribers or probes to be used, as well as adjusting for any errors in a damaged or resharpened probe.
We design and manufacture the Lubi Valves LBF series of Butterfly Valves for general utility water applications as well as for HVAC applications.
This presentation is about Food Delivery Systems and how they are developed using the Software Development Life Cycle (SDLC) and other methods. It explains the steps involved in creating a food delivery app, from planning and designing to testing and launching. The slide also covers different tools and technologies used to make these systems work efficiently.
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ... (DharmaBanothu)
The Network on Chip (NoC) has emerged as an effective solution for the intercommunication infrastructure within System on Chip (SoC) designs, overcoming the limitations of traditional methods that face significant bottlenecks. However, the complexity of NoC design presents numerous challenges related to performance metrics such as scalability, latency, power consumption, and signal integrity. This project addresses the issues within the router's memory unit and proposes an enhanced memory structure. To achieve efficient data transfer, FIFO buffers are implemented in distributed RAM and virtual channels for FPGA-based NoC. The project introduces advanced FIFO-based memory units within the NoC router, assessing their performance in a Bi-directional NoC (Bi-NoC) configuration. The primary objective is to reduce the router's workload while enhancing the FIFO internal structure. To further improve data transfer speed, a Bi-NoC with a self-configurable intercommunication channel is suggested. Simulation and synthesis results demonstrate guaranteed throughput, predictable latency, and equitable network access, showing significant improvement over previous designs.
AI in customer support: Use cases, solutions, development and implementation.pdf (mahaffeycheryld)
AI in customer support will integrate with emerging technologies such as augmented reality (AR) and virtual reality (VR) to enhance service delivery. AR-enabled smart glasses or VR environments will provide immersive support experiences, allowing customers to visualize solutions, receive step-by-step guidance, and interact with virtual support agents in real-time. These technologies will bridge the gap between physical and digital experiences, offering innovative ways to resolve issues, demonstrate products, and deliver personalized training and support.
https://www.leewayhertz.com/ai-in-customer-support/#How-does-AI-work-in-customer-support
Detecting Misleading Headlines in Online News: Hands-on Experiences on Attention-based RNN
1. Detecting Misleading
Headlines in Online News
Hands-on Experiences on Attention-based RNN
Kunwoo Park
24th June 2019
IBS deep learning summer school
2. Who am I
• Kunwoo Park (박건우)
• Postdoc, Data Analytics, QCRI (2018 - present)
• PhD, School of Computing, KAIST (2018)
with an outstanding dissertation award
• Research interest
• Computational social science using machine learning
• Text style transfer using RNN and RL
2
3. This talk will..
• Help the audience understand the attention mechanism for text
• Introduce a recent research effort on detecting misleading
news headlines using deep neural networks
• Explain the building blocks of the state-of-the-art model and
show how they are implemented in TensorFlow (1.x)
• Give hands-on experience in implementing a text classifier
using the attention mechanism
3
5. Target problem
• Detect incongruity between news headline and body text:
A news headline does not correctly represent the story
5
6. Overall model architecture
[Diagram: Input Layer → Embedding Layer → Deep Neural Net for Encoding Headline / Deep Neural Net for Encoding Body Text → Output Layer]
Goal: Detecting headline incongruity
from the textual relationship between body text and headline
6
7. Overall model architecture
[Diagram: Input Layer → Embedding Layer → Deep Neural Net for Encoding Headline / Deep Neural Net for Encoding Body Text → Output Layer]
7
8. Input data
• Transform words into vocabulary indices
headline:
[1, 30, 5, …, 9951, 2]
body text:
[ 875, 22, 39, …, 2481, 2,
9, 93, 9593, …, 431, 77,
1, 30, 5, …, 9951, 2, … ]
8
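The word-to-index transform above can be sketched in plain Python. The vocabulary, the `<pad>`/`<unk>` tokens, and the index assignments here are illustrative assumptions, not the paper's actual preprocessing:

```python
from collections import Counter

# Build a vocabulary from a corpus, most frequent words first.
# Index 0 is reserved for padding, index 1 for out-of-vocabulary words
# (these reserved slots are an assumption for this sketch).
def build_vocab(texts, max_size=10000):
    counts = Counter(w for t in texts for w in t.split())
    vocab = {"<pad>": 0, "<unk>": 1}
    for word, _ in counts.most_common(max_size - len(vocab)):
        vocab[word] = len(vocab)
    return vocab

# Map a text to the list of vocabulary indices, as on this slide.
def to_indices(text, vocab):
    return [vocab.get(w, vocab["<unk>"]) for w in text.split()]

vocab = build_vocab(["the headline misleads", "the body text is long"])
print(to_indices("the headline is fake", vocab))  # → [2, 3, 7, 1]
```

Unknown words ("fake" here) fall back to the `<unk>` index, which is why real pipelines cap the vocabulary size.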
9. Define input layer in TF
• Using tf.placeholder
• Parameters
• data type: tf.int32
• shape: [None, self.max_words]
• name: used for debugging
headline:
[1, 30, 5, …, 9951, 2]
body text:
[ 875, 22, 39, …, 2481, 2,
9, 93, 9593, …, 431, 77,
1, 30, 5, …, 9951, 2, … ]
9
10. Feed data into placeholders
• At the very end of the computation graph: usually at the optimizer
headline:
[1, 30, 5, …, 9951, 2]
body text:
[ 875, 22, 39, …, 2481, 2,
9, 93, 9593, …, 431, 77,
1, 30, 5, …, 9951, 2, … ]
10
19. Overall model architecture
[Diagram: Input Layer → Embedding Layer → Deep Neural Net for Encoding Headline / Deep Neural Net for Encoding Body Text → Output Layer]
19
21. Which neural net can we use?
• Feedforward neural network
• Convolutional network
• Recurrent neural network
21
22. Recurrent neural network
• Efficient in modeling inputs with sequential dependencies
(e.g., text, time-series, …)
• To make an output for each step, RNNs incorporate the current
input with what we have learned so far
https://colah.github.io/posts/2015-08-Understanding-LSTMs/
[Diagram: an unrolled RNN with inputs x_1, x_2, ⋯, x_t and hidden states h_1, h_2, ⋯, h_t]
22
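The recurrence on this slide can be sketched in a few lines of NumPy; the weights are random toy values (an assumption for illustration), whereas the actual model uses TensorFlow RNN cells:

```python
import numpy as np

# Vanilla RNN: each step combines the current input x_t with the
# running hidden state h_{t-1} ("what we have learned so far").
def rnn_forward(xs, W_x, W_h, b):
    h = np.zeros(W_h.shape[0])
    hs = []
    for x in xs:                       # iterate over time steps
        h = np.tanh(W_x @ x + W_h @ h + b)
        hs.append(h)
    return np.stack(hs)                # [time, hidden]

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 3))           # 5 time steps, input dim 3
W_x = rng.normal(size=(4, 3)) * 0.1    # hidden dim 4
W_h = rng.normal(size=(4, 4)) * 0.1
b = np.zeros(4)
hs = rnn_forward(xs, W_x, W_h, b)
print(hs.shape)  # (5, 4): one hidden state per time step
```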
25. Cell state
• A kind of memory unit that keeps past information
• LSTM has an ability to add or remove information to the state
by special structures called gates
25
26. Forget gate layer
• Decide what information we’re going to throw away from the
cell state
• 1: “completely keep this”. 0: “completely get rid of this”
26
27. Taking input
• What new information we’re going to store in the cell state
• Input gate layer: sigmoid decides which values we’ll update
• tanh layer: creates a vector of candidate values
27
28. Update cell state
• Combine the old cell state with the new candidate values through f_t and i_t
28
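Written out, the gate computations on slides 26-28 follow the standard LSTM formulation (notation as in the colah blog post linked earlier):

```latex
\begin{aligned}
f_t &= \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) && \text{forget gate}\\
i_t &= \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) && \text{input gate}\\
\tilde{C}_t &= \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) && \text{candidate values}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{cell state update}
\end{aligned}
```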
30. GRU
• Update gate: combination of forget gate and input gate
• Merge cell state and hidden state
30
31. Bi-directional RNN
• Combining two RNNs together:
One RNN reads inputs from left to right and
another RNN reads inputs from right to left
• Able to understand context better
https://towardsdatascience.com/understanding-bidirectional-rnn-in-pytorch-5bd25a5dd6631
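The bi-directional idea can be sketched in NumPy: run one pass left to right, one pass over the reversed input, and concatenate per time step. For brevity this sketch shares toy weights between the two directions; real bi-RNNs learn separate parameters for each:

```python
import numpy as np

def rnn_pass(xs, W_x, W_h, b):
    h = np.zeros(W_h.shape[0])
    out = []
    for x in xs:
        h = np.tanh(W_x @ x + W_h @ h + b)
        out.append(h)
    return np.stack(out)

rng = np.random.default_rng(1)
xs = rng.normal(size=(6, 3))                   # 6 steps, input dim 3
W_x, W_h, b = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), np.zeros(4)
fwd = rnn_pass(xs, W_x, W_h, b)                # reads left to right
bwd = rnn_pass(xs[::-1], W_x, W_h, b)[::-1]    # reads right to left, re-aligned
bi = np.concatenate([fwd, bwd], axis=-1)       # [time, 2 * hidden]
print(bi.shape)  # (6, 8)
```

The re-alignment (`[::-1]` after the backward pass) is what lets each position see context from both sides.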
32. How to build RNN in TF
1. Decide which cell you use for RNN
2. Decide the number of layers in RNN
3. Decide whether RNN is uni- or bi- directional
32
35. Uni-directional RNN
• tf.nn.dynamic_rnn()
• outputs: the sequence of hidden states
[batch_size, max_sequences, output_size]
• state: the final state
[batch_size, output_size]
35
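The relationship between the two return values can be illustrated with plain NumPy: for a uni-directional RNN, `state` is just the last slice of `outputs` along the time axis. (Shapes below follow the slide; note that for LSTM cells `state` is actually a (cell state, hidden state) tuple rather than a single tensor.)

```python
import numpy as np

# outputs: every hidden state, [batch_size, max_sequences, output_size]
batch_size, max_sequences, output_size = 2, 7, 4
rng = np.random.default_rng(2)
outputs = rng.normal(size=(batch_size, max_sequences, output_size))

# state: the final hidden state, [batch_size, output_size]
state = outputs[:, -1, :]
print(state.shape)  # (2, 4)
```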
39. Hierarchical RNN
Word-level RNN: h^p_t = f(h^p_{t-1}, x^p_t; θ_f)
Paragraph-level RNN: u_p = g(u_{p-1}, h^p_t; θ_g)
[Diagram: a word-level RNN runs over the words x^p_1 ⋯ x^p_t of each paragraph p; its final hidden states h^1_t, h^2_t, ⋯, h^p_t feed a paragraph-level RNN that produces u_1, u_2, ⋯, u_p]
39
40. Hierarchical RNN
Word-level RNN: h^p_t = f(h^p_{t-1}, x^p_t; θ_f)
Paragraph-level RNN: u_p = g(u_{p-1}, h^p_t; θ_g)
[Diagram: same hierarchical structure as the previous slide]
• The maximum length of each RNN can be reduced significantly
• Therefore, we can effectively train models with fewer parameters
40
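The hierarchical encoding above can be sketched in NumPy: a word-level RNN reduces each paragraph to its final hidden state, and a paragraph-level RNN runs over those states. Toy weights and dimensions are assumptions; the actual model uses TensorFlow RNN cells:

```python
import numpy as np

# Run an RNN over a sequence and keep only the final hidden state.
def rnn_last(xs, W_x, W_h):
    h = np.zeros(W_h.shape[0])
    for x in xs:
        h = np.tanh(W_x @ x + W_h @ h)
    return h

rng = np.random.default_rng(3)
paragraphs = [rng.normal(size=(n, 3)) for n in (5, 8, 6)]  # 3 paragraphs
W_x, W_h = rng.normal(size=(4, 3)), rng.normal(size=(4, 4))

# Word-level pass: one vector h_t^p per paragraph
para_vecs = [rnn_last(p, W_x, W_h) for p in paragraphs]

# Paragraph-level pass over those vectors produces u_p
V_x, V_h = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
u = rnn_last(para_vecs, V_x, V_h)
print(u.shape)  # (4,)
```

Each RNN now only unrolls over a paragraph (or over paragraphs), which is why the maximum unrolled length drops so sharply.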
46. Attention mechanism in NMT
https://aws.amazon.com/ko/blogs/machine-learning/train-neural-machine-translation-models-with-sockeye/
[Diagram: attention matrix aligning a source sentence (German) with a target sentence (English)]
46
47. Attention mechanism
47
• In detecting incongruity, we can pay a different amount of
attention to each paragraph
48. Attention mechanism
• In detecting incongruity, we can pay a different amount of attention to each paragraph
[Diagram: an RNN for the headline (target) produces u^H; an RNN for the body text (source) produces paragraph encodings u^B_1, u^B_2, ⋯, u^B_p; an alignment model scores each against u^H, and their weighted sum gives the context vector u^B]
48
49. Alignment model
• Calculate attention weights between each paragraph (source) and the headline (target):
a_H(s) = align(u^H, u^B_s) = exp(score(u^H, u^B_s)) / ∑_{s'} exp(score(u^H, u^B_{s'}))
[Diagram: the alignment model scores each paragraph encoding u^B_1 ⋯ u^B_p against the headline encoding u^H; the weighted sum gives u^B]
49
50. Alignment model
• Score is a content-based function (Luong et al., 2015)
[Diagram: same attention layout as the previous slide]
50
51. Context vector
• Represents the body text with different attention weights across paragraphs:
u^B = ∑_{s'} a_H(s') u^B_{s'}
[Diagram: the weighted sum of paragraph encodings u^B_1 ⋯ u^B_p forms the context vector u^B]
51
52. Attention in TF
• Using dot-product similarity
• bodytext_outputs: sequence of the hidden states
• headline_states: the last hidden state
52
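A NumPy sketch of this dot-product attention, with toy dimensions (the TF version operates on batched tensors): score each paragraph encoding against the headline encoding, softmax the scores into weights, and take the weighted sum.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(4)
u_H = rng.normal(size=4)              # headline encoding (last hidden state)
u_B_s = rng.normal(size=(3, 4))       # one encoding per paragraph

scores = u_B_s @ u_H                  # dot-product score per paragraph
a_H = softmax(scores)                 # attention weights, sum to 1
u_B = a_H @ u_B_s                     # weighted sum: body-text context vector
print(a_H.sum())                      # 1.0 up to floating point
```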
53. Overall model architecture
[Diagram: Input Layer → Embedding Layer → Deep Neural Net for Encoding Headline / Deep Neural Net for Encoding Body Text → Output Layer]
53
54. Measure similarity
• u^H: last hidden state of the RNN encoding the headline
• u^B: context vector that encodes the body text
• M: learnable similarity matrix, b: bias term
• σ: sigmoid function
p(label) = σ((u^H)⊤ M u^B + b)
54
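A NumPy sketch of this output layer, with random stand-ins for the learned M and b: a bilinear similarity between the headline encoding and the body-text context vector, squashed to a probability by the sigmoid.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(5)
u_H, u_B = rng.normal(size=4), rng.normal(size=4)
M = rng.normal(size=(4, 4))           # learnable similarity matrix (toy values)
b = 0.0                               # bias term

# p(label) = sigmoid((u_H)^T M u_B + b)
p_label = sigmoid(u_H @ M @ u_B + b)
print(0.0 < p_label < 1.0)  # True
```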
59. How to prevent overfitting?
• Add more data! (most effective if possible)
• Data augmentation: add noise to inputs to generalize better
• Regularization: L1/L2, Dropout, Early stopping
• Reduce architecture complexity
59
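As one concrete regularizer from the list, dropout can be sketched in NumPy in its "inverted" form: randomly zero activations during training and rescale the survivors so the expected activation is unchanged. The keep probability here is an illustrative choice:

```python
import numpy as np

# Inverted dropout: zero each unit with probability (1 - keep_prob),
# and divide the rest by keep_prob so the expected value is preserved.
def dropout(x, keep_prob, rng):
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob

rng = np.random.default_rng(6)
x = np.ones(1000)
y = dropout(x, keep_prob=0.8, rng=rng)
print(y.mean())  # close to 1.0 on average
```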
63. Attention for text classification
• Giving different weights over word sequences (Zhou et al., ACL 2016)
63
H = [h_1, h_2, ⋯, h_T]
M = tanh(H)
α = softmax(w⊤M)
r = Hα⊤
64. Attention for text classification
• Focusing on important sentence representations, each of which
pays a different amount of attention to words (Yang et al., NAACL 2016)
64
65. Attention for text classification
• Transfer learning on Transformer language model, trained by
multi-head attention (Vaswani et al., NIPS 2017, Devlin et al., NAACL 2019)
65