http://www.meetup.com/SF-Bay-ACM/events/227480571/
(see also YouTube for a recording of the presentation)
The talk will cover a brief review of neural network basics and the following types of neural network deep learning:
* autocorrelational - unsupervised learning for extracting features. He will describe how additional layers build complexity in the feature extraction.
* convolutional - how to detect shift-invariant patterns in various data sources. Horizontal shift-invariant detection applies to signals like speech recognition or IoT data. Horizontal and vertical shift invariance applies to images or videos, for faces or self-driving cars.
* details of applying deep net systems for continuous or real-time scoring
* reinforcement learning, or Q-learning - such as learning how to play Atari video games
* continuous-space word models - such as word2vec, skip-gram training, NLP understanding and translation
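The Q-learning bullet above can be made concrete with a tiny tabular example. This is a hypothetical sketch, not code from the talk: a 1-D corridor of 5 states where the agent starts at state 0 and earns a reward of 1 for reaching state 4 (actions: 0 = left, 1 = right). All names and hyperparameters here are illustrative choices.

```python
import random

random.seed(0)
n_states, n_actions = 5, 2
alpha, gamma, epsilon = 0.5, 0.9, 0.3   # learning rate, discount, exploration
Q = [[0.0] * n_actions for _ in range(n_states)]

def step(s, a):
    """Deterministic corridor environment: move left or right, clipped to bounds."""
    s2 = max(0, min(n_states - 1, s + (1 if a == 1 else -1)))
    reward = 1.0 if s2 == n_states - 1 else 0.0
    return s2, reward, s2 == n_states - 1

for episode in range(300):
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < epsilon:
            a = random.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda i: Q[s][i])
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# After training, "right" (action 1) should dominate in every non-terminal state
policy = [max(range(n_actions), key=lambda i: Q[s][i]) for s in range(n_states - 1)]
print(policy)
```

The same update rule, with the table replaced by a neural network approximating Q, is the core of the Atari-playing deep Q-networks the bullet refers to.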
Using Deep Learning to do Real-Time Scoring in Practical Applications - 2015-... - Greg Makowski
This talk covers 4 configurations of deep learning to solve different types of application needs. Also, strategies for speed up and real-time scoring are discussed.
Deep Learning with Python: Getting started and getting from ideas to insights in minutes.
PyData Seattle 2015
Alex Korbonits (@korbonits)
This presentation was given July 25, 2015 at the PyData Seattle conference hosted by PyData and NumFocus.
BigDL webinar - Deep Learning Library for Spark - Desmond Yuen
BigDL is a distributed deep learning library for Apache Spark* and a unified big data platform driving analytics and data science.
Deep learning is making news across the country as one of the most promising techniques in machine learning research. However, these methods are complex to implement, finicky to tune, and state-of-the-art accuracy is only achieved by a few experts in the field. In this session, we give a beginner-friendly explanation of deep learning using neural networks—what it is, what it does, and how; and introduce the concept of deep features, which allows you to obtain great performance with reduced running times and data set sizes. We then show how these methods can easily be deployed on GPU instances (G2) on Amazon EC2.
Yinyin Liu presents at the SD Robotics Meetup on November 8th, 2016. Deep learning has had great success in image understanding, speech, text recognition and natural language processing. Deep learning also has tremendous potential to tackle the challenges of robotic vision and sensorimotor learning in a robotic learning environment. In this talk, we will discuss how current and future deep learning technologies can be applied to robotic applications.
Suggestions:
1) For best quality, download the PDF before viewing.
2) Open at least two windows: one for the YouTube video, one for the screencast (link below), and optionally one for the slides themselves.
3) The YouTube video is shown on the first page of the slide deck; for the slides, just skip to page 2.
Screencast: http://youtu.be/VoL7JKJmr2I
Video recording: http://youtu.be/CJRvb8zxRdE (Thanks to Al Friedrich!)
In this talk, we take Deep Learning to task with real world data puzzles to solve.
Data:
- Higgs binary classification dataset (10M rows, 29 cols)
- MNIST 10-class dataset
- Weather categorical dataset
- eBay text classification dataset (8500 cols, 500k rows, 467 classes)
- ECG heartbeat anomaly detection
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Squeezing Deep Learning Into Mobile Phones - Anirudh Koul
A practical talk by Anirudh Koul on how to run Deep Neural Networks on memory- and energy-constrained devices like smartphones. Highlights some frameworks and best practices.
An introduction to Machine Learning (and a little bit of Deep Learning) - Thomas da Silva Paula
A 25-minute talk about Machine Learning and a little bit of Deep Learning. Starts with some basic definitions (Supervised and Unsupervised Learning). Then the basic functionality of neural networks is explained, ending up in Deep Learning and Convolutional Neural Networks.
Machine Learning Meetup that happened in Porto Alegre, Brazil.
A simplified way of approaching machine learning and deep learning from the ground up. The case for deep learning and an attempt to develop intuition for how/why it works. Advantages, state-of-the-art, and trends.
Presented at NYU Center for Genomics for NY Deep Learning Meetup
Urs Köster and Yinyin Liu present at ODSC West. Deep learning has had a major impact in the last three years. Imperfect interactions with machines, such as speech, natural language, or image processing, have been made robust by deep learning, and deep learning holds promise for finding usable structure in large datasets. The training process is lengthy and has proven difficult to scale due to constraints of existing compute architectures, and there is a need for standardized tools for building and scaling deep learning solutions. Urs will outline some of these challenges and how fundamental changes to the organization of computation and communication can lead to large advances in capabilities. Urs will dive deep into the field of Deep Learning and focus on Convolutional and Recurrent Neural Networks. The talk will be followed by a workshop highlighting neon™, an open source Python-based deep learning framework built from the ground up for speed and ease of use. This session is targeted at data scientists and researchers interested in taking deep learning to the next level of speed and scalability. The tutorial covers how to use neon™ to build and train Recurrent Neural Networks to generate text, and Convolutional Networks to perform image classification.
Deep learning goes beyond the traditional machine learning of big data and analytics. In this session, we will review the AWS offering, Amazon Machine Learning, and the AWS GPU-intensive family of servers that run native machine learning and deep-learning algorithms. We will also cover some basic deep-learning algorithms using open source software. Session sponsored by Day1 Solutions.
Language translation with Deep Learning (RNN) with TensorFlow - S N
The author is going to take you into the realm of Recurrent Neural Networks (RNNs). He will be training a sequence-to-sequence model on a dataset of English and French sentences that can translate new (unseen) sentences from English to French.
This will be a walkthrough of an end-to-end technique to train a deep RNN model. You will learn to build the various components necessary for a sequence-to-sequence model.
You will learn about the fundamentals of Deep Learning, mainly RNNs, that will be required in this solution. Familiarity with Deep Learning concepts would be handy, but most of the concepts used in this example will be covered during the demo.
Technologies to be used:
Python, Jupyter, TensorFlow, FloydHub
Source code: https://github.com/syednasar/deeplearning/blob/master/language-translation/dlnd_language_translation.ipynb
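A key step in any sequence-to-sequence pipeline like the one above is turning parallel English/French sentences into padded integer id sequences for the RNN encoder and decoder. This is an illustrative sketch of that preprocessing, not code from the linked notebook; the special-token names and example sentences are assumptions.

```python
# Special tokens: padding, decoder start, end-of-sequence, out-of-vocabulary
SPECIALS = ["<PAD>", "<GO>", "<EOS>", "<UNK>"]

def build_vocab(sentences):
    """Map each word (plus the special tokens) to a unique integer id."""
    words = sorted({w for s in sentences for w in s.lower().split()})
    return {w: i for i, w in enumerate(SPECIALS + words)}

def to_ids(sentence, vocab, max_len, decoder_target=False):
    """Convert a sentence to a fixed-length list of ids."""
    ids = [vocab.get(w, vocab["<UNK>"]) for w in sentence.lower().split()]
    if decoder_target:
        ids.append(vocab["<EOS>"])                   # decoder learns when to stop
    ids += [vocab["<PAD>"]] * (max_len - len(ids))   # pad to a fixed length
    return ids[:max_len]

english = ["new jersey is sometimes quiet", "california is usually quiet"]
french = ["new jersey est parfois calme", "california est généralement calme"]

en_vocab = build_vocab(english)
fr_vocab = build_vocab(french)

enc_input = to_ids(english[0], en_vocab, max_len=7)
dec_target = to_ids(french[0], fr_vocab, max_len=7, decoder_target=True)
print(enc_input)
print(dec_target)
```

The resulting id sequences are what gets fed to the embedding layer of the encoder and to the decoder's loss, regardless of which framework trains the model.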
...
Talk given at PYCON Stockholm 2015
Intro to Deep Learning + taking a pretrained ImageNet network, extracting features, and an RBM on top = 97% accuracy after 1 hour (!) of training (in the top 10% of the Kaggle cat vs. dog competition)
Zaikun Xu from the Università della Svizzera Italiana presented this deck at the 2016 Switzerland HPC Conference.
“In the past decade, deep learning, as a life-changing technology, has gained huge success on various tasks, including image recognition, speech recognition, machine translation, etc. Pioneered by several research groups - Geoffrey Hinton (U Toronto), Yoshua Bengio (U Montreal), Yann LeCun (NYU), Juergen Schmidhuber (IDSIA, Switzerland) - deep learning is a renaissance of neural networks in the big data era.
A neural network is a learning algorithm that consists of an input layer, hidden layers and an output layer, where each circle represents a neuron and each arrow connection is associated with a weight. The way a neural network learns is based on the difference between the output of the output layer and the ground truth, followed by calculating the gradients of this discrepancy w.r.t. the weights and adjusting the weights accordingly. Ideally, it will find weights that map input X to target y with error as low as possible.”
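The training loop the quote describes (compare output to ground truth, take gradients of the discrepancy w.r.t. the weights, adjust) can be shown numerically with a single neuron. This is an illustrative toy, not material from the deck; the data and learning rate are arbitrary choices.

```python
# One neuron: pred = w*x + b, squared-error loss, per-sample gradient descent.
data = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]   # exactly fit by y = 2x
w, b, lr = 0.0, 0.0, 0.1

for epoch in range(200):
    for x, y in data:
        pred = w * x + b
        err = pred - y            # discrepancy between output and ground truth
        # gradients of 0.5 * err**2 with respect to w and b
        w -= lr * err * x
        b -= lr * err

print(round(w, 3), round(b, 3))   # w approaches 2, b approaches 0
```

Deep learning repeats exactly this step across millions of weights, with backpropagation computing the per-weight gradients layer by layer.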
Watch the video presentation: http://insidehpc.com/2016/03/deep-learning/
See more talks in the Swiss Conference Video Gallery: http://insidehpc.com/2016-swiss-hpc-conference/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Applying your Convolutional Neural Networks - Databricks
Part 3 of the Deep Learning Fundamentals Series, this session starts with a quick primer on activation functions, learning rates, optimizers, and backpropagation. Then it dives deeper into convolutional neural networks, discussing convolutions (including kernels, local connectivity, strides, padding, and activation functions), pooling (or subsampling to reduce the image size), and the fully connected layer. The session also provides a high-level overview of some CNN architectures. The demos included in these slides run on Keras with a TensorFlow backend on Databricks.
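The convolution and pooling building blocks discussed in that session can be written out by hand. This is a pure-Python sketch for intuition; the session's actual demos use Keras on TensorFlow, and the kernel below is just a common vertical-edge example.

```python
def conv2d(image, kernel):
    """'Valid' 2-D convolution (really cross-correlation, as in most DL libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

def maxpool2x2(fmap):
    """Non-overlapping 2x2 max pooling: keep the strongest response per patch."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

# A 5x5 image whose right half is bright, and a Prewitt-style vertical-edge kernel
image = [[0, 0, 1, 1, 1]] * 5
kernel = [[1, 0, -1]] * 3

fmap = conv2d(image, kernel)
print(fmap)                 # [[-3, -3, 0], [-3, -3, 0], [-3, -3, 0]]
print(maxpool2x2(fmap))     # [[-3]]
```

The strong (negative) responses line up with the vertical edge, which is exactly the shift-invariant pattern detection a convolutional layer learns; pooling then shrinks the feature map.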
by Vikram Madan, Sr. Product Manager, AWS Deep Learning
In this workshop, we will cover deep learning fundamentals and focus on the powerful and scalable Apache MXNet open source deep learning framework. At the end of this tutorial you’ll be able to train your own deep neural network and fine-tune existing state-of-the-art models for image and object recognition. We’ll also dive deep into setting up your deep learning infrastructure on AWS and model deployment on AWS Lambda.
Scalable Data Science and Deep Learning with H2O
In this session, we introduce the H2O data science platform. We will explain its scalable in-memory architecture and design principles and focus on the implementation of distributed deep learning in H2O. Advanced features such as adaptive learning rates, various forms of regularization, automatic data transformations, checkpointing, grid-search, cross-validation and auto-tuning turn multi-layer neural networks of the past into powerful, easy-to-use predictive analytics tools accessible to everyone. We will present a broad range of use cases and live demos that include world-record deep learning models, anomaly detection tools and approaches for Kaggle data science competitions. We also demonstrate the applicability of H2O in enterprise environments for real-world customer production use cases.
By the end of the hands-on-session, attendees will have learned to perform end-to-end data science workflows with H2O using both the easy-to-use web interface and the flexible R interface. We will cover data ingest, basic feature engineering, feature selection, hyperparameter optimization with N-fold cross-validation, multi-model scoring and taking models into production. We will train supervised and unsupervised methods on realistic datasets. With best-of-breed machine learning algorithms such as elastic net, random forest, gradient boosting and deep learning, you will be able to create your own smart applications.
A local installation of RStudio is recommended for this session.
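The N-fold cross-validation mentioned above partitions the rows into disjoint folds, holding each fold out once. H2O handles this bookkeeping internally; this is just a minimal, hypothetical sketch of the index logic.

```python
def kfold_indices(n_rows, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    folds = [[] for _ in range(k)]
    for i in range(n_rows):
        folds[i % k].append(i)        # round-robin assignment to folds
    for held_out in range(k):
        train = [i for f, fold in enumerate(folds) if f != held_out
                 for i in fold]
        yield train, folds[held_out]

splits = list(kfold_indices(10, 5))
for train, valid in splits:
    print(sorted(valid))              # each row is validated exactly once
```

Scoring a model on each held-out fold and averaging gives the cross-validated metric used during hyperparameter optimization.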
Deep Learning is an area of machine learning and one of the most talked-about trends in business and computer science today.
In this talk, I will give a review of Deep Learning explaining what it is, what kinds of tasks it can do today, and what it probably could do in the future.
Separating Hype from Reality in Deep Learning with Sameer Farooqui - Databricks
Deep Learning is all the rage these days, but where does the reality of what Deep Learning can do end and the media hype begin? In this talk, I will dispel common myths about Deep Learning and help you decide whether you should practically use Deep Learning in your software stack.
I’ll begin with a technical overview of common neural network architectures like CNNs, RNNs, GANs and their common use cases like computer vision, language understanding or unsupervised machine learning. Then I’ll separate the hype from reality around questions like:
• When should you prefer traditional ML systems like scikit-learn or Spark.ML instead of Deep Learning?
• Do you no longer need to do careful feature extraction and standardization if using Deep Learning?
• Do you really need terabytes of data when training neural networks or can you ‘steal’ pre-trained lower layers from public models by using transfer learning?
• How do you decide which activation function (like ReLU, leaky ReLU, ELU, etc) or optimizer (like Momentum, AdaGrad, RMSProp, Adam, etc) to use in your neural network?
• Should you randomly initialize the weights in your network or use more advanced strategies like Xavier or He initialization?
• How easy is it to overfit/overtrain a neural network, and what are the common techniques to avoid overfitting (like L1/L2 regularization, dropout and early stopping)?
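For the activation-function question in the list above, the candidates differ mainly in how they treat negative inputs. A quick sketch (the alpha values are common defaults, not prescriptions from the talk):

```python
import math

def relu(x):
    """Zero for negatives: cheap, but negative units can 'die'."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Small negative slope keeps a gradient flowing for x < 0."""
    return x if x > 0 else alpha * x

def elu(x, alpha=1.0):
    """Smooth saturation toward -alpha for very negative inputs."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

for f in (relu, leaky_relu, elu):
    print(f.__name__, f(2.0), round(f(-2.0), 4))
```

All three agree on positive inputs; the choice is about gradient behavior on the negative side, which is why it interacts with initialization and optimizer choice.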
Natural Language Processing with CNTK and Apache Spark with Ali Zaidi - Databricks
Apache Spark provides an elegant API for developing machine learning pipelines that can be deployed seamlessly in production. However, one of the most intriguing and performant family of algorithms – deep learning – remains difficult for many groups to deploy in production, both because of the need for tremendous compute resources and also because of the inherent difficulty in tuning and configuring.
In this session, you’ll discover how to deploy the Microsoft Cognitive Toolkit (CNTK) inside of Spark clusters on the Azure cloud platform. Learn about the key considerations for administering GPU-enabled Spark clusters, configuring such workloads for maximum performance, and techniques for distributed hyperparameter optimization. You’ll also see a real-world example of training distributed deep learning algorithms for speech recognition and natural language processing.
This is a 1-hour presentation on Neural Networks, Deep Learning, Computer Vision, Recurrent Neural Networks and Reinforcement Learning. The talks later have links on how to run Neural Networks on
This is a 2-hour overview of the state of deep learning as of Q1 2017.
Starting with some basic concepts, it continues through basic network topologies, tools, HW/accelerators, and finally Intel's take on the different fronts.
This talk was presented at Startup Master Class 2017 (http://aaiitkblr.org/smc/) at Christ College, Bangalore. Hosted by the IIT Kanpur Alumni Association and co-presented by the IIT KGP Alumni Association, IITACB, PanIIT, IIMA and IIMB alumni.
My co-presenter was Biswa Gourav Singh. And contributor was Navin Manaswi.
http://dataconomy.com/2017/04/history-neural-networks/ - timeline for neural networks
Deep learning: the future of recommendations - Balázs Hidasi
An informative talk about deep learning and its potential uses in recommender systems. Presented at the Budapest Startup Safary, 21 April, 2016.
The breakthroughs of the last decade in neural network research and the rapid increase in computational power resulted in the revival of deep neural networks and of the field focusing on their training: deep learning. Deep learning methods have succeeded at complex tasks where other machine learning methods have failed, such as computer vision and natural language processing. Recently, deep learning has begun to gain ground in recommender systems as well. This talk introduces deep learning and its applications, with emphasis on how deep learning methods can solve long-standing recommendation problems.
Machine Learning is increasingly being used by organisations to move from analysis to prediction. This session covers how AWS and open source technology can help you perform both Deep Learning and Machine Learning.
Explore big data at the speed of thought with Spark 2.0 and SnappyData - Data Con LA
Abstract:
Data exploration often requires running aggregation/slice-and-dice queries on data sourced from disparate sources. You may want to identify distribution patterns, outliers, etc., and aid the feature selection process as you train your predictive models. As you begin to understand your data, you want to ask ad-hoc questions expressed through your visualization tool (which typically translates to SQL queries), study the results and iteratively explore the data set through more queries. Unfortunately, even when data sets fit in memory, large data set computations take time, breaking the train of thought and increasing time to insight. We know Spark can be fast through its in-memory parallel processing, but Spark 1.x isn't quite there. Spark 2.0 promises 10X better speed than its predecessor and ushers in some impressive improvements to interactive query performance. We first explore these advances - compiling the query plan, eliminating virtual function calls, and other improvements in the Catalyst engine. We compare the performance to other popular query processing engines by studying the Spark query plans. We then go through SnappyData (an open source project that integrates Spark with a database that offers OLTP, OLAP and stream processing in a single cluster), where we use smarter data colocation and synopsis data (e.g. stratified sampling) to dramatically cut down on the memory requirements as well as the query latency. We explain the key concepts in summarizing data using structures like stratified samples by walking through examples in Apache Zeppelin notebooks (an open source visualization tool for Spark) and demonstrate how you can explore massive data sets with just your laptop's resources while achieving remarkable speeds.
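The stratified-sampling idea behind the synopsis data mentioned above can be sketched in a few lines: sample within each group (stratum) so rare groups stay represented, then scale each stratum's sample back up to estimate an aggregate. This is a hypothetical illustration; SnappyData's actual synopses are far more sophisticated.

```python
import random

random.seed(1)
# 900 rows from a common stratum, 100 from a rare one (key, value) pairs
rows = [("US", 10)] * 900 + [("CH", 50)] * 100

def stratified_sample(rows, frac):
    """Sample frac of each stratum, never dropping a stratum entirely."""
    strata = {}
    for key, value in rows:
        strata.setdefault(key, []).append(value)
    sample = {}
    for key, values in strata.items():
        k = max(1, int(len(values) * frac))
        sample[key] = (random.sample(values, k), len(values))
    return sample

def estimate_sum(sample):
    """Scale each stratum's sample sum by the inverse of its sampling rate."""
    return sum(sum(vals) * (n / len(vals)) for vals, n in sample.values())

sample = stratified_sample(rows, frac=0.1)
print(estimate_sum(sample))   # true sum is 900*10 + 100*50 = 14000
```

A uniform 10% sample could easily under-represent the rare "CH" stratum; stratifying guarantees it contributes to the estimate, which is why approximate query engines favor it.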
Bio:
Jags Ramnarayan is a founder and the CTO of SnappyData. Previously, Jags was the Chief Architect for “fast data” products at Pivotal and served in the extended leadership team of the company. At Pivotal and previously at VMWare, he led the technology direction for GemFire and other distributed in-memory products.
Machine learning for IoT - unpacking the black box - Ivo Andreev
Have you ever considered Machine Learning a black box? It sounds like a kind of magic happening. Although it is just one among many available solutions, Azure ML has proved to be a great balance between flexibility, usability and affordable price. But how does Azure ML compare with the other ML providers? How do you choose the appropriate algorithm? Do you understand the key performance indicators and how to improve the quality of your models? The session is about understanding the black box and using it for IoT workloads and beyond.
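The "key performance indicators" the abstract alludes to can be computed by hand for a binary classifier. This sketch is illustrative and independent of Azure ML; the labels are made up.

```python
y_true = [1, 1, 1, 0, 0, 0, 0, 1]   # ground truth
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]   # model output

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

precision = tp / (tp + fp)   # of predicted positives, how many were right
recall = tp / (tp + fn)      # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)
print(precision, recall, round(f1, 3))   # → 0.75 0.75 0.75
```

Looking at precision and recall separately (rather than raw accuracy) is what exposes a model that, say, always predicts the majority class, which matters for imbalanced IoT anomaly data.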
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ... – DataStax Academy
This session covers our experience with using the Spark and Shark frameworks for running real-time queries on top of Cassandra data. We will start by surveying the current Cassandra analytics landscape, including Hadoop and Hive, and touch on the use of custom input formats to extract data from Cassandra. We will then dive into Spark and Shark, two memory-based cluster computing frameworks, and how they enable often dramatic improvements in query speed and productivity over the standard solutions today.
Discovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI Projects – Wee Hyong Tok
In this session, we will share cutting-edge deep learning innovations and present emerging trends in the AI community. This session is for data scientists and developers who have a keen interest in getting started on an AI project and want to learn the tools of the trade. We will draw on practical experiences from working on various AI projects, and share the key learnings and pitfalls.
Making NumPy-style and Pandas-style code faster and run in parallel. Continuum has been working on scaled versions of NumPy and Pandas for 4 years. This talk describes how Numba and Dask provide scaled Python today.
Deep learning continues to push the state of the art in domains such as computer vision, natural language understanding and recommendation engines. One of the key reasons for this progress is the availability of highly flexible and developer-friendly deep learning frameworks. During this workshop, we will provide a short background on deep learning, focusing on relevant application domains, and an introduction to the powerful and scalable deep learning framework Apache MXNet. At the end of this tutorial you’ll be able to train your own deep neural network and fine-tune existing state-of-the-art models for image and object recognition. We’ll also deep dive on setting up your deep learning infrastructure on AWS and model deployment on AWS Lambda.
Similar to Using Deep Learning to do Real-Time Scoring in Practical Applications (20)
Understanding Hallucinations in LLMs - 2023 09 29.pptx – Greg Makowski
Hallucinations are a current fundamental problem for LLMs.
For one example, in June this year in New York, attorneys did "research" on past cases with ChatGPT and turned it in to the judge as a brief. The opposing counsel reported to the judge that they could not find the cases. When the judge confronted the GPT-using attorneys, they stood behind their brief. The judge fined the firm $5,000.
Could this happen to you? YES. What can be done to avoid this in the future? I will answer.
In this talk, I will explain some fundamental areas of LLMs to show how and why hallucinations occur. To understand that, an introduction to how words, concepts and dialogs are represented will help.
Words were first represented as a point in an embedding space with Word2Vec in 2013. This could compress 10,000 words into a vector of 300 elements, with a word being represented as a point in the 300-dimensional embedding space. Not just words can be represented: longer text, such as entire books, can also be compressed into a type of embedding. In that situation, areas of embedding space relate to different genres, such as non-fiction, science fiction, children's fiction and so on. A new data point between training data points, when converted to text, would be a hallucination. In the area of "legal cases" in embedding space, if there is not an exact match, the text generation would try to generate what is plausible.
During an LLM conversation, the output of the previous text provides context for the next text, in the style of a recurrent neural network. The starting position of a conversation matters. Understanding that areas of weight space represent genres like "non-fiction" or other language aspects, and that the starting position of a discussion's time series matters, helps explain why prompt engineering helps. The conversation is represented in the activations over the 7B or 500B weights, a much larger space. During a conversation, learning is not occurring, but the neural network activations are changing. The neural network is not a database: even if you reach the exact set of weight activations from a training record, the exact text may not be regenerated, due to lossy compression.
ChatGPT does not use word embeddings. For implementation efficiency reasons, it is practical to break down what is embedded to about 50,000 items in a lookup table. Also, if we wanted to support proper nouns, like names, and dozens of languages, the number of words would be in the millions. ChatGPT and other LLMs instead use "tokens" for embedding. Examples of Byte Pair Encoding (BPE) and its process are given. The ChatGPT embedding is a vector of numbers 1,536 long for each token.
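One BPE merge step can be sketched as follows. This is a toy illustration only: a real tokenizer learns tens of thousands of merges over a huge corpus, and the three-word corpus here is invented:

```python
from collections import Counter

def most_common_pair(words):
    # words: list of token sequences, e.g. [['l','o','w'], ...]
    pairs = Counter()
    for w in words:
        pairs.update(zip(w, w[1:]))
    return pairs.most_common(1)[0][0]

def merge(words, pair):
    # Replace every occurrence of the pair with one merged token.
    a, b = pair
    out = []
    for w in words:
        merged, i = [], 0
        while i < len(w):
            if i + 1 < len(w) and (w[i], w[i + 1]) == (a, b):
                merged.append(a + b)
                i += 2
            else:
                merged.append(w[i])
                i += 1
        out.append(merged)
    return out

corpus = [list("lower"), list("lowest"), list("low")]
pair = most_common_pair(corpus)   # ('l', 'o') occurs in all three words
corpus = merge(corpus, pair)      # 'l','o' becomes the single token 'lo'
```

Repeating this merge step builds up a vocabulary of frequent sub-word tokens, which is what keeps the lookup table near 50,000 entries while still covering rare words and many languages.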
A solution for today is Retrieval Augmented Generation (RAG). As a brief introduction: you ask a question in English or another natural language, and it is matched against a large library or database of paragraphs from internal documents or websites.
Gives a background of Data Science and Artificial Intelligence, to better understand the current state of the art (SOTA) for Large Language Models (LLMs) and Generative AI, then starts a discussion on the direction things are going in the future.
A Successful Hiring Process for Data Scientists – Greg Makowski
I discuss one successful hiring process for data scientists. The current "best" algorithms are constantly changing. Also, it is not uncommon to need to learn about a new vertical market for a DS application. From my DS hiring experience over 2010-2022, I have focused on hiring people who are good at learning and adapting.
KDD 2019: Standardizing Data Science to Help Hiring – Greg Makowski
Initiative for Analytics and Data Science Standards (IADSS) workshop presentation at the ACM KDD conference (Association of Computing Machinery Knowledge Discovery in Databases).
Tales from an IP worker in consulting and software – Greg Makowski
A discussion around intellectual property, and leveraging consulting projects to build vertical application software. In my use case, data mining, artificial intelligence and intelligence augmentation are part of the value add. Also discussed: software frameworks, open source software, and clauses on prior inventions in hiring contracts.
Predictive Model and Record Description with Segmented Sensitivity Analysis (... – Greg Makowski
Describing a predictive data mining model can provide a competitive advantage for solving business problems with a model. The SSA approach can also provide reasons for the forecast for each record. This can help drive investigations into fields and interactions during a data mining project, as well as identify "data drift" between the original training data and the current scoring data. I am working on an open source version of SSA, first in R.
Production model lifecycle management 2016 09 – Greg Makowski
This talk covers the various stages of building data mining models, putting them into production and eventually replacing them. A common theme throughout is three attributes of predictive models: accuracy, generalization and description. I assert you can have it all, and having all three is important for managing the lifecycle. A subtle point is that this is a step toward developing embedded, automated data mining systems which can figure out themselves when they need to be updated.
SFbayACM ACM Data Science Camp 2015 10 24 – Greg Makowski
This is the slide deck for the 7th annual ACM Data Science Camp. It is an unconference, with content generated by the audience. For the primary event site, see http://www.sfbayacm.org/event/silicon-valley-data-science-camp-2015
How to Create 80% of a Big Data Pilot Project – Greg Makowski
When evaluating Open Source Software, or other software of a certain size or complexity, organizations frequently want to conduct a Pilot project, or Proof of Concept (POC). This talk describes a process to reduce the length of the Pilot, by leveraging configurations from performance testing to POC starting configurations.
Powering Realtime Decision Engines in Finance and Healthcare using Open Sour... – Greg Makowski
http://www.kdd.org/kdd2015/industry-gov-talks.html
Financial services and healthcare companies could be the biggest beneficiaries of big data. Their realtime decision engines can be vastly improved by leveraging the latest advances in big data analytics. However, these companies are challenged in leveraging Open Software Systems (OSS). This presentation covers how, in collaboration with financial services and healthcare institutions, we built an OSS project to deliver a realtime decisioning engine for their respective applications. I will address two key issues. First, I will describe the strategy behind our hiring process to attract millennial big data developers and the results of this endeavor. Second, I will recount the collaboration effort that we had with our large clients and the various milestones we achieved during that process. I will explain the goals regarding big data analysis that our large clients presented to us and how we accomplished those goals. In particular, I will discuss how we leveraged open source to deliver a realtime decisioning software product called Kamanja to these institutions. An advantage of developing applications in Kamanja is that it is already integrated with Hadoop, Kafka for realtime data streaming, and HBase and Cassandra for NoSQL data storage. I will talk about how these companies benefited from Kamanja and some of the challenges we had in the design of this software. I will provide quantifiable improvements in key metrics driven by Kamanja and interesting, unsolved problems/challenges that need to be addressed for faster and wider adoption of OSS by these companies.
Kamanja: Driving Business Value through Real-Time Decisioning Solutions – Greg Makowski
This is a first presentation of Kamanja, a new open-source real-time software product, which integrates with other big-data systems. See also links: http://www.meetup.com/SF-Bay-ACM/events/223615901/ and http://Kamanja.org to download, for docs or community support. For the YouTube video, see https://www.youtube.com/watch?v=g9d87rvcSNk (you may want to start at minute 33).
Heuristic design of experiments w meta gradient search – Greg Makowski
Once you have started learning about predictive algorithms, and the basic knowledge discovery in databases process, what is the next level of detail to learn for a consulting project?
* Give examples of the many model training parameters
* Track results in a "model notebook"
* Use a model metric that combines both accuracy and generalization to rank models
* How to strategically search over the model training parameters - use a gradient descent approach
* One way to describe an arbitrarily complex predictive system is by using sensitivity analysis
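One way the combined accuracy-and-generalization metric in the list above could look; the penalty weight, the function name and the toy runs are assumptions for illustration, not the talk's actual metric:

```python
def model_score(train_acc, valid_acc, gap_weight=0.5):
    # Rank by validation accuracy, penalized by the train/validation gap
    # (a simple proxy for poor generalization / overfitting).
    return valid_acc - gap_weight * max(0.0, train_acc - valid_acc)

# "Model notebook" entries: (training accuracy, validation accuracy).
runs = {"run_a": (0.99, 0.90),   # accurate on train, but a large gap
        "run_b": (0.93, 0.92)}   # slightly lower, but generalizes better

ranked = sorted(runs, key=lambda r: model_score(*runs[r]), reverse=True)
```

Under this metric run_b outranks run_a despite run_a's higher training accuracy, which is the point of combining both attributes when searching over training parameters.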
Three case studies deploying cluster analysis – Greg Makowski
Three case studies are discussed, that include cluster analysis as a component.
1) Customer description for a credit card attrition model, to describe how to talk to customers.
2) Hotel price optimization. Use clusters to find subsets of similar behavior, and optimize prices within each cluster. Use a neural net as the objective function.
3) Retail supply chain, planning replenishment using 52 week demand curves using thousands of seasonal "profiles" or clusters.
This presentation is a summary of section 2 (of 6) of the book "The 360º Leader" by best-selling author John C Maxwell. Challenges and solutions include:
* Tension (the pressure of being caught in the middle),
* Frustration (following an ineffective leader),
* Multi-Hat (one person – demands and expectations from all quarters),
* Ego (being hidden in the middle),
* Fulfillment (stuck in the middle, when would rather be in front),
* Vision (how to champion it when you did not create it),
* Influence (influencing others whom you do not manage).
This presentation covers material from John Maxwell's book, "The 360 Degree Leader." Specifically, the first of six sections is presented, including "The 7 Myths of Leading from the Middle of an Organization" and "5 Levels of Leadership Development."
2. Deep Learning - Outline
• Big Picture of 2016 Technology
• Neural Net Basics
• Deep Network Configurations for Practical Applications
– Auto-Encoder (i.e. data compression or Principal Components Analysis)
– Convolutional (shift invariance in time or space for voice, image or IoT)
– Real Time Scoring and Lambda Architecture
– Deep Net libraries and tools (Theano, Torch, TensorFlow, ... Kamanja)
– Reinforcement Learning, Q-Learning (i.e. beat people at Atari games, IoT)
– Continuous Space Word Models (i.e. word2vec)
4. Is Deep Learning Hype?
Is this just a “buzzword of the day or year?”
Is this improvement at the normal pace?
5. Is Deep Learning Hype?
Is this just a “buzzword of the day or year?”
Is this improvement at the normal pace?
NO! Not only a buzzword. This is a leap in the rate of improvement!
So what? Show me…
6. http://whatsnext.nuance.com/in-the-labs/what-is-deep-machine-learning/
Deep Learning Caused about an 18% / Year Reduction in Error in Speech Recognition (Nuance)
“Not only did DNNs drive error rates down at once … they promise a lot of potential for the years to come. It is no overstatement to say that DNNs were the single largest contributor to innovation across many of our products in recent years.”
7. http://whatsnext.nuance.com/in-the-labs/what-is-deep-machine-learning/
Deep Learning Caused about an 18% / Year Reduction in Error in Speech Recognition (Nuance), 2008-2015
What if Moore’s Law changed from 2X to 4X over the last 7 years because of a new technology advance!
8. Neural Net training is 10+ times faster on GPUs
The gaming market is pushing for faster GPU speeds
https://jonpeddie.com/publications/whitepapers/an-analysis-of-the-gpu-market
https://developer.nvidia.com/cudnn
9. Deep Learning - Outline
11. Advantages of a Net over Regression
[Scatter plot: two classes of points ($ and c) over field 1 × field 2.]
A neural net solution is “non-linear”: it can separate several regions which are not adjacent. Hidden nodes can be a line or a circle.
https://en.wikipedia.org/wiki/Artificial_neural_network
12. A Comparison of a Neural Net and Regression
A logistic regression formula:
Y = f( a0 + a1*X1 + a2*X2 + a3*X3 )
The a* are coefficients.
Backpropagation, cast in a similar form:
H1 = f(w0 + w1*I1 + w2*I2 + w3*I3)
H2 = f(w4 + w5*I1 + w6*I2 + w7*I3)
:
Hn = f(w8 + w9*I1 + w10*I2 + w11*I3)
O1 = f(w12 + w13*H1 + .... + w15*Hn)
On = ....
The w* are weights, AKA coefficients.
I1..In are input nodes or input variables.
H1..Hn are hidden nodes, which extract features of the data.
O1..On are the outputs, which group disjoint categories.
Look at the ratio of training records vs. free parameters (complexity, regularization).
[Diagrams: regression drawn as a one-layer net (Y from X1, X2, X3 plus bias a0), and a neural net with input nodes I1..I3, a bias, hidden nodes H1, H2, ... and an output, with weights w1, w2, w3 on the connections.]
The dot product is cosine similarity, used broadly. Tensors are matrices of N dimensions.
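The two formula families above can be sketched directly in NumPy. The sizes, weights and inputs here are toy values invented for illustration, not from the deck:

```python
import numpy as np

np.random.seed(0)

def f(z):
    # Logistic activation, used by both models below.
    return 1.0 / (1.0 + np.exp(-z))

# Logistic regression: Y = f(a0 + a1*X1 + a2*X2 + a3*X3)
a = np.array([0.1, 0.4, -0.3, 0.2])   # a0 (bias) and a1..a3
x = np.array([1.0, 2.0, 0.5])         # X1..X3
y_reg = f(a[0] + a[1:] @ x)           # one dot product, one line

# Backpropagation form: each hidden node Hi is its own dot product
# (a feature extractor), and the output combines the hidden activations.
W_h = np.random.randn(4, 3) * 0.1     # 4 hidden nodes x 3 inputs
b_h = np.zeros(4)                     # hidden biases (w0, w4, w8, ...)
w_o = np.random.randn(4) * 0.1        # output weights (w12..w15)
h = f(W_h @ x + b_h)                  # H1..H4
y_net = f(w_o @ h)                    # O1
```

The regression has 4 free parameters; the small net already has 20, which is why the slide says to watch the ratio of training records to free parameters.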
13. Think of Separating Land vs. Water
• 1 line: Regression (more errors)
• 5 hidden nodes in a neural network
• Decision tree: 12 splits (more elements, less computation)
Different algorithms use different basis functions:
• One line
• Many horizontal & vertical lines
• Many diagonal lines
• Circles
Q) What is too detailed? “Memorizing the high tide boundary” and applying it at all times.
14. Deep Learning - Outline
http://deeplearning.net/
http://www.kdnuggets.com/
http://www.analyticbridge.com/
15. Leading up to an Auto Encoder
• Supervised Learning
– Regression (one layer, one line, one dot-product): 50 inputs → 1 output
– Possible nets:
• 256 → 120 → 1
• 256 → 120 → 5 (trees, regression, SVM & most algorithms are limited to 1 output)
• 256 → 120 → 60 → 1 (can try 2 hidden layers, 3 sets of weights)
• 256 → 180 → 120 → 60 → 1 (start getting into training stability problems with 1990’s training processes)
• Unsupervised Learning
– Clustering (traditional unsupervised): 60 inputs (no output target); produce 1-2 new fields (cluster ID & distance)
16. Auto Encoder (like data compression)
Relate input to output, through a compressed middle.
At each step of training, only train the black connections.
Step 1: train the 1st hidden layer (tensor): 256 → 180 → 256.
Step 2: train the 2nd hidden layer (tensor): 256 → 180 → 120 → 180 → 256.
Called an “Auto Encoder” because the input values = the target values. Unsupervised: there are no additional target values. “Data compression” because it compresses 256 numbers into 180 numbers.
17. Auto Encoder (like data compression)
Relate input to output, through a compressed middle.
• Supervised Learning
– Regression, Tree or Net: 50 inputs → 1 output
– Possible nets:
• 256 → 120 → 1
• 256 → 120 → 5 (trees, regressions, SVD and most are limited to 1 output)
• 256 → 120 → 60 → 1
• 256 → 180 → 120 → 60 → 1 (start getting long training times to stabilize, or may not finish; the BREAKTHROUGH provided by DEEP LEARNING)
• Unsupervised Learning
– Clustering (traditional unsupervised): 60 inputs (no target); produce 1-2 new fields (cluster ID & distance)
– Unsupervised training of a net, assigning (target record == input record): AUTO-ENCODING
– Train the net in stages: 256 → 180 → 256, then insert 120 in the middle (256 → 180 → 120 → 180 → 256), and so on. Because of symmetry, the mirrored weights only need to be updated once.
– Add a supervised layer at the end to forecast 10 target categories (→ 10): 4 hidden layers with unsupervised training, 1 layer at the end with supervised training.
https://en.wikipedia.org/wiki/Deep_learning
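A minimal sketch of one auto-encoder training stage in NumPy, using the slide's 256 → 180 sizes with tied (mirrored) weights. The toy data, learning rate and epoch count are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_in, n_hid = 256, 180                           # sizes from the slide
W = rng.normal(scale=0.05, size=(n_hid, n_in))   # one mirrored weight matrix
b_h = np.zeros(n_hid)                            # hidden-layer bias
b_o = np.zeros(n_in)                             # reconstruction bias

X = rng.random((50, n_in))                       # toy stand-in for records
lr = 0.5
losses = []
for epoch in range(50):
    H = sigmoid(X @ W.T + b_h)                   # encode: 256 -> 180
    X_hat = sigmoid(H @ W + b_o)                 # decode: 180 -> 256, same W
    err = X_hat - X                              # target values = input values
    losses.append(float((err ** 2).mean()))
    d_out = err * X_hat * (1 - X_hat)            # output-layer delta
    d_hid = (d_out @ W.T) * H * (1 - H)          # hidden-layer delta
    # Tied weights: encoder and decoder gradients combine into ONE update,
    # so the mirrored weights are only updated once per pass.
    W -= lr * (H.T @ d_out + d_hid.T @ X) / len(X)
    b_o -= lr * d_out.mean(axis=0)
    b_h -= lr * d_hid.mean(axis=0)
```

Stage 2 would freeze this layer, encode the data into 180-value codes, and train a 180 → 120 → 180 auto-encoder on those codes in the same way.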
18. Auto Encoder (like data compression), With Supervised Layers on Top
The unsupervised output is like cluster output: only large values are a match (not a distance).
Train the supervised layers on top with regular back propagation, using the unsupervised nodes as input.
[Diagram: an unsupervised stack 256 → 180 → 120 → 120 → …, then supervised layers (e.g. 50 nodes) down to 1, 2, 10 or more outputs.]
The target is specific to the problem: fraud risk * $; cat, dog, human, other.
19. Auto Encoder: How it can be generally used to solve problems
• Add a supervised layer to forecast 10 target categories (→ 10)
– 4 hidden layers trained with unsupervised training
– 1 new layer, trained with supervised learning
• Outlier detection
– The “activation” at each of the 120 output nodes indicates the “match” to that cluster or compressed feature
– When scoring new records, can detect outliers with a process like:
If ( max_output_match < 0.333 ) then suspected outlier
• How is it like PCA?
– Individual hidden nodes in the same layer are “different” or “orthogonal”
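The outlier rule above as a scoring sketch. The 0.333 threshold is from the slide; the activations and function name are hypothetical:

```python
import numpy as np

def suspected_outliers(activations, threshold=0.333):
    # activations: (n_records, n_code_nodes) from the auto-encoder's top
    # layer. A record whose BEST match is still weak is a suspected outlier.
    return activations.max(axis=1) < threshold

# Three toy records scored against three code nodes.
acts = np.array([[0.90, 0.10, 0.20],
                 [0.20, 0.25, 0.30],   # weak everywhere: suspected outlier
                 [0.10, 0.80, 0.40]])
flags = suspected_outliers(acts)        # flags only the middle record
```

Note the rule looks at the maximum activation, not a distance: a record can be moderately "near" several clusters and still be an outlier if it strongly matches none of them.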
20. Fraud Detection Example using Deep Learning – auto encoders
• Unsupervised Learning of Normal Behavior (Outlier Detection)
– May want to preprocess transaction data in the context of the person’s past normal behavior:
• 0..1, where 1 is the most SURPRISING for that person to act
• 0..1, where 1 is the most RISKY of fraud
• General, descriptive attributes that can be used for interactions
– Filter the most surprising & risky records out of the training data; we want the net to learn “normal” records
– Train 5-10 layers deep; end up with 50 to 100+ nodes at the end
– Score records on membership in the final nodes
• Transactions that are far from all final nodes are candidates for outliers
• Validate with existing surprising & risky records. Add application post-processing
• Supervised Learning
– Add two layers on top; train to predict normal vs. surprising/risky labeled data (if it is available)
21. Deep Learning - Outline
22. Internet of Things (IoT) is heavily signal data
http://www.datasciencecentral.com/profiles/blogs/the-internet-of-things-data-science-and-big-data
23. Convolutional Neural Net (CNN): Enables detecting shift invariant patterns
In speech and image applications, patterns vary by size and can be shifted right or left.
Challenge: finding a bounding box for a pattern is almost as hard as detecting the pattern.
Neural nets can be explicitly trained to provide an FFT (Fast Fourier Transform) to convert data from the time domain to the frequency domain, but typically an explicit FFT is used.
[Figure: Internet of Things signal data.]
24. Convolutional Neural Net (CNN): Enables detecting shift invariant patterns
In speech and image applications, patterns vary by size and can be shifted right or left.
Challenge: finding a bounding box for a pattern is almost as hard as detecting the pattern.
Solution: use a sliding convolution to detect the pattern.
A CNN can use very long observational windows, up to 400 ms, for long context.
25. Convolution – Shift Horizontal
• The SAME 25 WEIGHTS FEED INTO EACH OUTPUT
• The backpropagation weight update is averaged
• Otherwise NO convolution, and HUGE complexity!
Max pooling layer output = 1.2
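A one-dimensional sketch of the sliding convolution plus max pooling described above; the 5-tap pattern and the toy signal are invented for illustration:

```python
import numpy as np

def conv1d_valid(signal, weights):
    # One shared weight pattern slides across the signal: the SAME weights
    # feed every output position, which is what gives shift invariance.
    k = len(weights)
    return np.array([signal[i:i + k] @ weights
                     for i in range(len(signal) - k + 1)])

pattern = np.array([1.0, 2.0, 1.0, -1.0, -2.0])  # the shared weights
signal = np.zeros(20)
signal[7:12] = pattern / 2                        # bury the pattern mid-signal

detections = conv1d_valid(signal, pattern)        # one score per shift
pooled = detections.max()                         # max pooling keeps the best
```

Wherever the pattern occurs, the pooled output fires. Shifting the pattern elsewhere in the signal changes which position peaks but not the pooled value, so no bounding box is needed.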
28. Convolution – 3 Weight Patterns, Shifted 2D
[Figure: input pixels, audio, video or IoT signal (14 × 7); hidden layer 1 weights per convolution pattern; hidden layer 1 output sections per convolution, 3 × (10 × 3), the detection layer.]
Convolutions can be over 3+ dimensions (video frames, time invariance).
Max pooling layer outputs = 0.8, 1.0 and 0.9 (one per weight pattern).
29. Convolution Neural Net (CNN): The same low-level features can support different outputs
http://stats.stackexchange.com/questions/146413/why-convolutional-neural-networks-belong-to-deep-learning
The previous slides showed training this hidden 1 layer. The same training process applies to the later hidden layers, one at a time. Think of fraud detection: higher-level node patterns.
30. Convolution Neural Net: from LeNet-5
“Gradient-Based Learning Applied to Document Recognition,” Proceedings of the IEEE, Nov 1998. Yann LeCun, Leon Bottou, Yoshua Bengio and Patrick Haffner.
Yann LeCun is now Director, Facebook AI Research: http://yann.lecun.com/
Can do some size invariance, but it adds to the layers.
31. [Figure only.]
32. Convolution Neural Net (CNN)
• How is a CNN trained differently than a typical back
propagation (BP) network?
– Parts of the training that are the same:
• Present an input record
• Forward pass through the network
• Back propagate error (i.e. per epoch)
– Parts that are different:
• Some connections are CONSTRAINED to the same value
– The connections for the same pattern, sliding over all of the input space
• Error updates are averaged and applied equally to the one set of weight values
• You end up with the same pattern detector feeding many nodes at the next level
http://www.cs.toronto.edu/~rgrosse/icml09-cdbn.pdf
Convolutional Deep Belief Networks for Scalable
Unsupervised Learning of Hierarchical Representations, 2009
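The averaged-update step above can be sketched in toy numpy code (the function name, signal, and error values are illustrative): because one shared kernel feeds every output position, the per-position gradients are averaged into a single update applied to the one set of weight values.

```python
import numpy as np

# Sketch of the CONSTRAINED-weight update: one shared kernel feeds every
# output position, so the per-position gradients are averaged and applied
# once to that single set of weight values. Values here are illustrative.
def shared_weight_update(signal, kernel, errors, lr=0.1):
    """Average the gradient from every shift, apply one update to the kernel."""
    k = len(kernel)
    grads = np.array([errors[i] * signal[i:i + k]   # gradient at each shift
                      for i in range(len(errors))])
    return kernel - lr * grads.mean(axis=0)         # ONE averaged update

signal = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
kernel = np.array([0.5, 0.5, 0.5])
errors = np.array([0.2, -0.1, 0.3])   # back-propagated error per output position

new_kernel = shared_weight_update(signal, kernel, errors)
```

After the update the kernel is still a single pattern detector; it has simply moved in the averaged downhill direction of all the positions it fed.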
33. The Mammalian Visual Cortex is Hierarchical
(The Brain is a Deep Neural Net - Yann LeCun)
http://www.pamitc.org/cvpr15/files/lecun-20150610-cvpr-keynote.pdf
34. Convolution Neural Net (CNN)
Facebook example
https://gigaom.com/2014/03/18/facebook-shows-off-its-deep-learning-skills-with-deepface/
35. Convolution Neural Net (CNN)
Yahoo + Stanford example – find a face in a picture, even upside down
http://www.dailymail.co.uk/sciencetech/article-2958597/Facial-recognition-breakthrough-Deep-Dense-software-spots-faces-images-partially-hidden-UPSIDE-DOWN.html
37. • Big Picture of 2016 Technology
• Neural Net Basics
• Deep Network Configurations for Practical Applications
– Auto-Encoder (i.e. data compression or Principal Components
Analysis)
– Convolutional (shift invariance in time or space for voice, image or IoT)
– Real Time Scoring and Lambda Architecture
– Deep Net libraries and tools (Theano, Torch, TensorFlow, ...
Kamanja)
– Reinforcement Learning, Q-Learning (i.e. beat people at Atari games,
IoT)
– Continuous Space Word Models (i.e. word2vec)
Deep Learning - Outline
38. Real Time Scoring
Neural Net Optimizations
• Auto-Encoding nets
– Can grow to millions of connections, and start to get computationally expensive
– Can reduce connections by 5% to 25+% with pruning & retraining
• Train with increased regularization settings
• Drop connections with near-zero weights, then retrain
• Drop nodes whose fan-in connections aren't used much later,
such as in your predictive problem
• Perform sensitivity analysis – delete possible input fields
• Convolutional Neural Nets
– With large enough data, can even skip the FFT preprocessing step
– Can use wider than 10ms audio sampling rates for speed up
• Implement other preprocessing as lookup tables (i.e. Bayesian
Priors)
• Use cloud computing, do not limit to device computing
• Large models don't fit → use model or data parallelism to train
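The prune-and-retrain step above can be sketched with toy numpy code (the threshold and weight matrix are illustrative, not from the talk): connections whose trained weights are near zero are dropped, shrinking the net before real-time scoring.

```python
import numpy as np

# Sketch of pruning for real-time scoring: drop connections with
# near-zero weights, then retrain the survivors. Threshold is an
# illustrative choice, not a recommendation from the talk.
def prune_weights(weights, threshold=0.05):
    """Zero out (drop) connections whose magnitude is below the threshold."""
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.1, size=(100, 100))  # a toy trained layer

pruned, mask = prune_weights(weights)
kept = mask.mean()   # fraction of connections that survive pruning
# In practice, retrain the surviving weights, then prune again if needed.
```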
46. • Big Picture of 2016 Technology
• Neural Net Basics
• Deep Network Configurations for Practical Applications
– Auto-Encoder (i.e. data compression or Principal Components
Analysis)
– Convolutional (shift invariance in time or space for voice, image or IoT)
– Real Time Scoring and Lambda Architecture
– Deep Net libraries and tools (Theano, Torch, TensorFlow, ...
Kamanja)
– Reinforcement Learning, Q-Learning (i.e. beat people at Atari games,
IoT)
– Continuous Space Word Models (i.e. word2vec)
Deep Learning - Outline
47. Reinforcement Learning (RL)
• Different than supervised and unsupervised learning
• Q) Can the network figure out how to take one or more
actions NOW, to achieve a reward or payout (potentially
far off, i.e. T steps in the FUTURE)?
• Need to solve the credit assignment problem
– There is no teacher and very little labeled data
– Need to learn the best POLICY that will achieve the best outcome
– Assume no knowledge of the process model or reward function
• Next guess =
– Linear combination of ((current guess) and
(the new reward info just collected)), weighted by the learning rate
http://www.humphreysheil.com/blog/gorila-google-reinforcement-learning-architecture
http://robotics.ai.uiuc.edu/~scandido/?Developing_Reinforcement_Learning_from_the_Bellman_Equation
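The "next guess" rule above is the standard tabular Q-learning update, which can be sketched as follows (the two-state table and reward values are illustrative): the new estimate is a linear combination of the current guess and the freshly collected reward information, weighted by the learning rate.

```python
# Sketch of the tabular Q-learning update described above. The grid of
# states/actions and the reward are illustrative toy values.
def q_update(q, state, action, reward, next_state, lr=0.1, gamma=0.9):
    """Q(s,a) <- (1-lr)*Q(s,a) + lr*(reward + gamma * max_a' Q(s',a'))."""
    best_next = max(q[next_state].values())
    q[state][action] = ((1 - lr) * q[state][action]
                        + lr * (reward + gamma * best_next))

# Two states, two actions, all Q values start at zero.
q = {s: {"left": 0.0, "right": 0.0} for s in (0, 1)}
q_update(q, state=0, action="right", reward=1.0, next_state=1)
print(q[0]["right"])   # 0.1 -- credit starts flowing back to earlier actions
```

Repeated updates propagate reward backward through time, which is how the credit assignment problem gets solved without a teacher.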
48. Deep Reinforcement Learning (RL),
Q-Learning
http://www.iclr.cc/lib/exe/fetch.php?media=iclr2015:silver-iclr2015.pdf David Silver, Google DeepMind
https://en.wikipedia.org/wiki/Reinforcement_learning
https://en.wikipedia.org/wiki/Q-learning
Think in terms of IoT….
Device agent measures, infers user’s action
Maximizes future reward, recommends to user or system
49. Deep Reinforcement Learning, Q-Learning
(Think about IoT possibilities)
http://www.iclr.cc/lib/exe/fetch.php?media=iclr2015:silver-iclr2015.pdf
David Silver, Google DeepMind
Use the last 4 screenshots
50. Deep Reinforcement Learning, Q-Learning
http://www.iclr.cc/lib/exe/fetch.php?media=iclr2015:silver-iclr2015.pdf David Silver, Google DeepMind
Use the last 4 screenshots as input
IoT challenge: How to replace game
score with IoT score?
(Output actions: shift right fast, shift right, stay, shift left, shift left fast)
51. Deep Reinforcement Learning, Q-Learning
http://www.iclr.cc/lib/exe/fetch.php?media=iclr2015:silver-iclr2015.pdf David Silver, Google DeepMind
Games with the best Q-learning results:
Video Pinball
Breakout
Star Gunner
Crazy Climber
Gopher
52. • Big Picture of 2016 Technology
• Neural Net Basics
• Deep Network Configurations for Practical Applications
– Auto-Encoder (i.e. data compression or Principal Components
Analysis)
– Convolutional (shift invariance in time or space for voice, image or IoT)
– Real Time Scoring
– Deep Net libraries and tools (Theano, Torch, TensorFlow, ...
Kamanja)
– Reinforcement Learning, Q-Learning (i.e. beat people at Atari games,
IoT)
– Continuous Space Word Models (i.e. word2vec)
Deep Learning - Outline
53. Continuous Space Word Models (word2vec)
• Before (a predictive “Bag of Words” model):
– One row per document, paragraph or web page
– Binary word space: 10k to 200k columns, one per word or phrase
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 …. “This word space model is ….”
– The “bag of words” model relates an input record to a target category
54. Continuous Space Word Models (word2vec)
• Before (a predictive “Bag of Words” model):
– One row per document, paragraph or web page
– Binary word space: 10k to 200k columns, one per word or phrase
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 …. “This word space model is ….”
– The “Bag of words model” relates input record to a target category
• New:
– One row per word (word2vec), possibly per sentence (sent2vec)
– Continuous word space: 100 to 300 columns, continuous values
.01 .05 .02 .00 .00 .68 .01 .01 .35 ... .00 → “King”
.00 .00 .05 .01 .49 .52 .00 .11 .84 ... .01 → “Queen”
– The deep net training resulted in an Emergent Property:
• Numeric geometry location relates to concept space
• “King” – “man” + “woman” = “Queen” (math to change the gender relation)
• “USA” – “Washington DC” + “England” = “London” (math for the capital relation)
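The emergent geometry can be demonstrated with toy hand-made vectors (NOT real word2vec output; the values below are purely illustrative): vector arithmetic plus a nearest-neighbor lookup by cosine similarity recovers the analogy.

```python
import numpy as np

# Toy illustration of "King" - "man" + "woman" = "Queen". These 3-dim
# vectors are hand-made for the demo; real word2vec vectors have 100-300
# dimensions and are learned from text.
vecs = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "apple": np.array([0.0, 0.5, 0.5]),
}

def nearest(target, exclude):
    """Return the word whose vector has the highest cosine similarity."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in vecs if w not in exclude),
               key=lambda w: cos(vecs[w], target))

analogy = vecs["king"] - vecs["man"] + vecs["woman"]
print(nearest(analogy, exclude={"king", "man", "woman"}))  # queen
```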
55. Continuous Space Word Models (word2vec)
How to SCALE to larger vocabularies?
http://www.slideshare.net/hustwj/cikm-keynotenov2014?qid=f92c9e86-feea-41ac-a099-d086efa6fac1&v=default&b=&from_search=2
56. Training Continuous Space Word Models
• How to Train These Models?
– Raw data: “This example sentence shows the word2vec model
training.”
57. Training Continuous Space Word Models
• How to Train These Models?
– Raw data: “This example sentence shows the word2vec model
training.”
– Training data (with target values underlined, and the other words as
input):
“This example sentence shows word2vec” (prune “the”)
“example sentence shows word2vec model”
“sentence shows word2vec model training”
– The context of the 2 to 5 prior and following words predicts the
middle word
– Deep Net model architecture: data compression to 300 continuous
nodes
• 50k binary word input vector → ... → 300 → ... → 50k word target
vector
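Building those (context, target) training pairs from the raw sentence can be sketched as follows (the window size and stop-word pruning mirror the slide; the function name is illustrative): each word's surrounding context words become the input that predicts it.

```python
# Sketch of generating (context, target) training pairs from the raw
# sentence above: neighboring words predict the middle word.
def context_windows(words, window=2):
    """For each word, pair it with up to `window` words on each side."""
    pairs = []
    for i, target in enumerate(words):
        context = words[max(0, i - window):i] + words[i + 1:i + 1 + window]
        pairs.append((context, target))
    return pairs

sentence = "this example sentence shows the word2vec model training".split()
stopwords = {"the"}
words = [w for w in sentence if w not in stopwords]   # prune "the"

for context, target in context_windows(words):
    print(context, "->", target)
```

Feeding each context through the 50k → ... → 300 → ... → 50k architecture, with the target word as the output, is what squeezes the vocabulary into the 300-dimensional continuous space.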
58. Training Continuous Space Word Models
• Use Pre-Trained Models https://code.google.com/p/word2vec/
– Trained on 100 billion words from Google News
– 300 dim vectors for 3 million words and phrases
• Questions on re-use:
– What if I want to train to add client terms or docs?
– What about stability (keeping past training) vs. plasticity (learning
new content)?
59. Training Continuous Space Word Models
http://www.slideshare.net/hustwj/cikm-keynotenov2014?qid=f92c9e86-feea-41ac-a099-d086efa6fac1&v=default&b=&from_search=2
60. Applying Continuous Space Word Models
http://static.googleusercontent.com/media/research.google.com/en//people/jeff/BayLearn2015.pdf
State of the art in machine translation
Sequence to Sequence Learning with Neural Networks, NIPS 2014
Language translation
Document summary
Generate text captions for pictures
61. “Greg’s Guts” on Deep Learning
• Some claim the need for preprocessing and knowledge
representation has ended
– For most of the signal processing applications → yes, simplify
– I am VERY READY TO COMPETE in other applications, continuing to:
• express explicit domain knowledge – using lookup data for context
• optimize business value calculations
• Deep Learning gets big advantages from big data
– Why? It better populates subsets of combinations in high-dimensional
space
– Unsupervised feature extraction reduces need for large labeled data
• However, “regular sized data” gets a big boost as well
– Consider the “ratio of free parameters” (i.e. neurons) to training set records
– For regressions or regular nets, you want 5-10 times as many records
– Regularization and weight dropout reduce this pressure
– Especially when only training “the next auto-encoding layer”
62. Deep Learning Summary – IT'S EXCITING!
• Discussed Deep Learning architectures
– Auto Encoder, convolutional, reinforcement learning, continuous word
• Real Time speed up
– Train model, reduce complexity, retrain
– Simplify preprocessing with lookup tables
– Use cloud computing, do not be limited to device computing
– Lambda architecture like Kamanja, to combine real time and batch
• Applications
– Fraud detection
– Signal Data: IoT, Speech, Images
– Control System models (like Atari game playing, IoT)
– Language Models
https://www.quora.com/Why-is-deep-learning-in-such-demand-now