This document provides an introduction to deep learning. It discusses the history of machine learning and how neural networks work. Specifically, it describes different types of neural networks like deep belief networks, convolutional neural networks, and recurrent neural networks. It also covers applications of deep learning, as well as popular platforms, frameworks and libraries used for deep learning development. Finally, it demonstrates an example of using the Nvidia DIGITS tool to train a convolutional neural network for image classification of car park images.
HML: Historical View and Trends of Deep Learning Yan Xu
The document provides a historical view and trends of deep learning. It discusses that deep learning models have evolved in several waves since the 1940s, with key developments including the backpropagation algorithm in 1986 and deep belief networks with pretraining in 2006. Current trends include growing datasets, increasing numbers of neurons and connections per neuron, and higher accuracy on tasks involving vision, NLP and games. Research trends focus on generative models, domain alignment, meta-learning, using graphs as inputs, and program induction.
Zaikun Xu from the Università della Svizzera Italiana presented this deck at the 2016 Switzerland HPC Conference.
“In the past decade, deep learning, as a life-changing technology, has achieved huge success on various tasks, including image recognition, speech recognition, and machine translation. Pioneered by several research groups, including Geoffrey Hinton (U Toronto), Yoshua Bengio (U Montreal), Yann LeCun (NYU), and Juergen Schmidhuber (IDSIA, Switzerland), deep learning is a renaissance of neural networks in the big data era.
A neural network is a learning algorithm that consists of an input layer, hidden layers, and an output layer, where each circle represents a neuron and each arrow represents a connection associated with a weight. A neural network learns by measuring the difference between the output of the output layer and the ground truth, then calculating the gradients of this discrepancy with respect to the weights and adjusting the weights accordingly. Ideally, it will find weights that map input X to target y with as low an error as possible.”
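The learning loop in the quoted abstract above (compare the output with the ground truth, compute the gradients with respect to the weights, adjust the weights) can be sketched in plain Python. This is a minimal illustration and not code from the talk: the network size, the tanh hidden units, the squared-error loss, the learning rate, and the XOR toy data are all assumptions chosen for brevity, and the names `forward` and `train` are hypothetical.

```python
import math
import random


def forward(x, W1, b1, W2, b2):
    """Return the hidden activations and the scalar output for one input."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sum(w * hi for w, hi in zip(W2, h)) + b2
    return h, y


def train(data, hidden=3, lr=0.1, steps=2000, seed=0):
    """Full-batch gradient descent on the mean squared error."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    losses = []
    for _ in range(steps):
        gW1 = [[0.0] * n_in for _ in range(hidden)]
        gb1 = [0.0] * hidden
        gW2 = [0.0] * hidden
        gb2 = 0.0
        loss = 0.0
        for x, target in data:
            h, y = forward(x, W1, b1, W2, b2)
            err = y - target                  # discrepancy with ground truth
            loss += err * err / len(data)
            d_y = 2.0 * err / len(data)       # d(loss)/d(output)
            gb2 += d_y
            for j in range(hidden):
                gW2[j] += d_y * h[j]
                # Chain rule through the tanh hidden unit.
                d_pre = d_y * W2[j] * (1.0 - h[j] ** 2)
                gb1[j] += d_pre
                for i in range(n_in):
                    gW1[j][i] += d_pre * x[i]
        losses.append(loss)                   # loss before this update
        # Adjust every weight against its gradient.
        b2 -= lr * gb2
        for j in range(hidden):
            W2[j] -= lr * gW2[j]
            b1[j] -= lr * gb1[j]
            for i in range(n_in):
                W1[j][i] -= lr * gW1[j][i]
    return (W1, b1, W2, b2), losses


# Toy data (assumed for illustration): the XOR function.
xor_data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
            ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
params, losses = train(xor_data)
print("first loss:", losses[0], "last loss:", losses[-1])
```

The gradients are accumulated over all four examples before each update, so the loss recorded at each step is the loss of the weights that produced those gradients.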
Watch the video presentation: http://insidehpc.com/2016/03/deep-learning/
See more talks in the Swiss Conference Video Gallery: http://insidehpc.com/2016-swiss-hpc-conference/
This document provides an introduction to deep learning. It defines artificial intelligence, machine learning, data science, and deep learning. Machine learning is a subfield of AI that gives machines the ability to improve performance over time without explicit human intervention. Deep learning is a subfield of machine learning that builds artificial neural networks using multiple hidden layers, like the human brain. Popular deep learning techniques include convolutional neural networks, recurrent neural networks, and autoencoders. The document discusses key components and hyperparameters of deep learning models.
Tijmen Blankenvoort, co-founder Scyfer BV, presentation at Artificial Intelligence Meetup 15-1-2014. Introduction into Neural Networks and Deep Learning.
MDEC Data Matters Series: machine learning and Deep Learning, A Primer Poo Kuan Hoong
The document provides an overview of machine learning and deep learning. It discusses the history and development of neural networks, including deep belief networks, convolutional neural networks, and recurrent neural networks. Applications of deep learning in areas like computer vision, natural language processing, and robotics are also covered. Finally, popular platforms, frameworks and libraries for developing deep learning models are presented, along with examples of pre-trained models that are available.
Deep Learning And Business Models (VNITC 2015-09-13) Ha Phuong
Deep Learning and Business Models
Tran Quoc Hoan discusses deep learning and its applications, as well as potential business models. Deep learning has led to significant improvements in areas like image and speech recognition compared to traditional machine learning. Some business models highlighted include developing deep learning frameworks, building hardware optimized for deep learning, using deep learning for IoT applications, and providing deep learning APIs and services. Deep learning shows promise across many sectors but also faces challenges in fully realizing its potential.
Deep learning is a type of machine learning that uses neural networks with multiple layers between the input and output layers. It allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. Deep learning has achieved great success in computer vision, speech recognition, and natural language processing due to recent advances in algorithms, computing power, and the availability of large datasets. Deep learning models can learn complex patterns directly from large amounts of unlabeled data without relying on human-engineered features.
The document provides an introduction to hierarchical temporal memory (HTM) theory and technology. It begins with a disclaimer noting that the speaker has limited knowledge and requests feedback to correct any mistakes. It then presents points related to cricket to introduce HTM concepts. The document discusses HTM algorithms, networks, properties of problems well-suited to HTM, and existing tools like NuPIC for applying HTM approaches.
Introduction to Deep Learning for Non-Programmers Oswald Campesato
This session provides a brief history of AI, followed by AI-related topics, such as robots in AI, Machine Learning and Deep Learning, use cases for AI, some of the successes of AI, and also some of the significant challenges in AI. You will also learn about AI and mobile devices and the ethics of AI. An avid interest is recommended to derive the maximum benefit from this session.
“Automatically learning multiple levels of representations of the underlying distribution of the data to be modelled”
Deep learning algorithms have shown superior learning and classification performance in areas such as transfer learning, speech and handwritten character recognition, and face recognition, among others.
(I have referred to many articles and experimental results provided by Stanford University.)
About 30 years ago, AI was not only a topic for science-fiction writers but also a major research field surrounded by huge hopes and investments. But the over-inflated expectations ended in a crash, followed by a period of absent funding and interest: the so-called AI winter. However, the last 3 years changed everything again. Deep learning, a machine learning technique inspired by the human brain, crushed one benchmark after another, and tech companies like Google, Facebook and Microsoft started to invest billions in AI research. “The pace of progress in artificial general intelligence is incredibly fast” (Elon Musk, CEO Tesla & SpaceX), leading to an AI that “would be either the best or the worst thing ever to happen to humanity” (Stephen Hawking, physicist).
What sparked this new hype? How is Deep Learning different from previous approaches? Are the advancing AI technologies really a threat to humanity? Let’s look behind the curtain and unravel the reality. This talk will explore why Sundar Pichai (CEO Google) recently announced that “machine learning is a core transformative way by which Google is rethinking everything they are doing” and explain why “Deep Learning is probably one of the most exciting things that is happening in the computer industry” (Jen-Hsun Huang, CEO NVIDIA).
Either a new AI “winter is coming” (Ned Stark, House Stark) or this new wave of innovation might turn out to be the “last invention humans ever need to make” (Nick Bostrom, AI philosopher). Or maybe it’s just another great technology helping humans to achieve more.
Deep learning uses neural networks, which are systems inspired by the human brain. Neural networks learn patterns from large amounts of data through forward and backpropagation. They are constructed of layers including an input layer, hidden layers, and an output layer. Deep learning can learn very complex patterns and has various applications including image classification, machine translation, and more. Recurrent neural networks are useful for sequential data like text and audio. Convolutional neural networks are widely used in computer vision tasks.
The document discusses deep learning and learning hierarchical representations. It makes three key points:
1. Deep learning involves learning multiple levels of representations or features from raw input in a hierarchical manner, unlike traditional machine learning which uses engineered features.
2. Learning hierarchical representations is important because natural data lies on low-dimensional manifolds and disentangling the factors of variation can lead to more robust features.
3. Architectures for deep learning involve multiple levels of non-linear feature transformations followed by pooling to build increasingly abstract representations at each level. This allows the representations to become more invariant and disentangled.
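The third point above (a non-linear feature transformation followed by pooling) can be made concrete with a one-dimensional toy stage in plain Python. This is only a sketch under stated assumptions: the function names are hypothetical, the edge-detecting kernel is hand-picked rather than learned, and real architectures stack many such stages on multi-channel inputs.

```python
def conv1d(signal, kernel):
    """Valid cross-correlation, the operation deep learning libraries
    usually call convolution."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Element-wise non-linearity: negative responses are set to zero."""
    return [max(0.0, v) for v in values]


def max_pool(values, size):
    """Keep the largest response in each non-overlapping window."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def stage(signal, kernel, pool_size):
    """One feature stage: linear filter, non-linearity, then pooling."""
    return max_pool(relu(conv1d(signal, kernel)), pool_size)


rising_edge = [-1, 1]                      # responds to a 0 -> 1 step
signal = [0, 0, 1, 1, 0, 0, 1, 1, 0]
shifted = [0, 1, 1, 0, 0, 1, 1, 0, 0]      # same pattern, one step earlier

print(stage(signal, rising_edge, 2))
print(stage(shifted, rising_edge, 2))
```

Both calls produce the same pooled features even though the input pattern moved by one position, which is the kind of invariance the pooling step is meant to provide.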
Deep Learning is an area of machine learning and one of the most talked-about trends in business and computer science today.
In this talk, I will give a review of Deep Learning explaining what it is, what kinds of tasks it can do today, and what it probably could do in the future.
Deep learning is a type of machine learning that uses neural networks inspired by the human brain. It has been successfully applied to problems like image recognition, speech recognition, and natural language processing. Deep learning requires large datasets, clear goals, computing power, and neural network architectures. Popular deep learning models include convolutional neural networks and recurrent neural networks. Researchers like Geoffrey Hinton and companies like Google have advanced the field through innovations that have won image recognition challenges. Deep learning will continue solving harder artificial intelligence problems by learning from massive amounts of data.
Deep Learning - Overview of my work II Mohamed Loey
Deep Learning, Machine Learning, MNIST, CIFAR-10, Residual Network, AlexNet, VGGNet, GoogLeNet, Nvidia
Deep learning (DL) is a hierarchically structured network that simulates the structure of the human brain to extract features from internal and external input data.
The document provides an overview of deep learning examples and applications including computer vision tasks like image classification and object detection from images, speech recognition from audio, and natural language processing on text. It then discusses common deep learning network structures like convolutional neural networks and how they are applied to tasks like handwritten digit recognition. Finally, it outlines Intel's portfolio of AI tools and libraries for deep learning including frameworks, libraries, and hardware.
The document provides an overview of deep learning and reinforcement learning. It discusses the current state of artificial intelligence and machine learning, including how deep learning algorithms have achieved human-level performance in various tasks such as image recognition and generation. Reinforcement learning is introduced as learning through trial-and-error interactions with an environment to maximize rewards. Examples are given of reinforcement learning algorithms solving tasks like playing Atari games.
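The trial-and-error loop described above can be illustrated with tabular Q-learning on a tiny corridor. This is a simplified stand-in, not the method from the slides: systems that play Atari games replace the table with a deep network, and the corridor environment, reward scheme, and hyperparameters below are assumptions made for the example.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.2, seed=0):
    """Learn action values for a corridor whose last state is the goal.

    Actions: 0 moves left (stopping at the wall), 1 moves right.
    The agent earns reward 1.0 on reaching the goal and 0.0 otherwise.
    """
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):                      # cap the episode length
            explore = rng.random() < epsilon
            if explore or Q[state][0] == Q[state][1]:
                action = rng.randrange(2)         # try something at random
            else:
                action = 0 if Q[state][0] > Q[state][1] else 1
            nxt = max(0, state - 1) if action == 0 else state + 1
            done = nxt == goal
            reward = 1.0 if done else 0.0
            # Move the estimate toward reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(Q[nxt]))
            Q[state][action] += alpha * (target - Q[state][action])
            state = nxt
            if done:
                break
    return Q


Q = q_learning()
policy = ["right" if q[1] > q[0] else "left" for q in Q[:-1]]
print(policy)
```

The agent is never told that moving right is correct; the preference emerges because actions that eventually lead to the reward accumulate higher estimated value.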
From Conventional Machine Learning to Deep Learning and Beyond.pptx Chun-Hao Chang
In these slides, Deep Learning is compared with Conventional Learning, and the strengths of DNN models are explained.
The target audience is people who have knowledge of Machine Learning or Data Mining but are not familiar with Deep Learning.
This document provides an introduction to deep learning. It begins by discussing modeling human intelligence with machines and the history of neural networks. It then covers concepts like supervised learning, loss functions, and gradient descent. Deep learning frameworks like Theano, Caffe, Keras, and Torch are also introduced. The document provides examples of deep learning applications and discusses challenges for the future of the field like understanding videos and text. Code snippets demonstrate basic network architecture.
Slides from Portland Machine Learning meetup, April 13th.
Abstract: You've heard all the cool tech companies are using them, but what are Convolutional Neural Networks (CNNs) good for and what is convolution anyway? For that matter, what is a Neural Network? This talk will include a look at some applications of CNNs, an explanation of how CNNs work, and what the different layers in a CNN do. There's no explicit background required so if you have no idea what a neural network is that's ok.
Deep Neural Networks that talk (Back)… with style Roelof Pieters
Talk at Nuclai 2016 in Vienna
Can neural networks sing, dance, remix and rhyme? And most importantly, can they talk back? This talk will introduce Deep Neural Nets with textual and auditory understanding and some of the recent breakthroughs made in these fields. It will then show some of the exciting possibilities these technologies hold for "creative" use and explorations of human-machine interaction, where the main theorem is "augmentation, not automation".
http://events.nucl.ai/track/cognitive/#deep-neural-networks-that-talk-back-with-style
Sparse Distributed Representations: Our Brain's Data Structure Numenta
This document discusses sparse distributed representations (SDRs), which are theorized to be the common data structure used in the cortex. SDRs have several key properties, including extremely high storage capacity, robustness to noise and random deletions, ability to represent multiple patterns in a single structure, and enabling highly efficient computations. Analysis of SDRs can provide a foundation for understanding cortical computing and functions like perception, planning, and attention. The document outlines fundamental attributes and analysis of error bounds for SDR matching, unions, and other operations.
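The SDR operations named above (matching, unions, robustness to noise) can be sketched with Python sets, where each SDR is stored as the set of indices of its active bits. This is an illustrative sketch and not Numenta's implementation: the helper names are hypothetical, and the vector size, number of active bits, and match threshold are assumed values.

```python
import random


def random_sdr(n, w, rng):
    """An SDR with w active bits out of n, stored as a set of indices."""
    return frozenset(rng.sample(range(n), w))


def overlap(a, b):
    """Number of active bits the two SDRs share."""
    return len(a & b)


def matches(a, b, theta):
    """Two SDRs match when their overlap reaches the threshold theta."""
    return overlap(a, b) >= theta


def add_noise(sdr, n, flips, rng):
    """Move `flips` active bits to positions that were inactive."""
    kept = set(rng.sample(sorted(sdr), len(sdr) - flips))
    inactive = [i for i in range(n) if i not in sdr]
    return frozenset(kept | set(rng.sample(inactive, flips)))


def union(sdrs):
    """A single structure that represents several patterns at once."""
    return frozenset().union(*sdrs)


rng = random.Random(0)
N, W, THETA = 2048, 40, 20                 # assumed sizes for illustration
a = random_sdr(N, W, rng)
b = random_sdr(N, W, rng)
noisy_a = add_noise(a, N, 10, rng)         # a quarter of the bits corrupted

print(overlap(a, noisy_a), matches(a, noisy_a, THETA))
print(overlap(a, b), matches(a, b, THETA))
print(a <= union([a, b]))
```

Because only a small fraction of bits is active, two unrelated SDRs share very few active bits, so a heavily corrupted copy still matches its original while a random SDR almost never does.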
"You Can Do It" by Louis Monier (Altavista Co-Founder & CTO) & Gregory Renard (CTO & Artificial Intelligence Lead Architect at Xbrain) for Deep Learning keynote #0 at Holberton School (http://www.meetup.com/Holberton-School/events/228364522/)
If you want to attend similar keynotes for free, check out http://www.meetup.com/Holberton-School/
This document provides an introduction to deep learning, including key developments in neural networks from the discovery of the neuron model in 1899 to modern networks with over 100 million parameters. It summarizes influential deep learning models, such as AlexNet from 2012 and ZF Net and GoogLeNet from 2013-2015, that helped reduce error rates on the ImageNet challenge. Top AI scientists who have contributed significantly to deep learning research are also mentioned. Common activation functions, convolutional neural networks, and deconvolution are briefly explained with examples.
This document provides biographical information about Şaban Dalaman and summaries of key concepts in artificial intelligence and machine learning. It summarizes Şaban Dalaman's educational and professional background, then discusses Alan Turing's universal machine concept, the 1956 Dartmouth workshop proposal that helped define the field of AI, and definitions of AI, machine learning, deep learning, and data science. It also lists different tribes and algorithms within machine learning.
Deep learning is introduced along with its applications and key players in the field. The document discusses the problem space of inputs and outputs for deep learning systems. It describes what deep learning is, providing definitions and explaining the rise of neural networks. Key deep learning architectures like convolutional neural networks are overviewed along with a brief history and motivations for deep learning.
Large Scale Data Mining using Genetics-Based Machine LearningXavier Llorà
We are living in the petabyte era. We have larger and larger data to analyze, process and transform into useful answers for the domain experts. Robust data mining tools, able to cope with petascale volumes and/or high dimensionality while producing human-understandable solutions, are key in several domain areas. Genetics-based machine learning (GBML) techniques are perfect candidates for this task, among others, due to the recent advances in representations, learning paradigms, and theoretical modeling. If evolutionary learning techniques aspire to be a relevant player in this context, they need to have the capacity to process these vast amounts of data within reasonable time. Moreover, massive computation cycles are getting cheaper and cheaper every day, allowing researchers to have access to unprecedented degrees of parallelization. Several topics are interlaced in these two requirements, among them: (1) having the proper learning paradigms and knowledge representations, (2) understanding them and knowing when they are suitable for the problem at hand, (3) using efficiency enhancement techniques, and (4) transforming and visualizing the produced solutions to give back as much insight as possible to the domain experts.
This tutorial will try to answer this question, following a roadmap that starts with the questions of what large means, and why large is a challenge for GBML methods. Afterwards, we will discuss different facets in which we can overcome this challenge: Efficiency enhancement techniques, representations able to cope with large dimensionality spaces, scalability of learning paradigms. We will also review a topic interlaced with all of them: how can we model the scalability of the components of our GBML systems to better engineer them to get the best performance out of them for large datasets. The roadmap continues with examples of real applications of GBML systems and finishes with an analysis of further directions.
This talk was presented in Startup Master Class 2017 - http://aaiitkblr.org/smc/ 2017 @ Christ College Bangalore. Hosted by IIT Kanpur Alumni Association and co-presented by IIT KGP Alumni Association, IITACB, PanIIT, IIMA and IIMB alumni.
My co-presenter was Biswa Gourav Singh. And contributor was Navin Manaswi.
http://dataconomy.com/2017/04/history-neural-networks/ - timeline for neural networks
This document provides an introduction to deep learning. It begins with an overview of artificial intelligence techniques like computer vision, speech processing, and natural language processing that benefit from deep learning. It then reviews the history of deep learning algorithms from perceptrons to modern deep neural networks. The core concepts of deep learning processes, neural network architectures, and training techniques like backpropagation are explained. Popular deep learning frameworks like TensorFlow, Keras, and PyTorch are also introduced. Finally, examples of convolutional neural networks, recurrent neural networks, and generative adversarial networks are briefly described along with tips for training deep neural networks and resources for further learning.
The document provides an overview of a course on machine learning, including defining machine learning and artificial intelligence, discussing different applications of machine learning such as speech recognition, robotics, and computer vision, and outlining the topics that will be covered in the course such as classifiers, regression, neural networks, and learning theory. The course aims to provide students with the tools and foundations of machine learning including optimization, statistics, and computer science to solve problems in areas like natural language processing, computer vision, robotics, and medicine.
The field of Artificial Intelligence (AI) has been revitalized in this decade, primarily due to the large-scale application of Deep Learning (DL) and other Machine Learning (ML) algorithms. This has been most evident in applications like computer vision, natural language processing, and game bots. However, extraordinary successes within a short period of time have also had the unintended consequence of causing a sharp difference of opinion in research and industrial communities regarding the capabilities and limitations of deep learning. A few questions you might have heard being asked (or asked yourself) include:
a. We don’t know how Deep Neural Networks make decisions, so can we trust them?
b. Can Deep Learning deal with highly non-linear continuous systems with millions of variables?
c. Can Deep Learning solve the Artificial General Intelligence problem?
The goal of this seminar is to provide a 1000-feet view of Deep Learning and hopefully answer the questions above. The seminar will touch upon the evolution, current state of the art, and peculiarities of Deep Learning, and share thoughts on using Deep Learning as a tool for developing power system solutions.
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaSpark Summit
Deep recurrent neural networks are well-suited for sequence learning tasks like text classification and generation. The author discusses implementing recurrent neural networks in Spark for distributed deep learning on big data. Two use cases are described: predictive maintenance using sensor data to detect failures, and sentiment analysis of tweets using RNNs which achieve better accuracy than traditional classifiers.
Yuwei Cui from Numenta presented on real-time streaming data analysis using Hierarchical Temporal Memory (HTM). HTM is based on principles of the neocortex and allows for online learning of high-order sequences from streaming data. HTM can make multiple predictions simultaneously and is fault tolerant. It has been applied successfully to problems like anomaly detection in data center servers and geospatial tracking data. Numenta is working to further understand the neocortex and create more biologically accurate models to continue advancing machine intelligence.
Deep Learning - The Past, Present and Future of Artificial IntelligenceLukas Masuch
The document provides an overview of deep learning, including its history, key concepts, applications, and recent advances. It discusses the evolution of deep learning techniques like convolutional neural networks, recurrent neural networks, generative adversarial networks, and their applications in computer vision, natural language processing, and games. Examples include deep learning for image recognition, generation, segmentation, captioning, and more.
How Can Machine Learning Help Your Research Forward?Wouter Deconinck
Machine learning is a buzzword that conjures up visions of programming gurus and data magicians solving problems with little effort, while others balk at the black-box nature and lack of first-principles understanding. In this talk I hope to introduce some ways in which you can start to use powerful machine learning algorithms to solve certain classes of problems in ways that may be more generic than traditional approaches. I will use examples from a range of fields to demonstrate the power of machine learning, even though those fields with access to large data sets have led the charge. I will highlight differences between machine learning in physics and other data sciences. Finally, I will point out why a solid understanding of the underlying physical principles is a necessity to use machine learning in research with any success.
Data Science, Machine Learning and Neural NetworksBICA Labs
Lecture briefly overviewing state of the art of Data Science, Machine Learning and Neural Networks. Covers main Artificial Intelligence technologies, Data Science algorithms, Neural network architectures and cloud computing facilities enabling the whole stack.
Deep learning and Watson Studio can be used for various tasks including planet discoveries, particle physics experiments at CERN, and scientific publications analysis. Convolutional neural networks are commonly used for image-related tasks like cancer diagnosis, object detection, and style transfer, while recurrent neural networks with LSTM or GRU are useful for sequential data like text for machine translation, sentiment analysis, and music generation. Hybrid and complex models combine different neural network architectures for tasks such as named entity recognition, music generation, blockchain security, and lip reading. Deep learning is now implemented using frameworks like TensorFlow and Keras on GPUs and distributed systems. Transfer learning helps accelerate development by reusing pre-trained models. Watson Studio provides a platform for developing, testing, and deploy
Scene classification using Convolutional Neural Networks - Jayani WithanawasamWithTheBest
The document discusses scene classification using convolutional neural networks (CNNs). It begins with an outline of the topic, then provides background on computer vision as an AI problem and the importance and challenges of scene classification. It introduces CNNs as a deep learning technique for visual pattern recognition, describing their hierarchical organization and components like convolution and pooling layers. The document also discusses traditional machine learning approaches versus deep learning for scene classification and frameworks like Caffe that can be used to implement CNNs.
This document provides an overview of machine learning concepts and techniques including linear regression, logistic regression, unsupervised learning, and k-means clustering. It discusses how machine learning involves using data to train models that can then be used to make predictions on new data. Key machine learning types covered are supervised learning (regression, classification), unsupervised learning (clustering), and reinforcement learning. Example machine learning applications are also mentioned such as spam filtering, recommender systems, and autonomous vehicles.
Machine learning and deep learning techniques can be used to analyze diverse types of data such as images, text, signals and more. Deep learning uses neural networks to learn directly from raw data, enabling applications like object recognition, speech recognition, and analyzing time series signals. Deep learning has become popular due to labeled public datasets, increased GPU acceleration, and pre-trained models that provide a starting point for new problems.
AI & ML in Defence Systems - Sunil ChomalSunil Chomal
Talk on Artificial Intelligence & Machine Learning in Defense Systems at ‘Tutorial cum workshop on AI&ML’ organized by IEEE Bombay Section in collaboration with the India Council during August 10-11, 2018.
This document proposes a local receptive fields based extreme learning machine (ELM-LRF) for face recognition. ELM-LRF introduces local receptive fields to the input layer of an ELM for locally connected neural networks. It is tested on three face datasets: Caltech, CBCL, and UFI. ELM-LRF achieves high testing accuracies of 98.15%, 98.34%, and 66.11% respectively, outperforming other methods. The key advantages of ELM-LRF are that it reduces training time, provides fast results with no risk of getting stuck in local minima like backpropagation algorithms.
Deep Learning: concepts and use cases (October 2018)Julien SIMON
An introduction to Deep Learning theory
Neurons & Neural Networks
The Training Process
Backpropagation
Optimizers
Common network architectures and use cases
Convolutional Neural Networks
Recurrent Neural Networks
Long Short Term Memory Networks
Generative Adversarial Networks
Getting started
Similar to Big Sky Earth 2018 Introduction to machine learning (20)
2. Menu
• What is Machine Learning?
• Where does it come from?
• What now?
• Why now?
• Machine Learning in the Sky
• The Machine Learning Landscape
• Machine Learning Pipeline
• Neural Networks
• Some useful concepts
• Machine Learning Tools
• Zoom on selected libraries
• Zoom on a few algorithms
• Random Forest
• Gradient Boosting
• Kohonen’s map
• Autoencoder
• Convolutional Neural Network
• Generative Adversarial Network
4. What is Machine Learning?
Not quite so exciting… Learning? NO! Nor thinking…
More like algorithms that can fit complex data relationships
More like advanced statistical inference
More like implicit programming
More like extracting information dynamic from data for
generalization…
Sobering thought: linear regression belongs to Machine Learning!
That said, some mimicking taking place:
➢ Trying to improve a system’s response to novel perception thanks
to experience
➢ Human biology inspired artificial neural networks
5. What is Machine Learning?
Artificial Intelligence
Machine Learning
Neural Networks
Deep Learning
6. What is Machine Learning?
• “Machine Learning at its most basic is the practice of using algorithms
to parse data, learn from it, and then make a determination or
prediction about something in the world.” – Nvidia
• “Machine learning is the science of getting computers to act without
being explicitly programmed.” – Stanford
• “Machine learning is based on algorithms that can learn from data
without relying on rules-based programming.”- McKinsey & Co.
• “Machine learning algorithms can figure out how to perform
important tasks by generalizing from examples.” – University of
Washington
• “The field of Machine Learning seeks to answer the question “How can
we build computer systems that automatically improve with
experience, and what are the fundamental laws that govern all
learning processes?” – Carnegie Mellon University
Source: https://www.techemergence.com/what-is-machine-learning/
7. What is Machine Learning?
To summarize:
A set of computing and mathematical techniques
whose aim is to achieve human-level or better-than-
human performance at cognitive tasks such as:
•Predicting
•Classifying
•Generating signals / interacting
•Etc.
Source: https://www.techemergence.com/what-is-machine-learning/
9. Differences between ML and Statistical
Modeling
Statistical Modeling Machine Learning
Parametric models that try to
“explain” the world. The focus is on
modeling causality
Non-parametric models that try to
“mimic” the world rather than
“explain” it. Often uses correlations as
proxies to causality
Deduce relations for observed
quantities by parameter estimation for
a pre-specified model of the world
Induce relations between observable
quantities, main goal is predictive
power
Small data (1-100 attributes,100-
1000 examples)
Large data (10-100K attributes, 1K-
100M examples)
Scalability is typically not the major
concern
Scalability is often critical in
applications
Based on a probabilistic approach
Some ML methods are not
probabilistic (SVM, neural networks,
clustering, etc.)
11. Where does it come from? Pioneer age
1943 – McCulloch-Pitts neurons (neuro-scientist and logician)
1950 - Alan Turing envisioned ML
1952 – Arthur Samuel self-improving checkers program
1957 – Frank Rosenblatt, perceptron
1959 – David H. Hubel and Torsten Wiesel simple vs complex cells
1960 – Henry J. Kelley Control Theory Backpropagation
1965 – Alexey Ivakhnenko and V.G. Lapa Group Method of Data
Handling, 8-layer DNN
1980 – Kunihiko Fukushima Neocognitron (pattern recog’), led to CNN
1982 – John Hopfield, Hopfield Network, RNN
1985 – Terry Sejnowski NETtalk, English pronunciation
1986 – David Rumelhart, Geoffrey Hinton and Ronald J. Williams,
backpropagation
1989 – Yann LeCun, handwritten digits with CNN
1989 – Christopher Watkins, Q-learning for Reinforcement Learning
Source: https://www.import.io/post/history-of-deep-learning/
12. What now? Modern days
1993 – Jürgen Schmidhuber, 1000-layer RNN
1995 – Corinna Cortes and Vladimir Vapnik, SVM
1997 - Jürgen Schmidhuber and Sepp Hochreiter, LSTM
1997 – IBM’s Deep Blue beat Garry Kasparov
1998 – Yann LeCun, stochastic gradient descent
2009 – Fei-Fei Li, ImageNet
2011 – Alex Krizhevsky, AlexNet CNN
2011 – IBM’s Watson wins Jeopardy
2012 – AlexNet wins the ImageNet challenge by a large margin over earlier methods
2014 – Facebook’s DeepFace
2014 – Ian Goodfellow, Generative Adversarial Network
2016-2017 - Google TensorFlow v1.0 in open source
Source: https://www.import.io/post/history-of-deep-learning/
13. What now? Modern days
Source: https://www.import.io/post/history-of-deep-learning/
15. Why now?
Trillion-fold increase of computing power and storage
Source: http://www.visualcapitalist.com/visualizing-trillion-fold-increase-computing-power/
18. Pause for thought: Artificial vs Natural Intelligence
Name # of neurons / # of synapses Visuals
Caenorhabditis elegans 302
Hydra vulgaris 5,600
Homarus americanus 100,000
Blatta Orientalis 1,000,000
Nile Crocodile 80,500,000
Digital Reasoning NN (2015) ~86,000,000 (est.) / 1.6E11
Rattus Rattatouillensis 200,000,000
Blue and yellow macaw 1,900,000,000
Chimpanzee 28,000,000,000
Homo Sapiens Sapiens 86,000,000,000 / 1.5E14
African Elephant 257,000,000,000
Source: https://en.wikipedia.org/wiki/List_of_animals_by_number_of_neurons
19. Machine Learning in the Sky
• Machine Learning owes a lot to astronomy: least-squares regression
for orbital parameter estimation (Legendre-Laplace-Gauss)
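As a reminder of how small that ancestor of Machine Learning is, here is a minimal sketch of a least-squares straight-line fit in plain Python. The function name `least_squares_line` and the data points are made up for illustration; real orbital fits involve more parameters and non-linear models.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution of the normal equations for one feature
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up observations lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = least_squares_line(xs, ys)
print(slope, intercept)
```

The two returned numbers are the "parameters learned from the data" in the vocabulary used later in this deck.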
20. Machine Learning in the Sky
Data Big Bang in Astronomy too:
10^9 object photometric catalogs from USNO, 2MASS, SDSS…
10^6-10^8 spectroscopic catalogs from SDSS, LAMOST…
10^6-10^7 multi-wavelength source catalogs from WISE, eROSITA…
10^9 object x 10^2 epochs surveys like LSST, DES, PTF, CRTS, SNF, VVV, Pan-STARRS, Stripe 82
Spectral-image datacubes from VLA, ALMA, IFUs…
21. Machine Learning in the sky
Supernovae of data
Sources: Computer World, https://www.lsst.org/scientists/keynumbers
LSST
DR11: 37 x 10^9 objects, 7 x 10^12 sources, 5.5 million 3.2-gigapixel images
30 terabytes of data nightly
Final volume of raw image data = 60 PB
Final image collection (DR11) = 0.5 EB
Final catalog size (DR11) = 15 PB
Final disk storage = 0.4 Exabytes
Peak number of nodes = 1750 nodes
Peak compute power in LSST data centers = about 2 PFLOPS
22. Machine Learning in the Sky
Explosion in number of papers too:
• From 2010 till 2018, 446 astro-ph papers on arXiv with “Machine
Learning” in the abstracts
• Only 5 papers in 2010
• 80% of the total were published after September 2014
• In all fields of astrophysics
23. The Machine Learning landscape
Supervised Learning
• Regression: learn a real-valued function given (Xi, Yi); R^n → R
• Classification: learn a discrete class function given (Xi, Ci); R^n → [1,k]
Unsupervised Learning
• Clustering: learn a discrete class function given (Xi) only; R^n → [1,k]
• Representation Learning: learn a representing function given (Xi) only; R^n → R^k
24. The Machine Learning landscape
Reinforcement Learning
• Policy Optimization: learn a policy function given (si, si+1, ai, ri); R^n → R^k
• Inverse RL: learn a reward function given (si, si+1, ai); R^n → R
Additional categories
• Transfer learning
• Semi-supervised learning
• Active learning
• Sequence modeling
• RL methods for Supervised and Unsupervised Learning
25. The Machine Learning landscape
Supervised Learning
• Regression: Linear Regression, Trees / CART, SVM/SVR, Ensemble methods, Neural Networks
• Classification: Logistic Regression, Naive Bayes, Nearest neighbors, SVM, Decision trees, Ensemble methods
Unsupervised Learning
• Clustering: K-means, Hierarchical clustering, Gaussian mixtures, Hidden Markov, NN (SOM/ART)
• Representation Learning: PCA/ICA, Factor models, Dim. reduction, Manifold learning, NN (GAN/VAE/AR)
26. Overview of the Machine Learning landscape
Reinforcement Learning
• Policy Optimization: Model-based RL, Model-free RL, Batch/online RL, Linear models RL, Neural Networks
• Inverse RL: Model-based IRL, Model-free IRL, Batch/online IRL, MaxEnt IRL, Neural networks
• Neural networks are the most universal (and scalable) approach
• Two types of methods tend to dominate Kaggle competitions:
• Ensemble methods (Random Forests and Gradient Boosting)
• Deep Learning
30. Machine Learning Pipeline
Data Preparation: Raw Dataset → Load Data → Explore Data → Clean Data → Normalize → Select Features → Prepared Data
Training: ML Algorithms → Apply Algorithm → Evaluate & Tune
Deploy Model → Publish!
31. Machine Learning: neural networks
Single neuron: computation structure inspired by nature
Inputs: x1, x2, …, xi, …, xn
Weights: w1, w2, …, wi, …, wn
Activation: $a = \sum_i w_i x_i$
Activation function: g
Output: $z = g(a)$
If g = identity or sigmoid:
Linear/logistic regression
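The single neuron above can be sketched in a few lines of plain Python. This is an illustration only: the names `neuron`, `identity` and `sigmoid` and the numbers are made up, and, like the slide's formula, the sketch leaves out a bias term.

```python
import math


def identity(a):
    return a


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def neuron(x, w, g):
    """Weighted sum of the inputs followed by the activation function g."""
    a = sum(wi * xi for wi, xi in zip(w, x))
    return g(a)


x = [1.0, 2.0, 3.0]    # made-up inputs
w = [0.5, -1.0, 0.5]   # made-up weights; the weighted sum is 0.5 - 2.0 + 1.5 = 0.0

z_linear = neuron(x, w, identity)    # linear-regression style output
z_logistic = neuron(x, w, sigmoid)   # logistic-regression style output in (0, 1)
print(z_linear, z_logistic)
```

Swapping `g` is the only change between the two regressions, which is the point the slide makes.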
34. Machine Learning: neural networks
Pick activation functions adapted to desired output
For multi-class output, choose Softmax function:
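The slide's own formula is not in this transcript. For reference, the standard softmax definition turns a list of scores a_1 … a_K into probabilities exp(a_k) / Σ_j exp(a_j). A minimal sketch in plain Python, where subtracting the maximum score is a common numerical-stability trick rather than something taken from the slides:

```python
import math


def softmax(scores):
    """Map real-valued scores to positive values that sum to 1."""
    shift = max(scores)  # does not change the result, avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])   # made-up scores for three classes
print(probs)
```

The largest score gets the largest probability, and the class with the highest probability is taken as the prediction.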
35. Machine Learning: Deep Learning
POWERFUL CPU/GPU x BIG DATA => LEVERAGE ALGORITHMS
Size matters!
Not deep Deep
Try playground.tensorflow.org
36. Some Useful Concepts
• Parameters and Hyperparameters
• Underfitting / Overfitting / Bias-variance trade-off
• Training/Dev/Test sets
• Loss or cost function
• Forward propagation / Back-propagation
• Batch vs mini-batch vs stochastic descent
• Dimensionality reduction
• Data augmentation
• Performance Metrics
37. Some Useful Concepts
Parameters and Hyperparameters
• Parameters are learned from the data
• Hyperparameters are set a priori then tuned
Examples :
Model | Parameters | Hyperparameters
Linear regression | Coefficients, intercept | Number of features
k-means | Indexing of clusters | Number of clusters k
Neural Network | Weights, biases | Number of layers, number of neurons per layer, activation functions, learning rate, epochs / batch size, etc.
39. Some Useful Concepts
Bias-variance trade-off
• Related to underfitting and overfitting
• Know data well but not too well for generalization
Sweet spot
40. Some Useful Concepts
Bias-variance trade-off
Low bias: model learned data well
Low variance: model can generalize well
Remedies
High Bias
• Train longer
• Increase model complexity: more features, more parameters, richer architecture
High Variance
• Get more data
• Decrease model complexity: fewer features, fewer parameters, simpler architecture
• Regularization
• Early stopping
• Drop-out
41. Some Useful Concepts
Training/dev/test sets
• Training set to fit model with a priori hyper-parameters
• Dev or (cross-)validation set to tune hyper-parameters
• Test set to assess final performance of model on unseen data
• Typical splits 60/20/20 or 80/10/10 or 98/1/1 in deep learning
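A 60/20/20 split like the one above can be sketched as follows. The helper name `train_dev_test_split`, the fixed seed and the toy data are assumptions made for this illustration; libraries such as scikit-learn offer their own helpers.

```python
import random


def train_dev_test_split(samples, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle the samples, then cut them into training, dev and test sets."""
    train_frac, dev_frac, _ = fractions
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # shuffle a copy, reproducibly
    n_train = round(len(shuffled) * train_frac)
    n_dev = round(len(shuffled) * dev_frac)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]       # whatever remains
    return train, dev, test


train, dev, test = train_dev_test_split(range(100))
print(len(train), len(dev), len(test))
```

Shuffling before cutting matters when the data are stored in some order (by class, by date of observation), otherwise the three sets would not look alike.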
42. Some Useful Concepts
Loss function
• Depends on problem tackled
• Measures the fit between current
output and target output
• Must decrease as training goes on, on average!
Source: https://heartbeat.fritz.ai/5-regression-loss-functions-all-machine-learners-should-know-4fb140e9d4b0
43. Some Useful Concepts
Forward propagation and backpropagation
Forward propagation: get estimates during training and predictions after
Backpropagation: apply chain rule to gradient of loss function to adjust
weights/biases
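For a single sigmoid neuron, the chain rule can be written out by hand. The sketch below assumes a squared-error loss L = 0.5 (z - y)^2 with z = g(a) and a = Σ wi xi, so that dL/dwi = (z - y) · g'(a) · xi. The loss choice, the function names and the numbers are assumptions for this illustration, not taken from the slides.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def forward(w, x):
    """Forward propagation: weighted sum, then sigmoid activation."""
    a = sum(wi * xi for wi, xi in zip(w, x))
    return sigmoid(a)


def loss(w, x, y):
    return 0.5 * (forward(w, x) - y) ** 2


def gradient(w, x, y):
    """Backpropagation: chain rule dL/dw_i = dL/dz * dz/da * da/dw_i."""
    z = forward(w, x)
    dL_dz = z - y
    dz_da = z * (1.0 - z)   # derivative of the sigmoid
    return [dL_dz * dz_da * xi for xi in x]


def gradient_step(w, x, y, learning_rate=0.5):
    grad = gradient(w, x, y)
    return [wi - learning_rate * gi for wi, gi in zip(w, grad)]


w = [0.1, -0.2]   # made-up initial weights
x = [1.0, 2.0]    # one made-up training sample
y = 1.0           # its target
loss_before = loss(w, x, y)
w_new = gradient_step(w, x, y)
loss_after = loss(w_new, x, y)
print(loss_before, loss_after)
```

In a multi-layer network the same chain rule is applied layer by layer, from the output back to the input.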
44. Some Useful Concepts
Batch vs mini-batch vs stochastic descent
Batch: feed the whole training set at each update (one update per epoch)
Mini-batch: feed a subset (random or not) at each update (several updates per epoch)
Stochastic descent: mini-batch of size 1
It’s a tradeoff!
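The three variants differ only in how many samples feed each weight update, so one loop with a `batch_size` argument covers them all. Below is a minimal sketch fitting y ≈ w·x + b by mean squared error; the function name, learning rate, epoch count and data are assumptions chosen for this toy case.

```python
import random


def minibatch_gradient_descent(xs, ys, batch_size, epochs=500,
                               learning_rate=0.05, seed=0):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)                  # new sample order every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradients of the mean squared error over the current batch
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


# Made-up, noise-free data on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
full_batch = minibatch_gradient_descent(xs, ys, batch_size=len(xs))
mini_batch = minibatch_gradient_descent(xs, ys, batch_size=2)
stochastic = minibatch_gradient_descent(xs, ys, batch_size=1)
print(full_batch, mini_batch, stochastic)
```

On real, noisy data the trade-off shows up as smooth but costly updates for full batch versus cheap but jittery updates for stochastic descent.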
45. Some Useful Concepts
Dimensionality reduction
• Too many features
• Expensive to store
• Slow down computation
• Subject to the curse of dimensionality
• Sample space gets harder and harder to fill as dimensions grow
• A reason why too many features lead to overfitting, as data become sparse
• More and more data needed to fill the same % of space:
Select the features! And use PCA/ICA/SVD/LDA/QDA/Autoencoders…
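Two small computations make the curse of dimensionality concrete: the number of grid cells to fill when each axis is cut into the same number of bins, and the edge length a sub-cube needs to enclose a fixed fraction of a unit hypercube. The function names and the choice of 10 bins and a 10% fraction are assumptions for this sketch.

```python
def samples_needed(bins_per_axis, dimensions):
    """Cells in a regular grid over the unit hypercube (one sample per cell)."""
    return bins_per_axis ** dimensions


def edge_length_for_fraction(volume_fraction, dimensions):
    """Edge of a sub-cube enclosing the given fraction of the unit hypercube."""
    return volume_fraction ** (1.0 / dimensions)


for d in (1, 2, 3, 10):
    print(d, samples_needed(10, d), round(edge_length_for_fraction(0.1, d), 3))
```

The sample count grows exponentially with the number of features, and in high dimensions a "local" neighbourhood holding 10% of the volume has to span most of each axis.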
46. Some Useful Concepts
Data Augmentation
• When more data are needed, make up new ones!
• Translate, rotate, flip, crop, lighten/darken, add noise, dephase, etc.
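A few of these transformations, sketched on a tiny image stored as a list of rows. This is plain Python for illustration: the function names are made up, `translate_right` assumes a shift between 0 and the image width, and real pipelines would normally use an image or deep learning library.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in image]


def rotate_90_clockwise(image):
    """Rotate by a quarter turn: reverse the row order, then transpose."""
    return [list(row) for row in zip(*image[::-1])]


def translate_right(image, shift, fill=0):
    """Shift pixels to the right, padding the left edge with `fill`."""
    return [[fill] * shift + row[:len(row) - shift] for row in image]


def augment(image):
    """Return the original image plus three transformed copies."""
    return [image,
            flip_horizontal(image),
            rotate_90_clockwise(image),
            translate_right(image, 1)]


image = [[1, 2, 3],
         [4, 5, 6]]   # made-up 2 x 3 "image"
variants = augment(image)
for variant in variants:
    print(variant)
```

Each copy keeps the label of the original, so one labelled sample becomes several.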
47. Some Useful Concepts
Performance metrics
• Compare error to simplest method as a benchmark, e.g. linear
regression or logistic regression
Classification problems
• Accuracy
• Precision-recall / F1-score
• ROC-AUC
• Confusion matrix
• Log-Loss
Regression problems
• MSE / RMSE/ MSPE / MAE
• R² / Adjusted R²
Source: https://towardsdatascience.com/metrics-to-evaluate-your-machine-learning-algorithm-f10ba6e38234
NOT DISCUSSED IN
THIS DOCUMENT
49. Some Useful Concepts
Classifier performance metric
Accuracy = (TP + TN) / All cases
[Figure: classified samples split into true positives, false positives, true negatives and false negatives]
50. Some Useful Concepts
Classifier performance metric
Accuracy = (TP + TN) / All cases
• Counts whenever the classifier is right
• Simple and intuitive metric
BUT
• Assigns same cost to false positives and false negatives
• Use with caution because of the accuracy paradox: a dumb classifier that always predicts the majority class can have better accuracy!
• Absolutely avoid with highly imbalanced classes
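The accuracy paradox can be reproduced with a small made-up example. The class counts and both sets of predictions below are invented for illustration.

```python
def accuracy(y_true, y_pred):
    """Accuracy = (TP + TN) / all cases."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical imbalanced data set: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5

# "Dumb" classifier: always predicts the majority class.
majority_pred = [0] * 100

# Classifier that finds 4 of the 5 positives but raises 10 false alarms.
useful_pred = [1] * 10 + [0] * 85 + [1] * 4 + [0]

print(accuracy(y_true, majority_pred))  # 0.95
print(accuracy(y_true, useful_pred))    # 0.89
```

The majority-class predictor wins on accuracy although it never detects a single positive.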
51. Some Useful Concepts
Classifier performance metric
Precision vs Recall
[Figure: classified samples split into true positives, false positives, true negatives and false negatives]
52. Some Useful Concepts
Classifier performance metric
Precision vs Recall
[Figure: samples classified as positive, split into true positives and false positives]
Precision = TP / (TP + FP)
• High precision means high selectivity
• A selected sample has a high probability of belonging to the correct class
• Some actual positives have been brushed off
• A low precision means lots of false positives
53. Some Useful Concepts
Classifier performance metric
Precision vs Recall
[Figure: actual positives, split into true positives and false negatives]
Recall = TP / (TP + FN)
• High recall means most positives have been identified as such, at the cost of (some) false positives
• Low recall means lots of false negatives
54. Some Useful Concepts
Classifier performance metric
F1-score
• F1-score synthesizes both precision and recall
F1 = 2 × Precision × Recall / (Precision + Recall)
• Need to take into account the desirable trade-off:
• E.g. cancer diagnostics, better to have a higher recall to minimize false negatives
• E.g. spam detection, better to let pass some false negatives than to eliminate legit emails
• E.g. zombie apocalypse scenario, better to have high precision to avoid letting infected people into the safe zone…
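The three formulas can be checked on a small set of counts. The helper name and the counts (4 true positives, 10 false positives, 1 false negative) are illustrative assumptions.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Hypothetical classifier: high recall, low precision.
precision, recall, f1 = precision_recall_f1(tp=4, fp=10, fn=1)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Because F1 is a harmonic mean, it stays low as long as either precision or recall is low.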
55. Some Useful Concepts
Classifier performance metric
AUC-ROC
Area Under Curve – Receiver Operating Characteristics
FPR : False Positive Rate
A good classifier has high sensitivity and high specificity
Source: https://towardsdatascience.com/understanding-auc-roc-curve-68b2303cc9c5
56. Some Useful Concepts
Classifier performance metric
AUC-ROC
How good is the model at distinguishing between classes at different thresholds?
How much do you pay for your true positives?
Source: https://towardsdatascience.com/understanding-auc-roc-curve-68b2303cc9c5
[Figure: ROC curve, from "few false positives, few true positives" at one end to "lots of false positives, lots of true positives" at the other]
Ideal case: AUC = 1
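One way to compute the AUC without drawing the curve uses the fact that it equals the probability that a randomly chosen positive gets a higher score than a randomly chosen negative. The sketch below takes that route; the function names and the four-sample data set are assumptions.

```python
def roc_point(y_true, scores, threshold):
    """Return (FPR, TPR) when samples with score >= threshold are called positive."""
    tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= threshold)
    fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= threshold)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    return fp / n_neg, tp / n_pos

def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes are needed to compute the AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count for half
    return wins / (len(positives) * len(negatives))

y_true = [0, 0, 1, 1]             # hypothetical labels
scores = [0.1, 0.4, 0.35, 0.8]    # hypothetical classifier scores
print(roc_point(y_true, scores, threshold=0.5))  # (0.0, 0.5)
print(roc_auc(y_true, scores))                   # 0.75
```

Lowering the threshold moves the operating point toward more true positives at the price of more false positives, while the AUC itself does not depend on any particular threshold.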
58. Some Useful Concepts
Classifier performance metric
Confusion matrix
• Interesting for analysis of classifier performance on a multi-class data set
[Figure: confusion matrix with axes "True class" and "Predicted class"]
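A confusion matrix is just a table of counts, as in this sketch. The convention used here (rows = true class, columns = predicted class) and the three class names are assumptions for illustration.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

labels = ["star", "galaxy", "quasar"]  # hypothetical classes
y_true = ["star", "star", "galaxy", "galaxy", "quasar", "quasar"]
y_pred = ["star", "galaxy", "galaxy", "galaxy", "quasar", "star"]

for label, row in zip(labels, confusion_matrix(y_true, y_pred, labels)):
    print(label, row)
```

Correct predictions sit on the diagonal, so off-diagonal cells show exactly which classes get confused with which.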
59. Some Useful Concepts
Classifier performance metric
Log-Loss
Binary case:
$$-\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log p_i + (1 - y_i)\log(1 - p_i)\right]$$
If more than 2 classes:
$$-\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{M} y_{i,c}\log(p_{i,c})$$
[Figure labels: "Good", "Not good"]
• Adapted to binary outputs and multi-class data sets (if not too imbalanced)
• Punishes extreme probability values when these are wrong
Source: http://wiki.fast.ai/index.php/Log_Loss
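Both log-loss formulas can be written directly in plain Python. The function names and the clipping constant `eps` (used to avoid log(0)) are assumptions of this sketch.

```python
import math

def binary_log_loss(y_true, p_pred, eps=1e-15):
    """Binary log-loss; p_pred holds the predicted probability of class 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)

def multiclass_log_loss(y_true, p_pred, eps=1e-15):
    """Multi-class log-loss; y_true holds class indices, p_pred lists of probabilities."""
    total = 0.0
    for c, probs in zip(y_true, p_pred):
        # y_{i,c} is 1 only for the true class, so only that term survives.
        total += math.log(min(max(probs[c], eps), 1.0))
    return -total / len(y_true)

# A confident right answer costs little, a confident wrong answer costs a lot.
print(binary_log_loss([1], [0.9]))
print(binary_log_loss([1], [0.1]))
print(multiclass_log_loss([0, 2], [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]))
```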
60. Some Useful Concepts
• So, which metric to choose??? Well, it depends…
Source: https://medium.com/usf-msds/choosing-the-right-metric-for-evaluating-machine-learning-models-part-2-86d5649a5428
Classes | Balanced | Imbalanced
Binary | If probability differences are critical: Log-Loss. If only class prediction important and threshold tuning: AUC-ROC score. F1 score is sensitive to threshold, tune before comparing | Small class >0 or <0: ROC-AUC score. Small positive class: F1
Multi-class | Confusion matrix. Log-Loss | Averaging of Precision/Recall over classes (macro-averaging)
• There are other metrics: Cohen’s kappa, Jaccard index, G-score…
61. Machine Learning Tools
Main Python libraries
Name | Use
Pandas | Data Analysis
Spark | Distributed Computing
Scikit-learn | Machine Learning Toolbox
Keras | Deep Learning
TensorFlow | Deep Learning
OpenCV | Computer Vision
65. Zoom on Scikit-Learn logic
1- Import model
from sklearn import svm
from sklearn.neighbors import KNeighborsClassifier
2 - Instantiate model class
clf = svm.SVC(gamma=0.001, C=100.)
knn = KNeighborsClassifier()
3 - Train with the fit() method
clf.fit(digits.data[:-1], digits.target[:-1])
knn.fit(iris_X_train, iris_y_train)
4 - Make predictions with predict()
clf.predict(digits.data[-1:])
knn.predict(iris_X_test)
66. Zoom on Keras logic
1 - Import model class and layers
from keras.models import Sequential
from keras.layers import Dense, Activation
2 - Instantiate model class
model = Sequential()
3 - Add layers with the add() method specifying input_dim or input_shape
model.add(Dense(32, input_dim=784))
4 - Add activation functions
model.add(Activation('relu'))
5 - Configure training with compile(loss=, optimizer=, metrics=[])
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              metrics=['accuracy'])
6 - Train with the fit() method
model.fit(data, labels, epochs=10, batch_size=32)
7- Evaluate the model performance with the evaluate() method:
score = model.evaluate(x_test, y_test, verbose=0)
8 – Make predictions with predict():
predictions = model.predict(x_test)
67. Zoom on TensorFlow logic
1 – Define a computation graph:
• tf.Variable for the quantities to be optimized (weights and biases)
• tf.constant as needed
• tf.placeholder for inputs
• All operations have a tf counterpart
2 – Start a TensorFlow session
3 – Actually execute the graph, implementing nested loops on epochs and batches
Source: https://www.datacamp.com/community/tutorials/cnn-tensorflow-python
68. Zoom on TensorFlow: Logic
Basic example (TensorFlow 1.x API):
# Under TensorFlow 2.x the same calls live in tf.compat.v1 (with eager execution disabled)
import tensorflow as tf
# tf Graph input
a = tf.placeholder(tf.int16)
b = tf.placeholder(tf.int16)
# Define some operations
add = tf.add(a, b)
mul = tf.multiply(a, b)
# Launch the default graph.
with tf.Session() as sess:
    # Run every operation with variable input
    print("Addition with variables: %i" % sess.run(add, feed_dict={a: 2, b: 3}))
    print("Multiplication with variables: %i" % sess.run(mul, feed_dict={a: 2, b: 3}))
Source: https://www.datacamp.com/community/tutorials/cnn-tensorflow-python
69. Random Forest concept
Random forest uses decision trees as base learners
[Figure: decision tree examples for regression and classification]
Decision trees are built so that the splits are prioritised by the amount of information provided
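One common way to quantify "the amount of information provided" by a split is the entropy-based information gain. The slides do not say which criterion is meant, so the choice of entropy (rather than, say, Gini impurity) is an assumption of this sketch.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Entropy of the parent node minus the weighted entropy of its two children."""
    n = len(parent)
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(parent) - weighted

parent = [0, 0, 1, 1]  # hypothetical labels reaching a node
print(information_gain(parent, [0, 0], [1, 1]))  # perfect split
print(information_gain(parent, [0, 1], [0, 1]))  # uninformative split
```

A tree builder would evaluate many candidate splits and keep the one with the highest gain at each node.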
70. Random Forest concept
Random forests are built by fitting many decision trees to random subsets of the samples and random subsets of the features: ensemble learning (here bagging)
71. Gradient Boosting concept
• Boosting: creates a series of weak learners where new ones focus on data hard to classify. At the end of the process all learners are weighted and combined
• Boosting can lead to overfitting, stop early enough!
• Many variants: Gradient Boosting, XGBoost, AdaBoost, Gentle Boost
• XGBoost is state-of-the-art
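As an illustration of one variant, here is a minimal gradient boosting sketch for regression with squared loss, where each new stump is fitted to the residuals of the current ensemble. It is a toy version under stated assumptions (1-D inputs, one-split stumps, fixed learning rate), not how XGBoost or any library implements it.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump on 1-D inputs (least squares)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def gradient_boost(xs, ys, n_estimators=50, learning_rate=0.1):
    """Fit stumps one after another on the residuals of the current ensemble."""
    base = sum(ys) / len(ys)  # start from the mean prediction
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for x, p in zip(xs, predictions)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [float(i) for i in range(10)]   # hypothetical 1-D inputs
ys = [x ** 2 for x in xs]            # hypothetical targets
mean_y = sum(ys) / len(ys)
model = gradient_boost(xs, ys)
print(mse(lambda x: mean_y, xs, ys), mse(model, xs, ys))
```

Increasing `n_estimators` keeps lowering the training error, which is exactly why stopping early matters to avoid overfitting.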
72. Gradient Boosting concept
Gradient Boosting vs Random Forest
Criterion | Gradient Boosting | Random Forest
Base learners | Trees, linear regression | Trees
Bias-variance of learners | Stumps: high bias and low variance | Full trees: low bias and high variance
Hyperparameters tuning | Lots! (see next page) | Number of trees!
Performance | #1 | Close 2nd
73. Gradient Boosting concept
Some important hyperparameters for gradient boosting to limit the tree growth (names as in scikit-learn; XGBoost names some of them differently):
• max_features
• min_samples_split
• min_samples_leaf
• max_depth
74. Self-organizing map concept
• Inspired by specialization of neural areas in natural brains
• Each neuron on the grid starts with a random vector of the same dimension as the input
• The neuron whose vector is closest to a given input vector, and its neighbours on the grid, are nudged toward that input
• Clustering, classification and visualization
• Kohonen 1984
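The update rule described above can be sketched for a 1-D grid of neurons. The constant learning rate, the fixed neighbourhood radius and the toy data are simplifying assumptions; practical self-organizing maps usually shrink both over time.

```python
import random

def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to input x."""
    distances = [sum((w - xi) ** 2 for w, xi in zip(ws, x)) for ws in weights]
    return distances.index(min(distances))

def som_step(weights, x, learning_rate=0.5, radius=1):
    """Nudge the best matching unit and its grid neighbours toward x (in place)."""
    bmu = best_matching_unit(weights, x)
    for j in range(max(0, bmu - radius), min(len(weights), bmu + radius + 1)):
        weights[j] = [w + learning_rate * (xi - w) for w, xi in zip(weights[j], x)]
    return bmu

rng = random.Random(0)
# Initially random vectors with the same dimension as the input (here 2).
weights = [[rng.random(), rng.random()] for _ in range(5)]
data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]  # hypothetical inputs
for _ in range(20):
    for x in data:
        som_step(weights, x)
print([best_matching_unit(weights, x) for x in data])
```

After training, the index of the best matching unit can serve as a cluster label or as a coordinate for visualization.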
75. Autoencoder concept
• A neural network trained so that its output reproduces its input
• Hour-glass shape as data is encoded then decoded
• A way to extract meaningful features
[Figure label: compressed signal with reduced dimensions]
77. CNN concept
• Convolutional Neural Networks are a category of Neural Networks that have proven very effective in areas such as image recognition and classification. CNNs have been successful in identifying faces, objects and traffic signs apart from powering vision in robots and self-driving cars.
Source: https://www.apsl.net/blog/2017/11/20/use-convolutional-neural-network-image-classification/
83. GAN concept
• Generator: trying to forge the data distribution
• Discriminator: trying to sort out real from fake
• Both share a common loss function with opposite goals (min max)
• A generative adversarial network learns a distribution not a relationship
• Alternatives are variational autoencoders
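The "min max" game is commonly written as the value function below, in its standard textbook form. The symbols are assumptions not defined on the slide: $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the data distribution and $p_z$ the noise distribution fed to the generator.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator tries to maximize this quantity by scoring real samples high and generated samples low, while the generator tries to minimize it by making $D(G(z))$ large.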
85. Machine Learning: references
Papers
• Check out arXiv for machine learning in astro-ph…
MOOCs
• All ML courses on Coursera by Andrew Ng
• Deep Learning A-Z™: Hands-On Artificial Neural Networks on Udemy
• Fast.ai courses
Books
• “Statistics, Data Mining and Machine Learning in Astronomy: A Practical Python Guide for the Analysis of Survey Data” by Željko Ivezić et al.
• “Data Science from Scratch with Python: Step-by-Step Beginner Guide for Statistics, Machine Learning, Deep learning and NLP using Python, Numpy, Pandas, Scipy, Matplotlib, Sciki-Learn, TensorFlow” by Peter Morgan
• “Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems” by Aurélien Géron
86. Hands-on exercises roadmap
1) keras_log_reg_EX (data = MNIST)
   1) Complete missing lines
2) tf_log_reg_EX (data = iris.csv)
   1) Complete missing lines
   2) Play with learning rate
3) sk_xgboost_regression_EX (data = Boston)
   1) Complete missing lines
   2) Play with learning rate
   3) Find a good value for n_estimators
   4) Have a look at feature importance and sample tree
4) sk_sdss_EX (data = sdss_data.csv)
   1) Reply to questions in the notebook as you execute cell after cell
5) tf_AutoEncoder_Fashion_EX (data = fashion-mnist_train.csv and fashion-mnist_test.csv)
   1) Reply to questions
   2) Make suggested trials
6) keras_gan_bimodal2_EX (data generated in notebook)
7) Check out Others
ML comes from
Optimization problems
Statistics
Computer science
Specialized/Applied vs generalized AI
Artificial intelligence is wider and deals with different approaches like the symbolic approach
Also AI is about agents perceiving an environment and trying to act at best, to perform at best
Cloud computing because you need speed and memory
data visualization because the data sets and results are complex
Business here is your branch of science i.e. astrophysics because it brings a lot in terms of insights (ML skills won’t replace your astrophysics instinct)
DeepFace is a deep learning facial recognition system created by a research group at Facebook. It identifies human faces in digital images. It employs a nine-layer neural net with over 120 million connection weights, and was trained on four million images uploaded by Facebook users. 97% accuracy, 9 layers, 120 million weights
Side note: It takes roughly 3 chimpanzees to run the US
Feature extraction
Dimension reduction
Semi-supervised learning
Active learning
Sequence modeling
RL methods for Supervised and Unsupervised Learning
Independent Component Analysis
Clustering as simple-to-understand algorithms: THEY ARE IMPORTANT
SKIP the RL part
Coming from a pro KAGGLE competitor
Matrix multiplications + element wise operations
Sigmoid for classification
Hyperbolic tangent (tanh) for classification or regression
ReLU for regression
Softmax provides probabilities
Example in word prediction in a smartphone, the 3 highest probabilities
There are others
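The activation functions listed above, and the word-prediction use of softmax, can be sketched as follows. The candidate words and their raw scores are invented for illustration.

```python
import math

def sigmoid(z):
    """Squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Passes positive values through, clips negative values to 0."""
    return max(0.0, z)

def softmax(scores):
    """Turns a list of raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-word candidates with raw network scores.
words = ["the", "cat", "sat", "mat", "hat"]
scores = [2.0, 1.0, 0.5, -1.0, 0.0]
probabilities = softmax(scores)

# Keep the 3 highest probabilities, as a smartphone keyboard would.
top3 = sorted(zip(words, probabilities), key=lambda wp: wp[1], reverse=True)[:3]
print(sigmoid(0.0), math.tanh(0.0), relu(-2.0))
print([word for word, _ in top3])
```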
Overfitting is to be avoided in ML
If you have as many parameters as you have examples, you can learn the data perfectly!
You need to split the data to get an unbiased evaluation
LogLoss or Softmax
From one epoch to another the weights are updated
Trade-off between time to update and optimality of direction towards minimum
Translate, rotate, crop, lighten/darken, add noise, dephase, etc.
Mean Squared Prediction Error
Mean Absolute Error
Based on summing or averaging the difference between true value and estimation
Harmonic mean F1
Spam needs high precision and will have low recall
The AUC is independent of the threshold (global metric)
An ideal one is purely diagonal
Y_ic = 1 if sample i belongs to class c
Symmetric function!
Need probabilities
The link between data array and input to the graph is done via feed_dict
A decision tree can totally overfit as it can totally split the feature space
2D convolutions
Blurry reconstruction
Pooling:
• makes the input representations (feature dimension) smaller and more manageable
• reduces the number of parameters and computations in the network, therefore controlling overfitting
• makes the network invariant to small transformations, distortions and translations in the input image (a small distortion in input will not change the output of pooling, since we take the maximum / average value in a local neighborhood)
• helps us arrive at an almost scale invariant representation of our image (the exact term is “equivariant”). This is very powerful since we can detect objects in an image no matter where they are located
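A minimal sketch of 2x2 max pooling illustrates both the size reduction and the tolerance to small shifts. The window size, the stride of 2 and the toy feature maps are assumptions for illustration.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2-D list with even height and width."""
    pooled = []
    for i in range(0, len(feature_map), 2):
        row = []
        for j in range(0, len(feature_map[0]), 2):
            window = [feature_map[i][j], feature_map[i][j + 1],
                      feature_map[i + 1][j], feature_map[i + 1][j + 1]]
            row.append(max(window))  # keep the strongest response in the window
        pooled.append(row)
    return pooled

feature_map = [[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 1, 5, 6],
               [2, 2, 7, 8]]

# Same map with the strongest top-left response moved by one pixel.
distorted = [[4, 3, 2, 0],
             [1, 2, 1, 1],
             [0, 1, 5, 6],
             [2, 2, 7, 8]]

print(max_pool_2x2(feature_map))  # [[4, 2], [2, 8]]
print(max_pool_2x2(distorted))    # [[4, 2], [2, 8]]
```

The 4x4 map shrinks to 2x2, and the one-pixel shift inside a pooling window leaves the output unchanged.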