The document discusses different approaches to meta-learning, or learning to learn. It begins by explaining how humans are able to learn new tasks more quickly by leveraging prior knowledge from similar tasks. It then covers three main approaches to meta-learning for machine learning models: 1) Starting with what generally works based on previous task performance data, 2) Starting from what is most likely to work for similar tasks based on task meta-features, and 3) Starting from previously trained models on very similar tasks via transfer learning. The document dives into various techniques within each of these three approaches, such as warm-starting optimization searches, learning task embeddings, and few-shot learning.
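The second approach in the summary above (start from what worked on similar tasks, judged by task meta-features) can be sketched in a few lines. This is a minimal illustration, not the method from the slides: the meta-feature vectors, the `PRIOR_TASKS` records, and the configuration keys are all hypothetical, and plain Euclidean distance stands in for whatever task-similarity measure a real system would use.

```python
import math

# Hypothetical meta-data: one record per previously solved task, holding a
# meta-feature vector (e.g. log #rows, log #features, class balance) and the
# best configuration found on that task. All values are invented.
PRIOR_TASKS = [
    {"meta": (3.0, 1.0, 0.5), "best_config": {"learning_rate": 0.1, "max_depth": 3}},
    {"meta": (5.0, 2.0, 0.5), "best_config": {"learning_rate": 0.05, "max_depth": 6}},
    {"meta": (5.2, 2.1, 0.1), "best_config": {"learning_rate": 0.01, "max_depth": 8}},
]


def distance(a, b):
    """Euclidean distance between two meta-feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def warm_start_configs(new_meta, prior_tasks, k=2):
    """Return the best configs of the k prior tasks most similar to the new one."""
    ranked = sorted(prior_tasks, key=lambda t: distance(new_meta, t["meta"]))
    return [t["best_config"] for t in ranked[:k]]


# Configurations to evaluate first on a new task, before any further search.
initial = warm_start_configs((5.1, 2.0, 0.4), PRIOR_TASKS, k=2)
print(initial)
```

The returned configurations would typically seed an optimizer (for example as the initial points of a Bayesian optimization run) rather than replace the search altogether.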
Meta-learning, or learning how to learn, is our innate ability to learn new, ever more complex tasks very efficiently by building on prior experience. It is a very exciting direction for machine learning (and AI in general). In this tutorial, I introduce the main concepts and state of the art.
Automated machine learning lectures given at the Advanced Course on Data Science & Machine Learning. AutoML, hyperparameter optimization, Bayesian optimization, Neural Architecture Search, Meta-learning, MAML
- What is Meta-Learning?
- What are the pros and cons of Meta-Learning?
- Different Methods for Meta-Learning in AutoML.
Copyrights are reserved for Mohamed Maher - University of Tartu
Machine Learning, AI, Deep Learning, Statistics, Data Mining... in short, all these terms are the buzzwords of the moment, but what lies behind them?
Through concrete examples, we will go through the different approaches to Machine Learning, the major families of algorithms (don't worry: without going into the core of their implementations), then the tools and frameworks available to Data Scientists... and finally, we will try to predict the future!
Salon Data - Nantes - 19 September 2017
https://salondata.fr/2017/07/12/0930-1030-ml/
Top 10 Data Science Practitioner Pitfalls (Sri Ambati)
Over-fitting, misread data, NAs, collinear column elimination and other common issues wreak havoc on the day-to-day work of a practicing data scientist. In this talk, Mark Landry, one of the world’s leading Kagglers, will review the top 10 common pitfalls and steps to avoid them.
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Introduction to Machine Learning with Python and scikit-learn (Matt Hagy)
PyATL talk about machine learning. Provides both an intro to machine learning and how to do it with Python. Includes simple examples with code and results.
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM (lucenerevolution)
In this session we will show how to build a text classifier using Apache Lucene/Solr with the libSVM library. We classify our corpus of job offers into a number of predefined categories. Each indexed document (a job offer) then belongs to zero, one or more categories. Known machine learning techniques for text classification include the naïve Bayes model, logistic regression, neural networks, support vector machines (SVM), etc. We use Lucene/Solr to construct the feature vector. Then we use the libsvm library, known as the reference implementation of the SVM model, to classify the document. We construct as many one-vs-all SVM classifiers as there are classes in our setting, then use the Hadoop MapReduce framework to reconcile the results of our classifiers. The end result is a scalable multi-class classifier. Finally, we outline how the classifier is used to enrich basic Solr keyword search.
Data Workflows for Machine Learning - Seattle DAML (Paco Nathan)
First public meetup at Twitter Seattle, for Seattle DAML:
http://www.meetup.com/Seattle-DAML/events/159043422/
We compare/contrast several open source frameworks which have emerged for Machine Learning workflows, including KNIME, IPython Notebook and related Py libraries, Cascading, Cascalog, Scalding, Summingbird, Spark/MLbase, MBrace on .NET, etc. The analysis develops several points for "best of breed" and what features would be great to see across the board for many frameworks... leading up to a "scorecard" to help evaluate different alternatives. We also review the PMML standard for migrating predictive models, e.g., from SAS to Hadoop.
An introduction to Machine Learning (and a little bit of Deep Learning) (Thomas da Silva Paula)
25-min talk about Machine Learning and a little bit of Deep Learning. Starts with some basic definitions (Supervised and Unsupervised Learning). Then the basic functionality of neural networks is explained, leading up to Deep Learning and Convolutional Neural Networks.
Machine Learning Meetup that happened in Porto Alegre, Brazil.
Data Structures and Algorithms (DSA) Tutorial for Beginners - Learn Data Structures and Algorithms using C, C++ and Java in simple and easy steps, from basic to advanced concepts, with examples
Understanding how high powered ML models arrive at their predictions is an important aspect of Machine Learning, and SHAP is a powerful tool that enables practitioners to understand how different features combine to help a model arrive at a prediction.
This slide deck is from a presentation given at PyData Global on the theoretical foundations of SHAP as well as how to use its library. A link to the presentation can be found here: https://pydata.org/global2021/schedule/presentation/3/behind-the-black-box-how-to-understand-any-ml-model-using-shap/
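The theoretical foundation mentioned above is the Shapley value from cooperative game theory. The sketch below computes exact Shapley values by enumerating feature subsets for a toy three-feature model. It does not use the SHAP library or its API; the toy model, the input `x`, and the choice to replace absent features with a baseline of 0 are assumptions made for illustration only.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical instance to explain and toy model f(x) = 2*x0 + 3*x1 + x0*x2.
x = (1.0, 2.0, 4.0)


def model_on_subset(present):
    """Evaluate the toy model with absent features set to a baseline of 0."""
    x0, x1, x2 = (x[j] if j in present else 0.0 for j in range(3))
    return 2 * x0 + 3 * x1 + x0 * x2


phi = shapley_values(model_on_subset, 3)
print(phi)
```

The attributions sum to the difference between the model output on the full input and on the all-baseline input, and the interaction term `x0*x2` is split evenly between features 0 and 2. Exact enumeration is exponential in the number of features, which is why practical tools rely on approximations or model-specific algorithms.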
The ABC of Implementing Supervised Machine Learning with Python.pptx (Ruby Shrestha)
Machine learning has clearly come a long way. However, knowing and understanding how small problems can be solved from a machine learning perspective is necessary to form a good base, appreciate the process of implementation and get started in this domain. Therefore, in this post, I would like to talk about the ABC of implementing Supervised Machine Learning with Python by navigating through a simple example: adding two numbers. Put simply, I would like to make a machine learn to add. In other words, I would like to develop a predictive model that can add. Sounds simple, right? View the presentation for more details.
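The "learn to add" idea above can be sketched as a two-weight linear model fitted by gradient descent. This is an illustrative assumption about how such a model might look, not the code from the presentation: the training grid, learning rate, epoch count, and function names are all invented here, and only the Python standard library is used.

```python
def train_adder(epochs=2000, lr=0.01):
    """Fit y = w1*a + w2*b to examples of a + b with batch gradient descent."""
    # Training examples: all pairs (a, b) with a, b in 0..4, labelled a + b.
    data = [((a, b), a + b) for a in range(5) for b in range(5)]
    n = len(data)
    w1, w2 = 0.0, 0.0
    for _ in range(epochs):
        g1 = g2 = 0.0
        for (a, b), y in data:
            err = (w1 * a + w2 * b) - y
            # Gradient of the mean squared error with respect to each weight.
            g1 += 2 * err * a / n
            g2 += 2 * err * b / n
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2


def predict(weights, a, b):
    """Apply the learned weights to a new pair of numbers."""
    w1, w2 = weights
    return w1 * a + w2 * b


weights = train_adder()
print(weights, predict(weights, 10, 7))
```

Both weights converge to 1, so the model generalizes to numbers outside the training grid. That works here only because addition is exactly linear in its inputs.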
Overview of Machine Learning and Feature Engineering (Turi, Inc.)
Machine Learning 101 Tutorial at Strata NYC, Sep 2015
Overview of machine learning models and features. Visualization of feature space and feature engineering methods.
Artificial Intelligence, Machine Learning and Deep Learning (Sujit Pal)
Slides for the talk Abhishek Sharma and I gave at the Gennovation tech talks (https://gennovationtalks.com/) at Genesis. The talk was part of outreach for the Deep Learning Enthusiasts meetup group in San Francisco. My part of the talk is covered in slides 19-34.
Misha Bilenko, Principal Researcher, Microsoft at MLconf SEA - 5/01/15 (MLconf)
Many Shades of Scale: Big Learning Beyond Big Data: In the machine learning research community, much of the attention devoted to ‘big data’ in recent years has been manifested as development of new algorithms and systems for distributed training on many examples. This focus has led to significant advances in the field, from basic but operational implementations on popular platforms to highly sophisticated prototypes in the literature. In the meantime, other aspects of scaling up learning have received relatively little attention, although they are often more pressing in practice. The talk will survey these less-studied facets of big learning: scaling to an extremely large number of features, to many components in predictive pipelines, and to multiple data scientists collaborating on shared experiments.
MachineLearning for dummies with Python
Have you heard that Machine Learning is the next big thing?
Are you a dummy in terms of Machine Learning, and think it is a topic for mathematicians with black-magic skills?
If your response to both questions is 'Yes', we are in the same position.
Still, thanks to the Web, Python and OpenSource libraries, we can overcome this situation and do some interesting stuff with Machine Learning.
QCon Rio - Machine Learning for Everyone (Dhiana Deva)
Supercomputers and teams of MIT PhDs are no longer needed to build data-driven predictive models. We are witnessing innovations in Machine Learning that are making this field increasingly accessible.
This talk aims to demystify machine learning by presenting its concepts and using a range of technologies.
It covers the types of problems in this area (classification, regression, clustering, dimensionality reduction, etc.), its stages (normalization, training, optimization, regularization, etc.) and its algorithms, from linear regression and k-means through decision trees to neural networks, always applied to real problems.
In the talk, we will also get to know tools such as scikit-learn, Pandas, R, MATLAB and Amazon Machine Learning, as well as a way to practice and experiment with these ideas through competitions such as Kaggle.
Object Oriented Programming Lab Manual (Abdul Hannan)
Object-oriented programming lab manual for practicing and improving object-oriented coding skills.
Published by Mohammad Ali Jinnah University Islamabad.
ODSC East: Effective Transfer Learning for NLP (indico data)
Presented by indico co-founder Madison May at ODSC East.
Abstract: Transfer learning, the practice of applying knowledge gained on one machine learning task to aid the solution of a second task, has seen historic success in the field of computer vision. The output representations of generic image classification models trained on ImageNet have been leveraged to build models that detect the presence of custom objects in natural images. Image classification tasks that would typically require hundreds of thousands of images can be tackled with mere dozens of training examples per class thanks to the use of these pretrained representations. The field of natural language processing, however, has seen more limited gains from transfer learning, with most approaches limited to the use of pretrained word representations. In this talk, we explore parameter- and data-efficient mechanisms for transfer learning on text, and show practical improvements on real-world tasks. In addition, we demo the use of Enso, a newly open-sourced library designed to simplify benchmarking of transfer learning methods on a variety of target tasks. Enso provides tools for the fair comparison of varied feature representations and target task models as the amount of training data made available to the target model is incrementally increased.
Presentation based on "Hierarchical Bayesian Models of Subtask Learning. Angl... (Jeromy Anglim)
Citation Information:
Anglim, J., & Wynton, S. K. (2015). Hierarchical Bayesian Models of Subtask Learning. Journal of Experimental Psychology. Learning, Memory, and Cognition. Online First. http://dx.doi.org/10.1037/xlm0000103
Abstract: In this talk I present some recent work looking at the question of how to understand the learning of complex computer-based tasks in terms of component learning processes. The research tests and examines what Lee and Anderson (2001) labelled the "decomposition hypothesis": i.e., that learning complex tasks can be understood as the result of learning many simpler subtasks. To test these ideas, we get participants to practice computer-based tasks where all mouse clicks and key presses are logged. We then extract a range of measures of strategy use, subtask performance, and overall task performance. We then use Bayesian hierarchical methods to test models of how strategy use and performance change with practice at the individual level. Overall, these models provide a more nuanced representation of how complex tasks can be decomposed in terms of simpler learning mechanisms. The research also presents a case study of how Bayesian methods can be used to yield novel insights into well-established psychological questions.
Bio: Dr Jeromy Anglim is a lecturer at Deakin University in Melbourne. He completed his PhD at University of Melbourne on mathematical models of learning, and his Post Doc in the Melbourne Business School on applications of Bayesian hierarchical models to psychology. His research interests are at the interface of statistics and industrial / organisational psychology with particular interest in skill acquisition, performance, individual differences, Bayesian data analysis, psychometrics, and selection and recruitment. He has a particular interest in refining and promoting methods for open and reproducible research in psychology. For further information go to http://jeromyanglim.blogspot.com
Naver learning to rank question answer pairs using hrde-ltc (NAVER Engineering)
The automatic question answering (QA) task has long been considered a primary objective of artificial intelligence.
Among the QA sub-systems, we focused on the answer-ranking part. In particular, we investigated a novel neural network architecture with an additional data clustering module to improve performance in ranking answer candidates that are longer than a single sentence. This work can be used not only for the QA ranking task, but also to evaluate the relevance of the next utterance to a given dialogue generated by a dialogue model.
In this talk, I'll present our research results (NAACL 2018) and their potential use cases (e.g., fake news detection). Finally, I'll conclude by discussing some issues with previous research and introducing recent approaches in academia.
EDM2014 paper: General Features in Knowledge Tracing to Model Multiple Subski... (Yun Huang)
This is our presentation of the paper General Features in Knowledge Tracing to Model Multiple Subskills, Temporal Item Response Theory, and Expert Knowledge, which was nominated for the Best Paper Award. It introduces a new student model that allows flexible features to help infer latent knowledge states. Code is available at http://ml-smores.github.io/fast/.
This is our tutorial on a toolkit that allows features in HMMs (e.g., Knowledge Tracing). The paper was nominated for the Best Paper Award at the 2014 International Conference on Educational Data Mining.
GPT-2: Language Models are Unsupervised Multitask Learners (Young Seok Kim)
Review of paper
Language Models are Unsupervised Multitask Learners
(GPT-2)
by Alec Radford et al.
Paper link: https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
YouTube presentation: https://youtu.be/f5zULULWUwM
(Slides are written in English, but the presentation is done in Korean)
Model remodeling with modern deep learning frameworks (rosentep)
SciPy 2019 Talk
Video: https://youtu.be/OoGaFn3aaMU
Conference Link: https://www.scipy2019.scipy.org/confschedule
Abstract:
While modern deep learning frameworks have revolutionized the ability for non-experts to train deep learning models, they have also democratized a host of other innovations which extend beyond the niche of deep learning. In this talk, I will explore some models and domains that are not commonly thought of as “machine learning” problems and show how PyTorch allows one to build more complex and scalable models than ever before. This represents an opportunity to revisit existing models, which I will do by showing how to implement them with PyTorch and integrate them into the rest of the PyData ecosystem.
This is an introductory workshop on machine learning. It introduces machine learning tasks such as supervised learning, unsupervised learning and reinforcement learning.
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning (Joaquin Delgado PhD.)
Search engines have focused on solving the document retrieval problem, so their scoring functions do not handle naturally non-traditional IR data types, such as numerical or categorical. Therefore, on domains beyond traditional search, scores representing strengths of associations or matches may vary widely. As such, the original model doesn’t suffice, so relevance ranking is performed as a two-phase approach with 1) regular search 2) external model to re-rank the filtered items. Metrics such as click-through and conversion rates are associated with the users’ response to items served. The predicted selection rates that arise in real-time can be critical for optimal matching. For example, in recommender systems, predicted performance of a recommended item in a given context, also called response prediction, is often used in determining a set of recommendations to serve in relation to a given serving opportunity. Similar techniques are used in the advertising domain. To address this issue the authors have created ML-Scoring, an open source framework that tightly integrates machine learning models into a popular search engine (SOLR/Elasticsearch), replacing the default IR-based ranking function. A custom model is trained through either Weka or Spark and it is loaded as a plugin used at query time to compute custom scores.
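The two-phase approach described above (a regular search to filter candidates, then an external model to re-rank them) can be sketched as follows. This is not the ML-Scoring plugin or any Solr/Elasticsearch API: the document fields, the term-overlap retrieval, and `predicted_ctr` (a smoothed click-through rate standing in for a trained model) are hypothetical.

```python
def keyword_search(query, docs, limit=3):
    """Phase 1: retrieve candidates by counting query terms found in each doc."""
    terms = set(query.lower().split())
    scored = []
    for doc in docs:
        overlap = len(terms & set(doc["text"].lower().split()))
        if overlap:
            scored.append((overlap, doc))
    # Stable sort: documents with equal overlap keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:limit]]


def predicted_ctr(doc):
    """Phase 2 scorer: smoothed click-through rate (stand-in for a trained model)."""
    return (doc["clicks"] + 1) / (doc["views"] + 2)


def search_and_rerank(query, docs, limit=3):
    """Filter with keyword search, then re-rank the candidates by model score."""
    candidates = keyword_search(query, docs, limit)
    return sorted(candidates, key=predicted_ctr, reverse=True)


# Hypothetical catalogue with interaction counts.
DOCS = [
    {"id": 1, "text": "red running shoes", "clicks": 5, "views": 100},
    {"id": 2, "text": "blue running shoes", "clicks": 40, "views": 100},
    {"id": 3, "text": "running shoes sale red", "clicks": 10, "views": 50},
    {"id": 4, "text": "garden hose", "clicks": 90, "views": 100},
]

ranked_ids = [d["id"] for d in search_and_rerank("red running shoes", DOCS)]
print(ranked_ids)
```

The document with the highest click-through rate overall (id 4) never appears, because it is filtered out in the first phase. The model only reorders what the search step returns.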
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning (S. Diana Hu)
Search engines have focused on solving the document retrieval problem, so their scoring functions do not handle naturally non-traditional IR data types, such as numerical or categorical. Therefore, on domains beyond traditional search, scores representing strengths of associations or matches may vary widely. As such, the original model doesn’t suffice, so relevance ranking is performed as a two-phase approach with 1) regular search 2) external model to re-rank the filtered items. Metrics such as click-through and conversion rates are associated with the users’ response to items served. The predicted selection rates that arise in real-time can be critical for optimal matching. For example, in recommender systems, predicted performance of a recommended item in a given context, also called response prediction, is often used in determining a set of recommendations to serve in relation to a given serving opportunity. Similar techniques are used in the advertising domain. To address this issue the authors have created ML-Scoring, an open source framework that tightly integrates machine learning models into a popular search engine (SOLR/Elasticsearch), replacing the default IR-based ranking function. A custom model is trained through either Weka or Spark and it is loaded as a plugin used at query time to compute custom scores.
OpenML: Making machine learning research more reproducible (and easier) by bringing it online.
From the ICML 2017 Reproducibility in Machine Learning Workshop
Building machine learning systems remains something of an art, from gathering and transforming the right data to selecting and fine-tuning the most fitting modeling techniques. If we want to make machine learning more accessible and foster skillful use, we need novel ways to share and reuse findings, and streamline online collaboration. OpenML is an open science platform for machine learning, allowing anyone to easily share data sets, code, and experiments, and collaborate with people all over the world to build better models. It shows, for any known data set, which are the best models, who built them, and how to reproduce and reuse them in different ways. It is readily integrated into several machine learning environments, so that you can share results with the touch of a button or a line of code. As such, it enables large-scale, real-time collaboration, allowing anyone to explore, build on, and contribute to the combined knowledge of the field. Ultimately, this provides a wealth of information for a novel, data-driven approach to machine learning, where we learn from millions of previous experiments to either assist people while analyzing data (e.g., which modeling techniques will likely work well and why), or automate the process altogether.
Presentation on the OpenML initiative to enable open, collaborative machine learning during the data@Sheffield event. We discuss how data, machine learning algorithms and experiments can be analysed collaboratively by data scientists and domain scientists, as well as citizen scientists.
Tutorial given at the European Conference for Machine Learning (ECMLPKDD 2015). It covers OpenML, how to use it in your research, interfaces in Java, R, Python, use through machine learning tools such as WEKA and MOA. Also covers topics in open science and reproducible research.
OpenML Tutorial: Networked Science in Machine Learning (Joaquin Vanschoren)
Many sciences have made significant breakthroughs by adopting online tools that help organize, structure and mine information that is too detailed to be printed in journals. In this presentation, we introduce OpenML, a place for machine learning researchers to share and organize data in fine detail, so that they can collaborate more effectively with others to tackle harder problems. We discuss what benefits it brings for machine learning research, individual scientists, as well as students and practitioners. We show practical use cases and APIs for interacting with the system from machine learning software.
Tutorial on data science, what's it like to be a data scientist, big data, the data scientific method, probabilistic algorithms, map-reduce, sensor data analysis, visualization of twitter and foursquare feeds, open source tools (R, Python, NoSQL)
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN. (Sérgio Sacani)
The return of a sample of near-surface atmosphere from Mars would facilitate answers to several first-order science questions surrounding the formation and evolution of the planet. One of the important aspects of terrestrial planet formation in general is the role that primary atmospheres played in influencing the chemistry and structure of the planets and their antecedents. Studies of the martian atmosphere can be used to investigate the role of a primary atmosphere in its history. Atmosphere samples would also inform our understanding of the near-surface chemistry of the planet, and ultimately the prospects for life. High-precision isotopic analyses of constituent gases are needed to address these questions, requiring that the analyses are made on returned samples rather than in situ.
Nutraceutical market, scope and growth: Herbal drug technologyLokesh Patil
As consumer awareness of health and wellness rises, the nutraceutical market—which includes goods like functional meals, drinks, and dietary supplements that provide health advantages beyond basic nutrition—is growing significantly. As healthcare expenses rise, the population ages, and people want natural and preventative health solutions more and more, this industry is increasing quickly. Further driving market expansion are product formulation innovations and the use of cutting-edge technology for customized nutrition. With its worldwide reach, the nutraceutical industry is expected to keep growing and provide significant chances for research and investment in a number of categories, including vitamins, minerals, probiotics, and herbal supplements.
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...Sérgio Sacani
Since volcanic activity was first discovered on Io from Voyager images in 1979, changes
on Io’s surface have been monitored from both spacecraft and ground-based telescopes.
Here, we present the highest spatial resolution images of Io ever obtained from a groundbased telescope. These images, acquired by the SHARK-VIS instrument on the Large
Binocular Telescope, show evidence of a major resurfacing event on Io’s trailing hemisphere. When compared to the most recent spacecraft images, the SHARK-VIS images
show that a plume deposit from a powerful eruption at Pillan Patera has covered part
of the long-lived Pele plume deposit. Although this type of resurfacing event may be common on Io, few have been detected due to the rarity of spacecraft visits and the previously low spatial resolution available from Earth-based telescopes. The SHARK-VIS instrument ushers in a new era of high resolution imaging of Io’s surface using adaptive
optics at visible wavelengths.
Salas, V. (2024) "John of St. Thomas (Poinsot) on the Science of Sacred Theol...Studia Poinsotiana
I Introduction
II Subalternation and Theology
III Theology and Dogmatic Declarations
IV The Mixed Principles of Theology
V Virtual Revelation: The Unity of Theology
VI Theology as a Natural Science
VII Theology’s Certitude
VIII Conclusion
Notes
Bibliography
All the contents are fully attributable to the author, Doctor Victor Salas. Should you wish to get this text republished, get in touch with the author or the editorial committee of the Studia Poinsotiana. Insofar as possible, we will be happy to broker your contact.
What is greenhouse gasses and how many gasses are there to affect the Earth.moosaasad1975
What are greenhouse gasses how they affect the earth and its environment what is the future of the environment and earth how the weather and the climate effects.
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...Sérgio Sacani
We characterize the earliest galaxy population in the JADES Origins Field (JOF), the deepest
imaging field observed with JWST. We make use of the ancillary Hubble optical images (5 filters
spanning 0.4−0.9µm) and novel JWST images with 14 filters spanning 0.8−5µm, including 7 mediumband filters, and reaching total exposure times of up to 46 hours per filter. We combine all our data
at > 2.3µm to construct an ultradeep image, reaching as deep as ≈ 31.4 AB mag in the stack and
30.3-31.0 AB mag (5σ, r = 0.1” circular aperture) in individual filters. We measure photometric
redshifts and use robust selection criteria to identify a sample of eight galaxy candidates at redshifts
z = 11.5 − 15. These objects show compact half-light radii of R1/2 ∼ 50 − 200pc, stellar masses of
M⋆ ∼ 107−108M⊙, and star-formation rates of SFR ∼ 0.1−1 M⊙ yr−1
. Our search finds no candidates
at 15 < z < 20, placing upper limits at these redshifts. We develop a forward modeling approach to
infer the properties of the evolving luminosity function without binning in redshift or luminosity that
marginalizes over the photometric redshift uncertainty of our candidate galaxies and incorporates the
impact of non-detections. We find a z = 12 luminosity function in good agreement with prior results,
and that the luminosity function normalization and UV luminosity density decline by a factor of ∼ 2.5
from z = 12 to z = 14. We discuss the possible implications of our results in the context of theoretical
models for evolution of the dark matter halo mass function.
1. Learning to Learn
Joaquin Vanschoren
Eindhoven University of Technology & JADS
j.vanschoren@tue.nl
@joavanschoren
Berlin Machine Learning Meetup, Jan 2019
Slides: www.automl.org/events - Video: bit.ly/2QsnsUT
Book (open access): www.automl.org/book (or www.amazon.de)
2. Learning is a never-ending process
Humans don’t learn from scratch
3. Learning is a never-ending process
[Figure: Tasks 1 to 3, each with its own learning episode producing models and performance; meta-learning carries experience from one episode to the next.]
Humans learn across tasks
Why? Less trial-and-error, less data
4. Learning to learn
Inductive bias: assumptions added to the training data to learn effectively
If prior tasks are similar, we can transfer prior knowledge to new tasks
[Figure: learning on Tasks 1 to 3 yields models and performance; the resulting inductive bias (prior beliefs, constraints, model parameters, representations, training data) is passed on to learning on the new task.]
5. Meta-learning
Meta-learner learns a (base-)learning algorithm, based on meta-data
[Figure: learners applied to Tasks 1 to 3 produce models and performance, which together form the meta-data; the meta-learner uses this meta-data, plus additional experiments, to set up the base-learner for the new task.]
6. How?
1. Start with what generally works
2. Start from what likely works (based on partially similar tasks)
3. Start from previously trained models (for very similar tasks)
[Figure: learners run on Task 1 … Task j yield models and performance; the meta-learner uses these to build models for the new task.]
7. 1. Start with what generally works
Store and use meta-data:
- configurations λi: settings that uniquely define the model (hyperparameters, pipeline, neural architecture,…)
- performance Pi,j (e.g. accuracy) on specific tasks
[Figure: learners run on Task 1 … Task j yield configurations λi and performances Pi,j, which the meta-learner uses to build models for the new task.]
8. Rankings
• Build a global (multi-objective) ranking, recommend the top-K
• Can be used as a warm start for optimization techniques
• E.g. Bayesian optimization, evolutionary techniques,…
[Figure: performances Pi,j of (discrete) configurations λi on prior tasks are combined into a global, task-independent ranking (1. λa, 2. λb, 3. λc, 4. λd, 5. λe, 6. …); the top configurations λa..k warm-start the meta-learner on the new task. Figure annotations: (discrete), (multi-objective).]
Leite et al. 2012
Abdulrahman et al. 2018
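A global ranking can be sketched as follows. This is a minimal illustration that assumes ranking by average rank across tasks, a single objective (higher score is better) and every configuration evaluated on every task; the function names and the meta-data values are hypothetical and not taken from the cited papers.

```python
from statistics import mean

def global_ranking(performance):
    """Order configurations by their average rank across tasks (best first).

    performance: dict task -> dict config -> score (higher is better).
    """
    configs = sorted(next(iter(performance.values())))
    ranks = {c: [] for c in configs}
    for scores in performance.values():
        # Rank 1 goes to the best-scoring configuration on this task.
        ordered = sorted(configs, key=lambda c: scores[c], reverse=True)
        for position, config in enumerate(ordered, start=1):
            ranks[config].append(position)
    avg_rank = {c: mean(r) for c, r in ranks.items()}
    return sorted(configs, key=lambda c: avg_rank[c])

def top_k(performance, k):
    """Recommend the k best-ranked configurations, e.g. as a warm start."""
    return global_ranking(performance)[:k]

# Hypothetical meta-data: accuracy P[i, j] of three configurations on three tasks.
meta_data = {
    "task_1": {"lambda_a": 0.90, "lambda_b": 0.85, "lambda_c": 0.70},
    "task_2": {"lambda_a": 0.80, "lambda_b": 0.82, "lambda_c": 0.60},
    "task_3": {"lambda_a": 0.75, "lambda_b": 0.70, "lambda_c": 0.72},
}
warm_start = top_k(meta_data, k=2)
```

The returned list can seed the initial population or initial design of whichever optimizer is run on the new task.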
9. To tune or not to tune?
• Functional ANOVA 1
Select hyperparameters that cause variance in the evaluations.
• Tunability 2
Learn good defaults, measure improvement from tuning over defaults
[Figure: performances Pi,j of configurations λi on prior tasks yield the hyperparameter importance of λ1 … λ4, which the meta-learner uses as constraints and priors on the new task.]
1 van Rijn & Hutter 2018
2 Probst et al. 2018
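The tunability idea can be illustrated with a small sketch. It assumes the shared default is simply the configuration with the best mean score over prior tasks, and that tunability on a task is the gap between the best configuration and that default; both choices, the function names and the data are illustrative assumptions, not the exact procedure of Probst et al.

```python
from statistics import mean

def best_default(performance):
    """Configuration with the best mean score over all prior tasks."""
    configs = list(next(iter(performance.values())))
    return max(configs, key=lambda c: mean(t[c] for t in performance.values()))

def tunability(performance):
    """Per-task gain of the best configuration over the shared default."""
    default = best_default(performance)
    return {
        task: max(scores.values()) - scores[default]
        for task, scores in performance.items()
    }

# Hypothetical meta-data: accuracy of three configurations on three tasks.
meta_data = {
    "task_1": {"lambda_a": 0.90, "lambda_b": 0.85, "lambda_c": 0.70},
    "task_2": {"lambda_a": 0.80, "lambda_b": 0.82, "lambda_c": 0.60},
    "task_3": {"lambda_a": 0.75, "lambda_b": 0.70, "lambda_c": 0.72},
}
default = best_default(meta_data)
gains = tunability(meta_data)
```

A gain close to zero on most tasks suggests the default is usually good enough and tuning may not be worth the cost.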
10. To tune or not to tune?
• Search space pruning
Exclude regions yielding bad performance on (similar) tasks
[Figure: performances Pi,j of configurations λi on prior tasks give constraints on the (λ1, λ2) search space, shown against performance P, for the meta-learner on the new task.]
Wistuba et al. 2015
11. Learning task similarity
• Experiments on the new task can tell us how it is similar to previous tasks
• Tasks are similar if observed performance of configurations is similar
• Use this to recommend new experiments
[Figure: a (discrete) configuration λc is evaluated on the new task; prior task j counts as similar when Pc,new ≅ Pc,j.]
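One way to turn "similar observed performance" into a number is sketched below. It assumes the mean absolute difference over the configurations evaluated on both tasks as the distance; other measures (e.g. rank correlation) would fit the slide equally well. Names and values are hypothetical.

```python
def task_distance(new_scores, prior_scores):
    """Mean absolute performance gap over configurations seen on both tasks."""
    shared = [c for c in new_scores if c in prior_scores]
    if not shared:
        raise ValueError("no configuration was evaluated on both tasks")
    return sum(abs(new_scores[c] - prior_scores[c]) for c in shared) / len(shared)

def most_similar_task(new_scores, prior_tasks):
    """Prior task whose performances best match the new task's observations."""
    return min(prior_tasks, key=lambda t: task_distance(new_scores, prior_tasks[t]))

# Hypothetical meta-data from prior tasks and two experiments on the new task.
prior_tasks = {
    "task_1": {"lambda_a": 0.90, "lambda_b": 0.85, "lambda_c": 0.70},
    "task_2": {"lambda_a": 0.80, "lambda_b": 0.82, "lambda_c": 0.60},
    "task_3": {"lambda_a": 0.75, "lambda_b": 0.70, "lambda_c": 0.72},
}
new_task = {"lambda_a": 0.78, "lambda_c": 0.62}
closest = most_similar_task(new_task, prior_tasks)
```

Configurations that did well on the closest task but are still untested on the new task are natural next experiments.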
12. Active testing
• Tournament-style selection, warm-start with overall best config λbest
• Next candidate λc : the one that beats current λbest on similar tasks
[Figure: from the (discrete) configurations λi with performances Pi,j on prior tasks, the meta-learner selects λc > λbest on similar tasks and evaluates it on the new task.]
Leite et al. 2012
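The tournament loop can be sketched roughly as below. This is a simplified illustration, not the exact algorithm of Leite et al.: it assumes task similarity is the inverse of the mean absolute performance gap on configurations observed so far, and that a challenger's score is its similarity-weighted gain over the current best on prior tasks. The `evaluate` callback and all data are hypothetical.

```python
from statistics import mean

def active_testing(prior, evaluate, budget):
    """prior: task -> {config: score}; evaluate: config -> score on the new task."""
    configs = list(next(iter(prior.values())))
    # Warm start: configuration with the best mean score over prior tasks.
    best = max(configs, key=lambda c: mean(t[c] for t in prior.values()))
    observed = {best: evaluate(best)}
    for _ in range(budget - 1):
        untested = [c for c in configs if c not in observed]
        if not untested:
            break
        # Weight each prior task by how well it matches the observations so far.
        weights = {
            name: 1.0 / (1e-6 + mean(abs(observed[c] - scores[c]) for c in observed))
            for name, scores in prior.items()
        }

        def gain(candidate):
            # Similarity-weighted amount by which the candidate beats the best.
            return sum(
                w * max(0.0, prior[name][candidate] - prior[name][best])
                for name, w in weights.items()
            )

        challenger = max(untested, key=gain)
        if gain(challenger) <= 0.0:
            break  # no candidate beats the current best on any prior task
        observed[challenger] = evaluate(challenger)
        if observed[challenger] > observed[best]:
            best = challenger
    return best, observed

# Hypothetical meta-data and a stand-in for running experiments on the new task.
prior = {
    "task_1": {"lambda_a": 0.90, "lambda_b": 0.85, "lambda_c": 0.70},
    "task_2": {"lambda_a": 0.80, "lambda_b": 0.82, "lambda_c": 0.60},
    "task_3": {"lambda_a": 0.75, "lambda_b": 0.70, "lambda_c": 0.72},
}
new_task_scores = {"lambda_a": 0.79, "lambda_b": 0.83, "lambda_c": 0.61}
best, observed = active_testing(prior, new_task_scores.__getitem__, budget=2)
```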
13. Bayesian optimization
• Consider space of all configuration options (e.g. all possible neural nets or pipelines)
• Surrogate model: probabilistic regression model of configuration performance
• Acquisition function: selects next configuration to try (exploration-exploitation)
[Figure: the surrogate model of performance P over configurations λ ∈ Λ, with the acquisition function plotted below it.]
Rasmussen 2014
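The surrogate/acquisition loop can be sketched in plain Python. This is a toy illustration under several assumptions that are not in the slides: a single scalar hyperparameter, a zero-mean Gaussian process with a fixed RBF kernel as surrogate, expected improvement as acquisition function, and a made-up objective standing in for "train a model and measure performance".

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalar configurations."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Posterior mean and standard deviation of the GP surrogate at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))
    k_star = [rbf(a, x_new) for a in xs]
    mu = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(rbf(x_new, x_new) - sum(t * t for t in v), 1e-12)
    return mu, math.sqrt(var)

def expected_improvement(mu, sigma, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

def bayesian_optimization(objective, candidates, initial, n_iter):
    xs = list(initial)
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        best = max(ys)
        remaining = [c for c in candidates if c not in xs]
        # Acquisition step: pick the untested candidate with the highest EI.
        _, nxt = max((expected_improvement(*gp_posterior(xs, ys, c), best), c)
                     for c in remaining)
        xs.append(nxt)
        ys.append(objective(nxt))
    i = ys.index(max(ys))
    return xs[i], ys[i]

# Hypothetical objective: performance peaks at lambda = 0.3.
objective = lambda lam: -(lam - 0.3) ** 2
grid = [i / 100 for i in range(101)]
lam, value = bayesian_optimization(objective, grid, [0.0, 1.0], n_iter=8)
```

In practice a library implementation with learned kernel hyperparameters would replace the hand-written Gaussian process; the loop structure stays the same.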
16. Surrogate model transfer
• If task j is similar to the new task, its surrogate model Sj will likely transfer well
• Sum up all Sj predictions, weighted by task similarity (as in active testing) 1
• Build combined Gaussian process, weighted by current performance on new task 2
[Figure: one surrogate model Sj is fitted per task tj on its (λi, Pi,j) data; the surrogates S1, S2, S3 are combined as S = ∑ wj Sj for the new task.]
1 Wistuba et al. 2018
2 Feurer et al. 2018
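The weighted sum S = ∑ wj Sj can be written as a small sketch. It assumes each per-task surrogate is a callable that predicts performance for a configuration, and it normalizes the weights so that they sum to one, which is an added assumption. The two quadratic surrogates and the weights are hypothetical.

```python
def combine_surrogates(surrogates, weights):
    """Return S(config) = sum_j w_j * S_j(config), with weights normalized to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")

    def combined(config):
        return sum(weights[name] * surrogates[name](config)
                   for name in surrogates) / total

    return combined

# Hypothetical per-task surrogates over a single hyperparameter in [0, 1].
surrogates = {
    "S1": lambda lam: 1.0 - (lam - 0.2) ** 2,
    "S2": lambda lam: 1.0 - (lam - 0.8) ** 2,
}
# Hypothetical similarity weights: task 1 looks three times as similar as task 2.
weights = {"S1": 3.0, "S2": 1.0}
S = combine_surrogates(surrogates, weights)
prediction = S(0.2)
```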
17. Multi-task Bayesian optimization
• Multi-task Gaussian processes: train surrogate model on t tasks simultaneously 1
• If tasks are similar: transfers useful info
• Not very scalable
• Bayesian Neural Networks as surrogate model 2
• Multi-task, more scalable
• Stacking Gaussian Process regressors (Google Vizier) 3
• Sequential tasks, each similar to the previous one
• Transfers a prior based on residuals of previous GP
[Figure: independent GP predictions compared with multi-task GP predictions.]
1 Swersky et al. 2013
2 Springenberg et al. 2016
3 Golovin et al. 2017
18. More scalable variants
• Bayesian linear regression (BLR) surrogate model on every task
• Use neural net to learn a suitable basis expansion ϕz(λ) for all tasks
• Scales linearly in # observations, transfers info on configuration space
[Figure: a neural net producing ϕz(λ) is pre-trained (warm-start) on the (λi, Pi,j) data of prior tasks; on the new task, a BLR surrogate on ϕz(λ), with its BLR hyperparameters, drives Bayesian optimization over λi.]
Perrone et al. 2018
19. 2. Start from what likely works (based on similar tasks)
Meta-features: measurable properties of the tasks
(number of instances and features, class imbalance, feature skewness,…)
[Figure: Task 1 … Task N provide configurations λi, performances Pi,j and meta-features mj; the meta-learner asks which prior tasks have meta-features mj similar to those of the new task.]
20. Meta-features
• Hand-crafted (interpretable) meta-features 1
• Number of instances, features, classes, missing values, outliers,…
• Statistical: skewness, kurtosis, correlation, covariance, sparsity, variance,…
• Information-theoretic: class entropy, mutual information, noise-signal ratio,…
• Model-based: properties of simple models trained on the task
• Landmarkers: performance of fast algorithms trained on the task
• Domain specific task properties
1 Vanschoren 2018
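A few of the simple and information-theoretic meta-features can be computed as in the sketch below. It assumes a dataset given as a list of rows with `None` marking missing values; the function name, the chosen subset of meta-features and the toy data are illustrative only.

```python
import math
from collections import Counter

def simple_meta_features(X, y):
    """Compute a handful of hand-crafted meta-features for a classification task."""
    n_instances = len(X)
    n_features = len(X[0])
    class_counts = Counter(y)
    # Class entropy in bits: 0 for a single class, log2(k) for k balanced classes.
    entropy = -sum((c / n_instances) * math.log2(c / n_instances)
                   for c in class_counts.values())
    n_missing = sum(value is None for row in X for value in row)
    return {
        "n_instances": n_instances,
        "n_features": n_features,
        "n_classes": len(class_counts),
        "class_entropy": entropy,
        "missing_fraction": n_missing / (n_instances * n_features),
    }

# Hypothetical toy dataset: 4 instances, 2 features, one missing value.
X = [[1.0, 2.0], [0.5, None], [3.0, 1.5], [2.0, 2.5]]
y = ["a", "a", "b", "b"]
m = simple_meta_features(X, y)
```

The resulting dictionary plays the role of the meta-feature vector mj for one task.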
21. Meta-features
• Learning a joint task representation
• Deep metric learning: learn a representation hmf using a ground truth distance
• With Siamese Network: similar task, similar representation
• Can be used to recommend neural architectures given task similarity
Kim et al. 2017
22. Warm-starting from similar tasks
• Find k most similar tasks, warm-start search with best λi
• Auto-sklearn: Bayesian optimization (SMAC)
• Meta-learning yields better models, faster
• Winner of AutoML Challenges
[Figure: meta-features mj select the most similar prior tasks; the best λi on similar tasks (λ1..k) warm-start Bayesian optimization on the new task, shown as points λ1 to λ4 on a plot of P against λ.]
Feurer et al. 2015
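Selecting the warm-start configurations can be sketched as a nearest-neighbour lookup in meta-feature space. The sketch assumes meta-features already scaled to comparable ranges and Euclidean distance as the similarity measure; it is not Auto-sklearn's actual code, and all names and values are hypothetical.

```python
import math

def k_nearest_tasks(new_mf, prior_mf, k):
    """Names of the k prior tasks whose meta-features are closest to new_mf."""
    return sorted(prior_mf, key=lambda t: math.dist(new_mf, prior_mf[t]))[:k]

def warm_start_configs(new_mf, prior_mf, best_config, k):
    """Best configurations of the k most similar tasks, without duplicates."""
    selected = []
    for task in k_nearest_tasks(new_mf, prior_mf, k):
        if best_config[task] not in selected:
            selected.append(best_config[task])
    return selected

# Hypothetical (normalized) meta-features and best-known configuration per task.
prior_mf = {"task_1": (0.1, 0.2), "task_2": (0.9, 0.8), "task_3": (0.2, 0.1)}
best_config = {"task_1": "lambda_a", "task_2": "lambda_b", "task_3": "lambda_a"}
new_mf = (0.15, 0.15)
initial_design = warm_start_configs(new_mf, prior_mf, best_config, k=2)
```

The returned configurations would be evaluated first, before the optimizer starts proposing its own candidates.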
23. Meta-models
• Learn direct mapping between meta-features and Pi,j
• Zero-shot meta-models: predict best λi given meta-features 1
• Ranking models: return ranking λ1..k 2
• Predict which algorithms / configurations to consider / tune 3
• Predict performance / runtime for given θi and task 4
• Can be integrated in larger AutoML systems: warm start, guide search,…
[Figure: four meta-learners: mj → λbest; mj → λ1..k; mj → Λ; (mj, λi) → Pij.]
1 Brazdil et al. 2009, Lemke et al. 2015
2 Sun and Pfahringer 2013, Pinto et al. 2017
3 Sanders and Giraud-Carrier 2017
4 Yang et al. 2018
24. Learning Pipelines / Architectures
• Compositionality: the learning process can be broken down into smaller tasks
• Easier to learn, more transferable, more robust
• Pipelines are one way of doing this, but how to control the search space?
• Planning techniques (e.g. Hierarchical Task Planning) 2
• Impose a fixed structure or grammar on the pipeline 1
• Neural architecture search
• Usually defines fixed search space 3
• Very little meta-learning (yet)
• RL controller transfer 4
1 Feurer et al. 2015
2 Mohr et al. 2018
3 Zoph et al. 2018
4 Wong et al. 2018
25. Evolving pipelines
• Start from simple pipelines
• Evolve more complex ones if needed
• Reuse pipelines that do specific things
• Mechanisms:
• Cross-over: reuse partial pipelines
• Mutation: change structure, tuning
• Approaches:
• TPOT: Tree-based pipelines 1
• GAMA: asynchronous evolution 2
• RECIPE: grammar-based 3
• Meta-learning:
• Warm-starting, Meta-models 2
1 Olson et al. 2017
2 Gijsbers et al. 2018
3 De Sa et al. 2017
26. Learning to learn through self-play
• Build pipelines by inserting, deleting, replacing components (actions)
• Neural network (LSTM) receives task meta-features, pipelines and evaluations
• Predicts pipeline performance and action probabilities
• Monte Carlo Tree Search builds pipelines, runs simulations against LSTM
[Figure: the meta-learner receives meta-features mj and pipelines λi and, through self-play, builds models for the new task and observes their performance.]
Drori et al. 2017
27. 3. Start with previously trained models
Models trained on similar tasks (model parameters, features,…)
[Figure: Task 1 … Task N, intrinsically (very) similar to the new task (e.g. shared representation), provide configurations λi, performances Pi,j and model parameters θk to the meta-learner.]
28. Transfer learning
• For neural networks, both structure and weights can be transferred
• Features and initializations learned from:
• Large image datasets (e.g. ImageNet) 1
• Large text corpora (e.g. Wikipedia) 2
• Fails if tasks are not similar enough 3
Feature extraction: remove last layers, use output as features; if task is quite different, remove more layers
End-to-end tuning: train from initialized weights
Fine-tuning: unfreeze last layers, tune on new task
[Figure: a convnet pre-trained on source tasks, with its filters, is reused in three ways that combine frozen, pre-trained and new layers. Panel labels: small target task; large, similar; large, different.]
1 Razavian et al. 2014
2 Mikolov et al. 2013
3 Yosinski et al. 2014
29. Learning to learn by gradient descent
• Our brains probably don’t do backprop, replace it with:
• Simple parametric (bio-inspired) rule to update weights 1
• Single-layer neural network to learn weight updates 2
• Learn parameters across tasks, by gradient descent (meta-gradient)
[Figure: the meta-learner starts from λinit and learns the parameters λi of the learner's weight-update rule by gradient descent on task performance (meta-gradient). The update is shown as Δθi = η ( … ), with terms labelled learning rate, presynaptic activity and reinforcing signal; the Bengio et al. and Runarsson and Jonsson variants both produce Δθi.]
1 Bengio et al. 1995
2 Runarsson and Jonsson 2000
30. Learning to learn gradient descent by gradient descent
• Replace backprop with a recurrent neural net (LSTM) 1, not so scalable
• Use a coordinatewise LSTM [m] for scalability/flexibility (cfr. ADAM, RMSprop) 2
• Optimizee: receives weight update gt from optimizer
• Optimizer: receives gradient estimate ∇t from optimizee
• Learns how to do gradient descent across tasks
[Figure: a single meta-model network (the LSTM optimizer, with its hidden state) updates the optimizee weights of the model on a new task; LSTM parameters are shared for all θ.]
1 Hochreiter 2001
2 Andrychowicz et al. 2016
31. Learning to learn gradient descent by gradient descent
• Left: optimizer and optimizee trained to do style transfer
• Right: optimizer solves similar tasks (different style, content and resolution) without any more training data
[Figure: style transfer results, left and right panels.]
1 Hochreiter 2001
2 Andrychowicz et al. 2016
32. Few-shot learning
• Learn how to learn from few examples (given similar tasks)
• Meta-learner must learn how to train a base-learner based on prior experience
• Parameterize base-learner model and learn the parameters θi
$$\mathrm{Cost}(\theta_i) = \frac{1}{|T_{test}|} \sum_{t \in T_{test}} \mathrm{loss}(\theta_i, t)$$
[Figure (image: Hugo Larochelle): a 1-shot, 5-class example with new classes at test time. For each task Tj, the meta-model (with parameters λk and performance Pi,j) updates model M from θi to θi+1 using Ttrain (X_train, y_train); the model is then evaluated on Ttest (X_test, y_test), giving Pi+1,test.]
33. Few-shot learning: approaches
• Existing algorithm as meta-learner:
• LSTM + gradient descent (Ravi and Larochelle 2017)
• Learn θinit + gradient descent (Finn et al. 2017)
• kNN-like: Memory + similarity (Vinyals et al. 2016)
• Learn embedding + classifier (Snell et al. 2017)
• …
• Black-box meta-learner
• Neural Turing machine (with memory) (Santoro et al. 2016)
• Neural attentive learner (Mishra et al. 2018)
• …
$$\mathrm{Cost}(\theta_i) = \frac{1}{|T_{test}|} \sum_{t \in T_{test}} \mathrm{loss}(\theta_i, t)$$
[Figure: for each task Tj, the meta-model (with parameters λk and performance Pi,j) updates model M from θi to θi+1, which is evaluated on Ttest to give Pi+1,test.]
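The "learn θinit + gradient descent" approach (Finn et al. 2017) can be illustrated on a deliberately tiny problem. The sketch assumes scalar parameters, tasks with loss 0.5 (θ − c)² for a task-specific target c, one inner gradient step, and the same loss for adaptation and evaluation (a real few-shot setup would use separate Ttrain and Ttest). These choices make the meta-gradient available in closed form; names and values are hypothetical.

```python
def maml_theta_init(task_targets, inner_lr=0.1, outer_lr=0.5, steps=200):
    """Learn an initialization that performs well after one gradient step per task."""
    theta = 0.0
    for _ in range(steps):
        meta_grad = 0.0
        for c in task_targets:
            # Task loss 0.5 * (theta - c)^2 has gradient (theta - c).
            adapted = theta - inner_lr * (theta - c)  # one inner gradient step
            # Gradient of the post-adaptation loss w.r.t. the initialization,
            # by the chain rule through the inner step: d(adapted)/d(theta) = 1 - inner_lr.
            meta_grad += (adapted - c) * (1.0 - inner_lr)
        # Outer (meta) gradient step on the average over tasks.
        theta -= outer_lr * meta_grad / len(task_targets)
    return theta

# Hypothetical task distribution: three tasks with different optimal parameters.
theta_init = maml_theta_init([1.0, 2.0, 6.0])
```

For this quadratic family the learned initialization converges to the mean of the task targets, the point from which a single gradient step helps most on average.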
34. LSTM meta-learner + gradient descent
• Gradient descent update θt is similar to LSTM cell state update ct
• Hence, training a meta-learner LSTM yields an update rule for training M
• Start from initial θ0, train model on first batch, get gradient and loss update
• Predict θt+1, continue to t=T, get cost, backpropagate to learn LSTM weights, optimal θ0
$$\mathrm{Cost}(\theta_T) = \frac{1}{|T_{test}|} \sum_{t \in T_{test}} \mathrm{loss}(\theta_T, t)$$
[Figure: the meta-learner LSTM is unrolled alongside model M over the training batches (Train), with forget and input gates marked, and the final model is evaluated on Test.]
Ravi and Larochelle 2017
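The similarity between the two updates can be made explicit. The notation below (forget gate f_t, input gate i_t, candidate state c̃_t, learning rate α_t, batch loss L_t) is the standard LSTM notation and is assumed here rather than taken from the slide.

```latex
% Gradient descent step on the model parameters
\theta_t = \theta_{t-1} - \alpha_t \, \nabla_{\theta_{t-1}} \mathcal{L}_t

% LSTM cell state update
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t

% The two coincide when
f_t = 1, \qquad c_{t-1} = \theta_{t-1}, \qquad i_t = \alpha_t, \qquad
\tilde{c}_t = -\nabla_{\theta_{t-1}} \mathcal{L}_t
```

Letting the meta-learner choose f_t and i_t at every step, instead of fixing them, is what turns the LSTM into a learned update rule.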
36. Learning to reinforcement learn
• Humans often learn to play new games much faster than RL techniques do
• Reinforcement learning is very suited for learning-to-learn:
• Build a learner, then use performance of that learner as a reward
• Learning to reinforcement learn 1,2
• Use RNN-based deep RL to train a recurrent network on many tasks
• Learns to implement a ‘fast’ RL agent, encoded in its weights
[Figure: a meta-RL algorithm trains policy πθ on many environments, using performance as reward; the resulting fast RL agent applies policy πθ to a similar environment.]
1 Duan et al. 2017
2 Wang et al. 2017
3 Duan et al. 2017
37. Learning to reinforcement learn
• Also works for few-shot learning 3
• Condition on observation + upcoming demonstration
• You don’t know what someone is trying to teach you, but you prepare for the lesson
1 Duan et al. 2017
2 Wang et al. 2017
3 Duan et al. 2017
38. Learning to learn more tasks
• Active learning
• Deep network (learns representation) + policy network
• Receives state and reward, says which points to query next
• Density estimation
• Learn distribution over small set of images, can generate new ones
• Uses a MAML-based few-shot learner
• Matrix factorization
• Deep learning architecture that makes recommendations
• Meta-learner learns how to adjust biases for each user (task)
• Replace hand-crafted algorithms by learned ones.
• Look at problems through a meta-learning lens!
Pang et al. 2018
Reed et al. 2017
Vartak et al. 2017
40. Towards human-like learning to learn
• Learning-to-learn gives humans a significant advantage
• Learning how to learn any task empowers us far beyond knowing how to learn specific tasks.
• It is a universal aspect of life, and how it evolves
• Very exciting field with many unexplored possibilities
• Many aspects not understood (e.g. task similarity), need more experiments.
• Challenge:
• Build learners that never stop learning, that learn from each other
• Build a global memory for learning systems to learn from
• Let them explore / experiment by themselves
41. Thank you!
special thanks to
Pavel Brazdil, Matthias Feurer, Frank Hutter, Erin Grant,
Hugo Larochelle, Raghu Rajan, Jan van Rijn, Jane Wang
more to learn
http://www.automl.org/book/
Chapter 2: Meta-Learning
Never stop learning