In this talk I introduce the topic of Online Machine Learning, which deals with techniques for doing machine learning in an online setting, i.e. where you train your model a few examples at a time, rather than using the full dataset (off-line learning).
Introductions to Online Machine Learning AlgorithmsDataWorks Summit
Online algorithms are an increasingly popular yet often misunderstood branch of machine learning, where model parameter estimates are updated for each new piece of information received. While mini-batch methods have often been mislabeled as 'streaming machine learning', true online methods have different implementations and goals. This talk will explain key differences between online and offline machine learning, introduce many common online algorithms, and show how online algorithms can be analyzed. An example using Apache Flink to detect trends on Twitter will be presented. Attendees will come away from this talk with a better understanding of the challenges and opportunities of working with online algorithms and of how they can begin implementing their own algorithms in Apache Flink.
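To make the "updated for each new piece of information" idea concrete, here is a minimal sketch in plain Python. It is not taken from the talk and does not use Apache Flink; the class name, the synthetic `example_stream` generator, and the learning rate are all assumptions chosen for illustration. It fits a linear model with one stochastic-gradient step per example, so no example is stored after it has been used.

```python
import random


class OnlineLinearRegression:
    """Linear model updated one example at a time (illustrative sketch)."""

    def __init__(self, n_features, learning_rate=0.05):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias

    def update(self, x, y):
        """Take one gradient step on the squared error of a single example."""
        error = self.predict(x) - y
        self.weights = [w - self.learning_rate * error * xi
                        for w, xi in zip(self.weights, x)]
        self.bias -= self.learning_rate * error
        return error


def example_stream(n, seed=0):
    """Hypothetical data source: yields (x, y) with y = 2*x0 - 3*x1 + 1, no noise."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, 2.0 * x[0] - 3.0 * x[1] + 1.0


model = OnlineLinearRegression(n_features=2)
for x, y in example_stream(5000):
    model.update(x, y)  # each example is seen once, then discarded
```

After the stream is consumed, `model.weights` and `model.bias` should be close to the generating coefficients. An offline learner would instead need the whole dataset in memory before fitting.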
A summary of the Hidden Markov Model (HMM), one of the early models for solving sequence labeling problems. The material is written to stay within the basic concepts needed to explain HMMs.
- Markov chain
- Markov assumption
- Hidden markov model
- HMM training: forward-backward algorithm
- HMM likelihood computation
- HMM decoding: viterbi algorithm
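As an illustration of the decoding topic listed above, the following is a minimal sketch of the Viterbi algorithm in plain Python. It is not taken from the deck: the function signature and the two-state toy model (states, probabilities, observations) are hypothetical values chosen only to exercise the code.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_path, probability) for the most likely hidden-state sequence."""
    # best[t][s] = probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor state that maximizes the path probability
            prev, prob = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = prob * emit_p[s][observations[t]]
            back[t][s] = prev
    # Trace back from the most probable final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical toy model, for illustration only
states = ("Healthy", "Fever")
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}

path, prob = viterbi(("normal", "cold", "dizzy"), states, start_p, trans_p, emit_p)
# path == ["Healthy", "Healthy", "Fever"], prob is about 0.01512
```

For long sequences a practical implementation would work with log-probabilities to avoid numerical underflow; this sketch multiplies raw probabilities to keep the recurrence easy to read.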
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...Simplilearn
This presentation on "Supervised and Unsupervised Learning" will help you understand what machine learning is, the types of machine learning, what supervised machine learning is and its types, what unsupervised learning is and its types, and the differences between supervised and unsupervised machine learning. In supervised learning, the model learns from labeled data, whereas in unsupervised learning, the model trains itself on unlabeled data. Now, let us get started and understand supervised and unsupervised learning and how they differ from each other.
Below are the topics explained in this supervised and unsupervised learning in Machine Learning presentation:
1. What is Machine Learning
- Types of Machine Learning
- Supervised Learning
- Unsupervised Learning
2. Supervised Learning
- Types of Supervised Learning
3. Unsupervised Learning
- Types of Unsupervised Learning
About Simplilearn Machine Learning course:
A form of artificial intelligence, Machine Learning is revolutionizing the world of computing as well as all people’s digital interactions. Machine Learning powers such innovative automated technologies as recommendation engines, facial recognition, fraud protection and even self-driving cars. This Machine Learning course prepares engineers, data scientists and other professionals with the knowledge and hands-on skills required for certification and job competency in Machine Learning.
Why learn Machine Learning?
Machine Learning is taking over the world, and with that, there is a growing need among companies for professionals who know the ins and outs of Machine Learning.
The Machine Learning market size is expected to grow from USD 1.03 Billion in 2016 to USD 8.81 Billion by 2022, at a Compound Annual Growth Rate (CAGR) of 44.1% during the forecast period.
By the end of this Machine Learning course, you will be able to:
1. Master the concepts of supervised, unsupervised and reinforcement learning and modeling.
2. Gain practical mastery over principles, algorithms, and applications of Machine Learning through a hands-on approach which includes working on 28 projects and one capstone project.
3. Acquire a thorough knowledge of the mathematical and heuristic aspects of Machine Learning.
4. Understand the concepts and operation of support vector machines, kernel SVM, naive Bayes, decision tree classifier, random forest classifier, logistic regression, K-nearest neighbors, K-means clustering and more.
5. Be able to model a wide variety of robust Machine Learning algorithms including deep learning, clustering, and recommendation systems
Learn more at: https://www.simplilearn.com/
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...Simplilearn
This presentation on Machine Learning will help you understand why Machine Learning came into the picture, what Machine Learning is, the types of Machine Learning, and Machine Learning algorithms, with a detailed explanation of linear regression, decision trees, and support vector machines. At the end you will also see a use case implementation where we classify whether a recipe is for a cupcake or a muffin using the SVM algorithm. Machine learning is a core sub-area of artificial intelligence; it enables computers to learn without being explicitly programmed. When exposed to new data, these computer programs are able to learn, grow, change, and develop by themselves. Simply put, the iterative aspect of machine learning is the ability to adapt to new data independently. Now, let us get started with this Machine Learning presentation and understand what it is and why it matters.
The following topics are explained in this Machine Learning presentation:
1. Why Machine Learning?
2. What is Machine Learning?
3. Types of Machine Learning
4. Machine Learning Algorithms
- Linear Regression
- Decision Trees
- Support Vector Machine
5. Use case: Classify whether a recipe is of a cupcake or a muffin using SVM
We recommend this Machine Learning training course for the following professionals in particular:
1. Developers aspiring to be a data scientist or Machine Learning engineer
2. Information architects who want to gain expertise in Machine Learning algorithms
3. Analytics professionals who want to work in Machine Learning or artificial intelligence
4. Graduates looking to build a career in data science and Machine Learning
Learn more at: https://www.simplilearn.com/
Index:
History of Machine Learning.
What is Machine Learning.
Why ML.
Learning System Model.
Training and Testing.
Performance.
Algorithms.
Machine Learning Structure.
Application.
Conclusion.
Artificial Intelligence, Machine Learning and Deep LearningSujit Pal
Slides for the talk Abhishek Sharma and I gave at the Gennovation tech talks (https://gennovationtalks.com/) at Genesis. The talk was part of outreach for the Deep Learning Enthusiasts meetup group in San Francisco. My part of the talk is covered in slides 19-34.
Introduction to machine learning. Basics of machine learning. Overview of machine learning. Linear regression. logistic regression. cost function. Gradient descent. sensitivity, specificity. model selection.
The world today is evolving and so are the needs and requirements of people. Furthermore, we are witnessing a fourth industrial revolution of data.
Machine Learning has revolutionized industries such as medicine, healthcare, manufacturing, and banking, among others. As a result, it has become an essential part of modern industry.
Machine Learning Tutorial Part - 2 | Machine Learning Tutorial For Beginners ...Simplilearn
This presentation on Machine Learning will help you understand what clustering is, K-Means clustering, a flowchart for K-Means clustering along with a demo that clusters cars into brands, what logistic regression is, the logistic regression curve, the sigmoid function, and a demo on how to classify a tumor as malignant or benign based on its features. Machine Learning algorithms can help computers play chess, perform surgeries, and get smarter and more personal. K-Means and logistic regression are two widely used Machine Learning algorithms, which we are going to discuss in this video. Logistic regression is used to estimate discrete values (usually binary values like 0/1) from a set of independent variables. It helps to predict the probability of an event by fitting data to a logit function. It is also called logit regression. K-means clustering is an unsupervised learning algorithm: unlike in supervised learning, you don't have labeled data. You have a set of data that you want to group into clusters, so that objects that are similar in nature and characteristics are put together. This is what k-means clustering is all about. Now, let us get started and understand K-Means clustering and logistic regression in detail.
The following topics are explained in this Machine Learning tutorial part 2:
1. Clustering
- What is clustering?
- K-Means clustering
- Flowchart to understand K-Means clustering
- Demo - Clustering of cars based on brands
2. Logistic regression
- What is logistic regression?
- Logistic regression curve & Sigmoid function
- Demo - Classify a tumor as malignant or benign based on features
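The grouping idea behind k-means described above can be sketched in a few lines of plain Python. This is an illustrative version of Lloyd's algorithm, not the demo from the presentation; the function name, the fixed iteration count, the starting centroids, and the six sample points are all assumptions.

```python
def kmeans(points, centroids, n_iter=10):
    """Lloyd's algorithm on 2-D points, starting from the given centroids."""
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical sample data: two visibly separate groups of points
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

With these inputs the two groups separate after the first pass, and later iterations leave the centroids unchanged. Real implementations usually pick starting centroids at random and stop when assignments no longer change, rather than running a fixed number of passes.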
Deep Reinforcement Learning talk at PI School, covering the following topics:
1- Deep Reinforcement Learning
2- Q-Learning
3- Deep Q-Learning (DQN)
4- Google DeepMind paper (DQN for Atari)
Back Propagation Neural Network In AI PowerPoint Presentation Slide Templates...SlideTeam
Introducing the Back Propagation Neural Network In AI PowerPoint Presentation Slide Templates Complete Deck. These ready-to-use backpropagation PowerPoint visuals can be used to explain the concepts of artificial intelligence, machine learning, and deep learning to your audience. Discuss the types of artificial intelligence, including machine learning and deep learning. Present the goals of AI research, which include reasoning, knowledge representation, learning, natural language processing, and artificial neural networks, using our neural network PowerPoint slide designs. Describe the concept of machine learning and discuss how it helps analyze customer queries and support human customer service executives. You can also showcase the comparison between artificial intelligence, deep learning, and machine learning. Familiarize your audience with uses of artificial intelligence such as customer service, supply chain, human resources, and customer insight. Challenges and limitations of machine learning can also be discussed using our content-ready computational statistics PPT themes. Download our ready-to-use artificial neural network PowerPoint slide deck and increase your business efficiency. https://bit.ly/2YlHC9s
Machine Learning and Real-World ApplicationsMachinePulse
This presentation was created by Ajay, Machine Learning Scientist at MachinePulse, to present at a Meetup on Jan. 30, 2015. These slides provide an overview of widely used machine learning algorithms. The slides conclude with examples of real world applications.
Ajay Ramaseshan is a Machine Learning Scientist at MachinePulse. He holds a Bachelor's degree in Computer Science from NITK, Surathkal and a Master's in Machine Learning and Data Mining from Aalto University School of Science, Finland. He has extensive experience in the machine learning domain and has dealt with various real-world problems.
Jay Yagnik at AI Frontiers : A History Lesson on AIAI Frontiers
We have reached a remarkable point in history with the evolution of AI, from applying this technology to incredible use cases in healthcare, to addressing the world's biggest humanitarian and environmental issues. Our ability to learn task-specific functions for vision, language, sequence and control tasks is getting better at a rapid pace. This talk will survey some of the current advances in AI, compare AI to other fields that have historically developed over time, and calibrate where we are in the relative advancement timeline. We will also speculate about the next inflection points and capabilities that AI can offer down the road, and look at how those might intersect with other emergent fields, e.g. Quantum computing.
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org. If you would like to use this material to further our mission of improving access to machine learning education, please reach out to inquiry@deltanalytics.org.
Machine Learning for (DF)IR with Velociraptor: From Setting Expectations to a...Chris Hammerschmidt
Machine Learning for DFIR with Velociraptor: From Setting Expectations to a Case Study
By Christian Hammerschmidt, PhD - Head of Engineering/ML, APTA Technologies
Machine learning (ML) or artificial intelligence (AI) often comes with great promise and large marketing budgets for cybersecurity, especially in monitoring (such as EDR/XDR solutions). Post-breach, it often turns out that the actual performance falls short of its promises.
In this talk, we’ll briefly look at ML for DFIR: What tasks can ML solve, generally speaking? What requirements do we have for a useful ML system in cybersecurity/DFIR contexts, such as reliability, robustness to attackers, and explainability? What makes ML difficult to apply in cybersecurity, e.g. when thinking about false alerts or attackers attempting to circumvent automated systems?
After discussing the basics, we look at ML for Velociraptor:
How can we process forensic data collected with VQL using machine learning (with a typical Python/Jupyter/scikit-learn/PyTorch stack)?
And how can we build artifacts that run ML directly on each endpoint, avoiding central data collection?
The talk concludes with a case study, showing how we significantly reduced time to analyze EVTX files in incident response cases, saving thousands of USD in costs and reducing time to resolution.
Bio: Chris Hammerschmidt did his PhD research on machine learning methods for reverse engineering software systems. Now, he heads APTA Technologies, a start-up building machine learning tools to understand software behavior.
Affiliation: APTA Technologies, https://apta.tech
Tools for Discrete Time Control; Application to Power SystemsOlivier Teytaud
3 main algorithms from the state of the art:
- Model Predictive Control
- Stochastic Dynamic Programming
- Direct Policy Search
==> and our proposal, a modified Direct Policy Search termed Direct Value Search
"What we learned from 5 years of building a data science software that actual...Dataconomy Media
"What we learned from 5 years of building a data science software that actually works for everybody." Dr. Dennis Proppe, CTO and Chief Data Scientist at GPredictive GmbH
Watch more from Data Natives Berlin 2016 here: http://bit.ly/2fE1sEo
Visit the conference website to learn more: www.datanatives.io
About the Author:
Dennis Proppe is the CTO and Chief Data Scientist at Gpredictive, where he helps build software that enables data scientists to build and deploy predictive models in a few minutes instead of weeks. He has more than 10 years of experience in extracting business value from data. Before co-founding Gpredictive, he worked as a marketing science consultant. Dennis holds a Ph.D. in statistical marketing.
First steps with Keras 2: A tutorial with ExamplesFelipe
In this presentation, we give a brief introduction to Keras and Neural networks, and use examples to explain how to build and train neural network models using this framework.
Talk given as part of an event by Rio Machine Learning Meetup.
Word embeddings introdução, motivação e exemplosFelipe
This talk presents an introduction to word embeddings, along with some of the history and motivation behind the development of this model. In addition, examples and implementations (such as Word2Vec) will be shown and discussed.
A quick overview of the main certifications for cloud computing practitioners, as of May 2017. Mini-talk held at the Rio Cloud Computing Meetup: https://www.meetup.com/rio-cloud-computing-meetup/
An introduction to and a couple of examples and tips on how to use Elasticsearch for general data analytics. Examples are based on Elasticsearch version 2.x.
Cloudwatch: Monitoring your AWS services with Metrics and AlarmsFelipe
Brief intro to AWS Cloudwatch. Motivation, examples and use cases. Shows how you can collect and monitor metrics for all your AWS services to better control your applications and infrastructure. #cloud-computing #aws #amazon-web-services
Aws cost optimization: lessons learned, strategies, tips and toolsFelipe
A couple of useful resources that may help you lower your AWS bill at the end of the month. Includes AWS Resources, Third-party Solutions and general tips and lessons learned.
Hadoop MapReduce and Apache Spark on EMR: comparing performance for distribut...Felipe
Alguns experimentos comparando a performance do Hadoop MapReduce e do Apache Spark, para workloads distribuídos. Os experimentos foram feitos no ambiente Elastic MapReduce, da Amazon WebServices.
Boas práticas no desenvolvimento de softwareFelipe
Um pequeno conjunto de boas práticas para o desenvolvimento de software. O conteúdo é recomendado para desenvolvedores iniciantes ou intermediários. O foco é em desenvolvimento Web, baseado em Sistemas de Informação, com uma linguagem fracamente tipada. Os exemplos são dados na linguagem PHP.
Introducing Rachinations, a game execution engine and Ruby-based DSL that can be used to test out game designs and evaluate hypotheses and analyze gameplay. It is an implementation of Dr. J. Dormans' Machinations framework for game design.
A short introduction (with many examples) to the Scala programming language and also an introduction to using the Play! Framework for modern, safe, efffcient and reactive web applications.
Conceitos e exemplos em versionamento de códigoFelipe
Uma pequena apresentação dedicada a expôr desenvolvedores a conceitos e termos relacionados ao controle de versão de código em projetos de software; essa é uma prática essencial no desenvolvimento de software com a qual todos os desenvolvedores se depararão no decorrer de suas carreiras.
DevOps Series: Extending vagrant with Puppet for configuration managementFelipe
This is a short presentation on the reasons why you would augment your Vagrant installation with a full-fledged provisioner like Puppet and some examples of basic things you can do with it.
DevOps Series: Defining and Sharing Testable Machine Configurations with vagrantFelipe
A simple explanation (with examples) of what can be accomplished with Vagrant, a very useful tool to effectively define and share machine configurations, in order to ensure everyone on your team is running the exact same environment.
A small presentation with some quick explanation of what D3.js is and a few examples of what it can do for you. It can be used for a quick presentation (20-30 mins) after which you should have an idea of whether you can use D3.js for your project.
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
Show drafts
volume_up
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
5. Introduction
● Online Learning is generally described as doing machine learning in a streaming data setting, i.e. training a model in consecutive rounds
○ At the beginning of each round the algorithm is presented with an input sample, and must perform a prediction
○ The algorithm verifies whether its prediction was correct or incorrect, and feeds this information back into the model, for subsequent rounds
6. Introduction
In batch (or offline) learning, by contrast, you have access to the whole dataset to train on
[Figure: a table of N samples, each with D features (x1, x2, x3, ..., xd) and a label y, is passed as a whole to batch training, which produces a trained model]
7. Introduction
In online learning your model evolves as you see new data, one example at a time
[Figure: as time increases, the input data (x1, x2, x3, ..., xd, y) arriving at time t triggers an online update that yields the model at time t; the same happens at times t+1 and t+2]
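The contrast between the two settings can be sketched with a deliberately tiny "model": the mean of a stream of numbers. This choice is an assumption made for brevity (the slides do not prescribe a model), and the names batch_mean and OnlineMean are hypothetical. The batch version needs the whole dataset at once, while the online version only keeps a running estimate and a counter.

```python
def batch_mean(samples):
    """Batch setting: the whole dataset must be available at once."""
    return sum(samples) / len(samples)


class OnlineMean:
    """Online setting: the estimate evolves one example at a time."""

    def __init__(self):
        self.n = 0        # number of examples seen so far
        self.mean = 0.0   # current estimate (the "model at time t")

    def update(self, x):
        # Incremental update: move the estimate towards the new example.
        self.n += 1
        self.mean += (x - self.mean) / self.n
        return self.mean


data = [2.0, 4.0, 6.0, 8.0]

online = OnlineMean()
for x in data:            # examples arrive one by one
    online.update(x)

print(batch_mean(data), online.mean)
```

For this particular model the online estimate ends up equal to the batch one; for most models the two only get close, which is what the notion of regret later in the deck quantifies.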
14. Introduction
● In other words, you need to answer a sequence of questions but you only have access to answers to previous questions
[Figure: the online learning protocol, adapted from Shalev-Shwartz 2012]
○ Each t represents a trial
○ xt is the input data for trial t
○ pt is the prediction for the label corresponding to xt
○ The loss is the difference between what you predicted and the actual label
After computing the loss, the algorithm will use this information to update the model generating the predictions for the next trials
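The protocol described above (receive xt, predict pt, observe the true label, compute the loss, update) can be sketched as a short loop. This is a minimal illustration, not the formulation from Shalev-Shwartz: run_online, MajorityLearner and zero_one_loss are hypothetical names, and the toy learner simply predicts the label it has seen most often so far.

```python
def run_online(learner, stream, loss):
    """Run the predict / observe / update loop and return the per-trial losses."""
    losses = []
    for x_t, y_t in stream:              # each iteration is one trial t
        p_t = learner.predict(x_t)       # predict before seeing the label
        losses.append(loss(p_t, y_t))    # compare the prediction with the true label
        learner.update(x_t, y_t)         # feed the answer back into the model
    return losses


class MajorityLearner:
    """Hypothetical toy learner: predicts the label seen most often so far."""

    def __init__(self):
        self.counts = {0: 0, 1: 0}

    def predict(self, x):
        return 1 if self.counts[1] > self.counts[0] else 0

    def update(self, x, y):
        self.counts[y] += 1


def zero_one_loss(p, y):
    return int(p != y)


stream = [((1, 0), 1), ((0, 1), 1), ((1, 1), 0), ((0, 0), 1)]
losses = run_online(MajorityLearner(), stream, zero_one_loss)
print(losses)
```

Note that the learner never sees yt before committing to pt; the label is only used afterwards, for the loss and the update.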
16. Introduction: Main Concepts
The main objective of online learning algorithms is to minimize the regret.
The regret is the difference between the performance of:
● the online algorithm
● an ideal algorithm that has been able to train on the whole data seen so far, in batch fashion
18. Introduction: Main Concepts
In other words, the main objective of an online machine learning algorithm is to perform as closely as possible to the corresponding offline algorithm.
This is measured by the regret.
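One common way to make regret concrete is to compare the cumulative loss of the online learner with that of the best fixed predictor chosen in hindsight. The sketch below assumes 0-1 loss and a comparison class made of just two constant predictors; both choices, and the names cumulative_loss and regret, are assumptions for illustration rather than something stated in the slides.

```python
def cumulative_loss(predictions, labels):
    """Total 0-1 loss: the number of wrong predictions."""
    return sum(int(p != y) for p, y in zip(predictions, labels))


def regret(online_predictions, inputs, labels, comparators):
    """Loss of the online learner minus the loss of the best fixed
    comparator, which is picked after seeing all the data."""
    online_loss = cumulative_loss(online_predictions, labels)
    best_fixed_loss = min(
        cumulative_loss([h(x) for x in inputs], labels) for h in comparators
    )
    return online_loss - best_fixed_loss


inputs = [(1, 0), (0, 1), (1, 1), (0, 0)]
labels = [1, 1, 0, 1]
online_predictions = [0, 1, 1, 1]      # what some online learner predicted, trial by trial

# Assumed comparison class: "always predict 0" and "always predict 1".
comparators = [lambda x: 0, lambda x: 1]

r = regret(online_predictions, inputs, labels, comparators)
print(r)
```

Here the online learner makes 2 mistakes and the best constant predictor makes 1, so the regret is 1. An online algorithm is doing well when this quantity grows slowly relative to the number of trials.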
21. Use cases
Online algorithms are useful in at least two scenarios:
● When your data is too large to fit in memory
○ So you need to train your model one example at a time
● When new data is constantly being generated, and/or is dependent upon time
23. Use cases
Some cases where data is constantly being generated and you need quick predictions:
● Real-time Recommendation
● Fraud Detection
● Spam detection
● Portfolio Selection
● Online ad placement
26. Types of Targets
There are two main ways to think about an online learning problem, as far as the target functions (that we are trying to learn) are concerned:
● Stationary Targets
○ The target function you are trying to learn does not change over time (but may be stochastic)
● Dynamic Targets
○ The process that is generating input sample data is assumed to be non-stationary (i.e. may change over time)
○ The process may even be adapting to your model (i.e. in an adversarial manner)
30. Example I: Stationary Targets
For stationary targets, the input-generating process is a single, but unknown, function of the attributes. (Unknown from the point of view of the online learning algorithm, obviously!)
● Ex.: Some process generates, at each time step t, inputs of the form (x1, x2, x3) where each attribute is a bit, and the label y is the result of (x1 ∨ (x2 ∧ x3)):

                     x1  x2  x3  y
Input at time t       1   0   1  1
Input at time t+1     0   1   1  1
Input at time t+2     0   0   0  0
Input at time t+3     1   0   0  1
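A stationary target like this one is easy to simulate. The sketch below defines the fixed concept x1 OR (x2 AND x3) and a generator that emits one labelled example per time step; the names target and stream, and the use of uniformly random bits as inputs, are assumptions for illustration.

```python
import random


def target(x1, x2, x3):
    """The fixed (stationary) concept from the slide: x1 OR (x2 AND x3)."""
    return x1 | (x2 & x3)


def stream(n_trials, seed=0):
    """Yield (input, label) pairs, one per time step.
    The labelling rule never changes, which is what makes the target stationary."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        x = tuple(rng.randint(0, 1) for _ in range(3))
        yield x, target(*x)


# The four rows shown on the slide, as ((x1, x2, x3), y).
slide_rows = [((1, 0, 1), 1), ((0, 1, 1), 1), ((0, 0, 0), 0), ((1, 0, 0), 1)]

for x, y in slide_rows:
    print(x, y, target(*x) == y)
```

The online learner only ever sees the (input, label) pairs, never the body of target; that is the sense in which the function is unknown.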
33. Example II: Dynamic Targets
Spam filtering
The objective of a good spam filter is to accurately model the following function:
[Figure: features extracted from an e-mail (x1, x2, x3, x4, ..., xd) → model → {0,1} (spam or not spam?)]
36. Example II: Dynamic Targets
Spam filtering
● So suppose you have learned that the presence of the word “Dollars” implies that an e-mail is likely spam.
● Spammers have noticed that their scammy e-mails are falling prey to spam filters so they change tactics:
○ So instead of using the word “Dollars” they start using the word “Euro”, which fools your filter while still accomplishing their goal (having people read the e-mail)
39. Approaches
Several approaches have been proposed in the literature:
● Online Learning from Expert Advice
● Online Learning from Examples
● General algorithms that may also be used in the online setting
42. Approaches: Expert Advice
In this approach, it is assumed that the algorithm has multiple oracles (or experts) at its disposal, which it can use to produce its output in each trial.
In other words, the task of this online algorithm is simply to learn which of the experts it should use.
The simplest algorithm in this realm is the Randomized Weighted Majority Algorithm.
43. Approaches: Expert Advice
Randomized Weighted Majority Algorithm
● Every expert has a weight (starting at 1)
● For every trial:
○ Randomly select an expert (larger weight => more likely)
○ Use that expert’s output as your prediction
○ Observe the correct answer
○ For each expert:
■ If it was mistaken, decrease its weight by a constant factor
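The steps of the Randomized Weighted Majority Algorithm can be sketched as follows. The class name, the method names and the penalty factor beta=0.5 are assumptions (the slide only says "a constant factor"), and the three experts in the usage part are made up for illustration.

```python
import random


class RandomizedWeightedMajority:
    """Sketch of Randomized Weighted Majority for binary labels."""

    def __init__(self, n_experts, beta=0.5, seed=None):
        self.weights = [1.0] * n_experts   # every expert starts at weight 1
        self.beta = beta                   # assumed penalty factor, 0 < beta < 1
        self.rng = random.Random(seed)

    def predict(self, expert_predictions):
        # Pick one expert at random, with probability proportional to its
        # weight, and use its output as our prediction.
        chosen = self.rng.choices(
            range(len(self.weights)), weights=self.weights
        )[0]
        return expert_predictions[chosen]

    def update(self, expert_predictions, true_label):
        # Shrink the weight of every expert that was mistaken in this trial.
        for i, p in enumerate(expert_predictions):
            if p != true_label:
                self.weights[i] *= self.beta


# Three hypothetical experts: always 1, always 0, and one alternating 0, 1, 0, 1.
rwm = RandomizedWeightedMajority(n_experts=3, seed=0)
for t in range(4):
    advice = [1, 0, t % 2]
    prediction = rwm.predict(advice)
    rwm.update(advice, true_label=1)   # in this toy stream the true label is always 1

print(rwm.weights)
```

After four trials the expert that was always right keeps weight 1, while the others have been halved once per mistake, so the random choice increasingly favours the reliable expert.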
45. Approaches: Learning from Examples
Learning from examples is different from using Expert Advice inasmuch as we don’t need to define, in advance, prebuilt experts to derive our predictions from.
We need, however, to know what Concept Class we want to search over.
47. Approaches: Learning from Examples
A Concept Class is a set of functions (concepts) that conform to a particular model.
Some examples of concept classes are:
● The set of all monotone disjunctions of N variables
● The set of non-monotone disjunctions of N variables
● Decision lists with N variables
● Linear threshold formulas
● DNF (disjunctive normal form) formulas
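To make the first concept class concrete: a monotone disjunction is an OR over some subset of the variables, with no negations. The sketch below builds one such concept from the indices of its relevant variables; monotone_disjunction is a hypothetical helper name, and the chosen variables are arbitrary.

```python
def monotone_disjunction(relevant):
    """Return the concept x -> OR of x[i] for i in `relevant` (no negated variables)."""
    def concept(x):
        return int(any(x[i] == 1 for i in relevant))
    return concept


# One member of the class for N = 4 variables: x1 OR x3 (indices 0 and 2).
c = monotone_disjunction([0, 2])

print(c((0, 0, 1, 0)))   # a relevant variable is set
print(c((0, 1, 0, 1)))   # only irrelevant variables are set
```

Each choice of relevant variables gives a different concept; the learner's job is to find out which member of the class generated the labels.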
50. Approaches: Learning from Examples
The Winnow Algorithm is one example of a simple algorithm that learns monotone disjunctions online.
In other words, it learns any concept (function), provided the concept belongs to the Concept Class of monotone disjunctions.
It also uses weights, as in the previous example.
53. Approaches: Learning from Examples
Winnow algorithm
● Initialize all weights (w1, w2, ..., wn) to 1
● Given a new example x:
○ Predict 1 if wᵀx > n
○ Predict 0 otherwise
● Check the true answer
● If the prediction was wrong, update the weights:
○ If the algorithm predicted 0 but the true answer was 1, double the value of every weight corresponding to an attribute = 1 (we aimed too low, let’s try to make our guess higher)
○ If the algorithm predicted 1 but the true answer was 0, halve the value of every weight corresponding to an attribute = 1 (we aimed too high, let’s try to make our guess lower)
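A minimal sketch of Winnow follows, using the standard statement of the update rule: on a false negative (predicted 0, true answer 1) the weights of the active attributes are doubled, and on a false positive (predicted 1, true answer 0) they are halved. The threshold n is the one from the slide; the class name, the hidden target x1 OR x3 and the training schedule are assumptions for illustration.

```python
from itertools import product


class Winnow:
    """Sketch of Winnow for monotone disjunctions over n binary attributes."""

    def __init__(self, n):
        self.n = n
        self.w = [1.0] * n                     # all weights start at 1

    def predict(self, x):
        score = sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if score > self.n else 0      # threshold n, as on the slide

    def update(self, x, y):
        p = self.predict(x)
        if p != y:
            # False negative (p=0, y=1): promote active attributes (double).
            # False positive (p=1, y=0): demote active attributes (halve).
            factor = 2.0 if y == 1 else 0.5
            self.w = [wi * factor if xi == 1 else wi
                      for wi, xi in zip(self.w, x)]
        return p


def target(x):
    """Hypothetical hidden concept: the monotone disjunction x1 OR x3."""
    return int(x[0] == 1 or x[2] == 1)


examples = list(product([0, 1], repeat=4))     # all 16 inputs for n = 4
learner = Winnow(n=4)
for _ in range(10):                            # a few passes over the stream
    for x in examples:
        learner.update(x, target(x))

mistakes = sum(learner.predict(x) != target(x) for x in examples)
print(learner.w, mistakes)
```

Only attributes that were active (equal to 1) in the misclassified example are touched, so weights of relevant variables grow while weights of irrelevant ones are pushed back down whenever they cause a false positive.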
56. Approaches: Other Approaches
More general algorithms can also be used in an online setting, such as:
● Stochastic Gradient Descent
● Perceptron Learning Algorithm
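As an illustration of the second item, the Perceptron fits the online setting naturally because it updates its weights after every single example, and only when it makes a mistake. The sketch below is a plain textbook perceptron with a bias term; the class name, the learning rate of 1 and the OR-of-two-bits data are assumptions chosen so the example is linearly separable.

```python
class OnlinePerceptron:
    """Sketch of the Perceptron learning algorithm, one example at a time."""

    def __init__(self, n_features, learning_rate=1.0):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = learning_rate

    def predict(self, x):
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score > 0 else 0

    def update(self, x, y):
        error = y - self.predict(x)        # -1, 0 or +1
        if error != 0:                     # update only on mistakes
            self.w = [wi + self.lr * error * xi
                      for wi, xi in zip(self.w, x)]
            self.b += self.lr * error


# Hypothetical stream: the label is the OR of two bits.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

perceptron = OnlinePerceptron(n_features=2)
for _ in range(10):                        # replay the stream a few times
    for x, y in data:
        perceptron.update(x, y)

print(perceptron.w, perceptron.b)
```

Stochastic Gradient Descent follows the same predict-then-update pattern, with the mistake-driven step replaced by a small step along the gradient of the loss on the current example.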
57. Current Trends / Related Areas
Adversarial Machine Learning
● Refers to scenarios where your input-generating process is an adaptive adversary
● Applications in:
○ Information Security
○ Games
58. Current Trends / Related Areas
One-shot Learning
● Refers to scenarios where you must perform predictions after seeing just a few input samples, or even a single one
● Applications in:
○ Computer Vision
59. Links
● http://ttic.uchicago.edu/~shai/papers/ShalevThesis07.pdf
● Blum 1998 Survey Paper
● UofW CSE599S Online Learning
● Machine Learning From Streaming data
● Twitter Fighting Spam with BotMaker
● CS229 - Online Learning Lecture
● Building a real time Recommendation Engine with Data Science
● Online Optimization for Large Scale Machine Learning by Prof. A. Banerjee
● Learning, Regret Minimization, and Equilibria