Algorithms and machine learning models can unintentionally learn to classify and control people based on their data. A case study shows how optimizing for click-through rates can lead users to be clustered into "filter bubbles" and have their opinions steered over time without feedback. It is important to be aware of these risks and regulate algorithms' use of personal data to avoid unfairly profiling or manipulating individuals.
Fairness and Transparency in Machine Learning - Andreas Dewes
My presentation on fairness and transparency in machine learning that I gave at PyData Berlin. I investigated the "Stop and Frisk" dataset and tried to show how algorithms can pick up (or remove) biases from our data.
ML practitioners and advocates are increasingly becoming gatekeepers of the modern world. The models you create have the power to get people arrested or vindicated, get loans approved or rejected, determine what interest rate is charged for those loans, who is shown to you in your long list of pursuits on Tinder, what news you read, who gets called for a job phone screen or even a college admission... the list goes on. My goal in this talk is to summarize the kinds of disparate outcomes caused by cargo cult machine learning, and recent academic efforts to address some of them.
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org. If you would like to use this material to further our mission of improving access to machine learning education, please reach out to inquiry@deltanalytics.org.
If you are curious what ML is all about, this is a gentle introduction to Machine Learning and Deep Learning. It covers questions such as why ML, Data Analytics, and Deep Learning matter, gives an intuitive understanding of how they work, and looks at some models in detail. Finally, I share some useful resources to get started.
Machine Learning has become a must to improve insight, quality and time to market. But it's also been called the 'high interest credit card of technical debt' with challenges in managing both how it's applied and how its results are consumed.
Module 4: Model Selection and Evaluation - Sara Hooker
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org.
This is an introduction to text analytics for advanced business users and IT professionals with limited programming expertise. The presentation goes through different areas of text analytics and provides some real-world examples that help make the subject matter more relatable. We will cover topics like search engine building, categorization (supervised and unsupervised), clustering, NLP, and social media analysis.
Module 1: Introduction to Machine Learning - Sara Hooker
We believe in building technical capacity all over the world.
We are building and teaching an accessible introduction to machine learning for students passionate about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our work, visit www.deltanalytics.org
This Machine Learning Algorithms presentation will help you learn what machine learning is and the various ways in which you can use it to solve a problem. At the end, you will see a demo on linear regression, logistic regression, decision tree and random forest. The presentation is designed to help beginners understand how to implement the different Machine Learning algorithms.
The following topics are covered in this Machine Learning Algorithms presentation:
1. Real world applications of Machine Learning
2. What is Machine Learning?
3. Processes involved in Machine Learning
4. Types of Machine Learning Algorithms
5. Popular Algorithms with a hands-on demo
- Linear regression
- Logistic regression
- Decision tree and Random forest
- K-nearest neighbors
What is Machine Learning: Machine Learning is an application of Artificial Intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed.
- - - - - - - -
About Simplilearn Machine Learning course:
A form of artificial intelligence, Machine Learning is revolutionizing the world of computing as well as all people’s digital interactions. Machine Learning powers such innovative automated technologies as recommendation engines, facial recognition, fraud protection and even self-driving cars. This Machine Learning course prepares engineers, data scientists and other professionals with knowledge and hands-on skills required for certification and job competency in Machine Learning.
- - - - - - -
Why learn Machine Learning?
Machine Learning is taking over the world, and with that, there is a growing need among companies for professionals who know the ins and outs of Machine Learning.
The Machine Learning market size is expected to grow from USD 1.03 Billion in 2016 to USD 8.81 Billion by 2022, at a Compound Annual Growth Rate (CAGR) of 44.1% during the forecast period.
- - - - - -
What skills will you learn from this Machine Learning course?
By the end of this Machine Learning course, you will be able to:
1. Master the concepts of supervised, unsupervised and reinforcement learning and modeling.
2. Gain practical mastery over principles, algorithms, and applications of Machine Learning through a hands-on approach which includes working on 28 projects and one capstone project.
3. Acquire thorough knowledge of the mathematical and heuristic aspects of Machine Learning.
4. Understand the concepts and operation of support vector machines, kernel SVM, naive Bayes, decision tree classifier, random forest classifier, logistic regression, K-nearest neighbors, K-means clustering and more.
5. Be able to model a wide variety of robust Machine Learning algorithms including deep learning, clustering, and recommendation systems.
- - - - - - -
Random Forest Tutorial | Random Forest in R | Machine Learning | Data Science... - Edureka!
This Edureka Random Forest tutorial will help you understand the basics of the Random Forest machine learning algorithm. It is suitable both for beginners and for professionals who want to learn or brush up on Data Science concepts and learn random forest analysis through examples. Below are the topics covered in this tutorial:
1) Introduction to Classification
2) Why Random Forest?
3) What is Random Forest?
4) Random Forest Use Cases
5) How Does Random Forest Work?
6) Demo in R: Diabetes Prevention Use Case
You can also take a complete structured training; check out the details here: https://goo.gl/AfxwBc
Module 9: Natural Language Processing Part 2 - Sara Hooker
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org. If you would like to use this material to further our mission of improving access to machine learning education, please reach out to inquiry@deltanalytics.org.
Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. Machine learning focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data.
This Edureka Sentiment Analysis tutorial will help you understand the basics of Sentiment Analysis through examples. It also includes a demo of Sentiment Analysis in R: El Clasico Sentiment Analysis. Below are the topics covered in this tutorial:
1. What is Machine Learning?
2. Why Sentiment Analysis?
3. What is Sentiment Analysis?
4. How Does Sentiment Analysis Work?
5. Sentiment Analysis Demo - El Clasico
6. Sentiment Analysis Use Case
Module 8: Natural Language Processing Part 1 - Sara Hooker
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org. If you would like to use this material to further our mission of improving access to machine learning education, please reach out to inquiry@deltanalytics.org.
An introductory presentation on Explainable AI, making the case for its main motivations and importance. We briefly describe the main techniques available as of March 2020 and share many references so that readers can continue their studies.
Spark + AI Summit - The Importance of Model Fairness and Interpretability in ... - Francesca Lazzeri, PhD
Machine learning model fairness and interpretability are critical for data scientists, researchers and developers to explain their models and understand the value and accuracy of their findings. Interpretability is also important to debug machine learning models and make informed decisions about how to improve them. In this session, Francesca will go over a few methods and tools that enable you to "unpack" machine learning models, gain insights into how and why they produce specific results, assess your AI systems' fairness and mitigate any observed fairness issues.
Using open source fairness and interpretability packages, attendees will learn how to:
- Explain model predictions by generating feature importance values for the entire model and/or individual data points.
- Achieve model interpretability on real-world datasets at scale, during training and inference.
- Use an interactive visualization dashboard to discover patterns in data and explanations at training time.
- Leverage additional interactive visualizations to assess which groups of users might be negatively impacted by a model and compare multiple models in terms of their fairness and performance.
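As a rough illustration of the last point, a per-group comparison can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the tooling used in the session: the function names `selection_rates` and `parity_gap` and the toy data are hypothetical. It only computes the share of positive predictions per group and the largest gap between groups, a simple check often called demographic parity difference.

```python
from collections import defaultdict


def selection_rates(predictions, groups):
    """Return the share of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def parity_gap(rates):
    """Largest difference in selection rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Toy data, made up for illustration: one model decision and one group label per user.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(preds, groups)
gap = parity_gap(rates)
print(rates, gap)
```

A large gap does not by itself prove a model is unfair, but it flags a group that may be worth a closer look; comparing the same number across candidate models is one way to weigh fairness against accuracy.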
YouTube Link: https://youtu.be/aGu0fbkHhek
** Data Science Master Program: https://www.edureka.co/masters-program/data-scientist-certification **
This Edureka PPT on "Data Science Full Course" provides end-to-end, detailed and comprehensive knowledge of Data Science. It starts with the basics of Statistics and Probability, then moves to Machine Learning, and finally ends the journey with Deep Learning and AI. For the data sets and code discussed in this PPT, drop a comment.
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org.
Module 1 introduction to machine learningSara Hooker
We believe in building technical capacity all over the world.
We are building and teaching an accessible introduction to machine learning for students passionate about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our work, visit www.deltanalytics.org
This Machine Learning Algorithms presentation will help you learn you what machine learning is, and the various ways in which you can use machine learning to solve a problem. At the end, you will see a demo on linear regression, logistic regression, decision tree and random forest. This Machine Learning Algorithms presentation is designed for beginners to make them understand how to implement the different Machine Learning Algorithms.
Below topics are covered in this Machine Learning Algorithms Presentation:
1. Real world applications of Machine Learning
2. What is Machine Learning?
3. Processes involved in Machine Learning
4. Type of Machine Learning Algorithms
5. Popular Algorithms with a hands-on demo
- Linear regression
- Logistic regression
- Decision tree and Random forest
- N Nearest neighbor
What is Machine Learning: Machine Learning is an application of Artificial Intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed.
- - - - - - - -
About Simplilearn Machine Learning course:
A form of artificial intelligence, Machine Learning is revolutionizing the world of computing as well as all people’s digital interactions. Machine Learning powers such innovative automated technologies as recommendation engines, facial recognition, fraud protection and even self-driving cars.This Machine Learning course prepares engineers, data scientists and other professionals with knowledge and hands-on skills required for certification and job competency in Machine Learning.
- - - - - - -
Why learn Machine Learning?
Machine Learning is taking over the world- and with that, there is a growing need among companies for professionals to know the ins and outs of Machine Learning
The Machine Learning market size is expected to grow from USD 1.03 Billion in 2016 to USD 8.81 Billion by 2022, at a Compound Annual Growth Rate (CAGR) of 44.1% during the forecast period.
- - - - - -
What skills will you learn from this Machine Learning course?
By the end of this Machine Learning course, you will be able to:
1. Master the concepts of supervised, unsupervised and reinforcement learning concepts and modeling.
2. Gain practical mastery over principles, algorithms, and applications of Machine Learning through a hands-on approach which includes working on 28 projects and one capstone project.
3. Acquire thorough knowledge of the mathematical and heuristic aspects of Machine Learning.
4. Understand the concepts and operation of support vector machines, kernel SVM, naive Bayes, decision tree classifier, random forest classifier, logistic regression, K-nearest neighbors, K-means clustering and more.
5. Be able to model a wide variety of robust Machine Learning algorithms including deep learning, clustering, and recommendation systems
- - - - - - -
Random Forest Tutorial | Random Forest in R | Machine Learning | Data Science...Edureka!
This Edureka Random Forest tutorial will help you understand all the basics of Random Forest machine learning algorithm. This tutorial is ideal for both beginners as well as professionals who want to learn or brush up their Data Science concepts, learn random forest analysis along with examples. Below are the topics covered in this tutorial:
1) Introduction to Classification
2) Why Random Forest?
3) What is Random Forest?
4) Random Forest Use Cases
5) How Random Forest Works?
6) Demo in R: Diabetes Prevention Use Case
You can also take a complete structured training, check out the details here: https://goo.gl/AfxwBc
Module 9: Natural Language Processing Part 2Sara Hooker
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org. If you would like to use this material to further our mission of improving access to machine learning. Education please reach out to inquiry@deltanalytics.org .
Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed. Machine learning focuses on the development of computer programs that can teach themselves to grow and change when exposed to new data.
This Edureka Sentiment Analysis tutorial will help you understand all the basics of Sentiment Analysis algorithm along with examples. This tutorial also has an interesting demo on Sentiment Analysis in R - El Clasico Sentiment Analysis. Below are the topics covered in this tutorial:
1. What is Machine Learning?
2. Why Sentiment Analysis?
3. What is Sentiment Analysis?
4. How Sentiment Analysis Works?
5. Sentiment Analysis Demo - El Clasico
6. Sentiment Analysis Use Case
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org. If you would like to use this course content for teaching, please reach out to inquiry@deltanalytics.org
Module 8: Natural language processing Pt 1Sara Hooker
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org. If you would like to use this material to further our mission of improving access to machine learning. Education please reach out to inquiry@deltanalytics.org .
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org.
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org.
Delta Analytics is a 501(c)3 non-profit in the Bay Area. We believe that data is powerful, and that anybody should be able to harness it for change. Our teaching fellows partner with schools and organizations worldwide to work with students excited about the power of data to do good.
Welcome to the course! These modules will teach you the fundamental building blocks and the theory necessary to be a responsible machine learning practitioner in your own community. Each module focuses on accessible examples designed to teach you about good practices and the powerful (yet surprisingly simple) algorithms we use to model data.
To learn more about our mission or provide feedback, take a look at www.deltanalytics.org.
Introductory presentation to Explainable AI, defending its main motivations and importance. We describe briefly the main techniques available in March 2020 and share many references to allow the reader to continue his/her studies.
3. Outline
Theory
1. Algorithms
2. Machine Learning
3. Big Data & Consequences for Machine Learning
4. Use of Algorithms Today and in the Future
Experiments
1. Discriminating people with machine learning & algorithms
2. Creating persistent user identities by (accidental) de-anonymization
Summary & Outlook
1. Strategies for Handling Data Responsibly
5. Algorithms
An algorithm is a "recipe" that gives a computer (or a
human) step-by-step instructions in order to achieve a
certain goal.
[Flowchart: Start → doorbell ringing → "Andreas stands on trapdoor?" → yes: open trapdoor; no: "Wait. Our time will come."]
6. Machine Learning
A machine learning algorithm automatically generates
models and checks them against the training data we
provide, trying to find a model that explains the data well
and can predict unknown data.
7. Data vs. Model
$\mathbf{x} \rightarrow y = m(\mathbf{x}, \mathbf{p}) + \varepsilon$
see e.g. "Machine Learning" by Tom Mitchell (McGraw Hill, 1997).
[Plot: y against x1]
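A minimal sketch of the formula above, assuming the model m is a straight line with parameters p = (slope, intercept). The data are synthetic, and the name `fit_line` and all numbers are illustrative rather than taken from the talk. The fit uses one part of the data and is checked against the part it has not seen.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic data: y = 2 x + 1 plus Gaussian noise (the epsilon term).
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]

# Fit on the first 150 points (training data), check on the remaining 50.
slope, intercept = fit_line(xs[:150], ys[:150])
test_error = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(xs[150:], ys[150:])) / 50
print(slope, intercept, test_error)
```

The mean squared error on the held-out points should stay close to the noise variance (0.25 here), which is the part of the data the model cannot explain.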
9. Sources of Error
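$\varepsilon = \varepsilon_{\mathrm{sys}} + \varepsilon_{\mathrm{noise}} + \varepsilon_{\mathrm{hidden}}$
$\varepsilon_{\mathrm{sys}}$: systematic errors arise due to imperfect measurements of known variables
$\varepsilon_{\mathrm{noise}}$: noise is present due to the nature of the process or our measurement apparatus
$\varepsilon_{\mathrm{hidden}}$: many variables are usually unknown to us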
10. Big Data & Machine Learning
[Timeline: 2000 to 2015]
more data sources
high data volume
higher density
higher frequency
longer retention
13. Exploiting New Sources of Data
$y = m(x, p) + \varepsilon_{\mathrm{hidden}} + \cdots$
incorporate variables that were hidden
into the model, reducing error
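A small sketch of this effect, under the assumption of a linear model and synthetic data. The target depends on two variables: the first fit only sees `x1` (so `x2` acts as a hidden variable), the second fit also sees `x2`. All names and coefficients are made up for illustration.

```python
import random

def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual_mse_one(x, y):
    """Mean squared residual of a least-squares fit of y on one variable."""
    x, y = center(x), center(y)
    b = dot(x, y) / dot(x, x)
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x, y)) / len(y)

def residual_mse_two(x1, x2, y):
    """Mean squared residual of a least-squares fit of y on two variables."""
    x1, x2, y = center(x1), center(x2), center(y)
    s11, s12, s22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12
    # Solve the 2x2 normal equations with Cramer's rule.
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return sum((yi - b1 * a - b2 * b) ** 2
               for a, b, yi in zip(x1, x2, y)) / len(y)

rng = random.Random(0)
x1 = [rng.random() for _ in range(1000)]
x2 = [rng.random() for _ in range(1000)]          # hidden in the first fit
y = [2 * a + 3 * b + rng.gauss(0, 0.1) for a, b in zip(x1, x2)]

mse_hidden = residual_mse_one(x1, y)       # x2 contributes to the error term
mse_exposed = residual_mse_two(x1, x2, y)  # x2 is part of the model
print(mse_hidden, mse_exposed)
```

Once the formerly hidden variable is part of the model, the remaining error is roughly the noise term alone.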
14. Understanding Results
Models can be easy or very difficult to interpret
Parameter space is often huge and can't be
explored entirely
[Figure: decision tree classifier (easy to interpret) with yes/no splits such as "age > 37 ?", "height < 1.78" and "projects > 19 ?", next to a neural network classifier (hard to interpret)]
15. Example: Deep Learning for Image
Recognition
http://googleresearch.blogspot.com/2015/06/inceptionism-going-deeper-into-neural.html
16. Classifying Use of Algorithms
low risk
mildly annoying in case of failure /
misbehaviour
medium risk
large impact on our life in
case of failure / misbehaviour
high risk
critical impact on our
life in case of failure /
misbehaviour
17. low risk
personalization of services
(e.g. recommendation engines for webs
video-on-demand, content, ...)
individualized ad targeting
customer rating / profiling
consumer demand prediction
19. high risk
military intelligence / intervention
political oppression
critical infrastructure services (e.g. elect
life-changing decisions (e.g. about healt
23. Discrimination
Discrimination is treatment or consideration of, or making
a distinction in favor of or against, a person or thing based
on the group, class, or category to which that person or
thing is perceived to belong to rather than on individual
merit.
Wikipedia
Protected attributes (examples):
Ethnicity, Gender, Sexual Orientation, ...
24. When is a process discriminating?
Disparate Impact: Adverse impact of a process C on a given
group X
Outcome    X = 0                X = 1
C = NO     P(C = NO, X = 0)     P(C = NO, X = 1)
C = YES    P(C = YES, X = 0)    P(C = YES, X = 1)
$\frac{P(C = \mathrm{YES} \mid X = 0)}{P(C = \mathrm{YES} \mid X = 1)} < \tau$
see e.g. "Certifying and Removing Disparate Impact", M. Feldman et al.
25. When is a process discriminating?
Estimating with real-world data
Outcome    X = 0    X = 1
C = NO     a        b
C = YES    c        d
$\frac{c / (a + c)}{d / (b + d)} < \tau$
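The estimate above can be sketched in a few lines. The counts below are hypothetical, and the threshold τ = 0.8 (the "four-fifths rule" often used for disparate impact) is an assumption, since the slides do not give a value.

```python
def disparate_impact(a, b, c, d):
    """Ratio P(C=YES | X=0) / P(C=YES | X=1) from the 2x2 table of counts.

    a, b: number of NO decisions for X = 0 and X = 1
    c, d: number of YES decisions for X = 0 and X = 1
    """
    rate_x0 = c / (a + c)
    rate_x1 = d / (b + d)
    return rate_x0 / rate_x1

# Hypothetical counts, for illustration only.
ratio = disparate_impact(a=80, b=50, c=20, d=50)
tau = 0.8  # assumed threshold
print(ratio, ratio < tau)
```

With these counts, 20 % of group X = 0 and 50 % of group X = 1 receive a YES, so the ratio falls well below the assumed threshold.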
26. Discrimination through Data Analysis
Replacing a manual hiring process with
an automated one.
Benefits:
Save time screening CVs by hand
Improve candidate choice
28. The Setup
Use submitted information (CV, work
samples) along with publicly available /
external information to predict candidate
success.
Use data from the manual process (invite / no invite)
to train the classifier
Provide it with as much data as possible to
29. Our decision model
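$S = m(Y) + d(X) + \varepsilon$
$m(Y)$: score of candidate (merit function)
$d(X)$: discrimination malus/bonus
$\varepsilon$: hidden variables & luck (if you believe in it)
$C = \begin{cases} \mathrm{YES}, & S > t \\ \mathrm{NO}, & \text{otherwise} \end{cases}$
[Plots: luck against candidate merit, without discrimination and with discrimination]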
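A rough sketch of this decision model, assuming Gaussian merit and luck and a fixed malus for group X = 0. The distributions, the malus of 1.0 and the threshold t = 0.5 are illustrative choices, not values from the talk.

```python
import random

def simulate(n, malus, threshold, rng):
    """Draw n candidates; return (x, c) pairs where c is True when S > t."""
    out = []
    for _ in range(n):
        x = rng.randint(0, 1)              # protected attribute X
        merit = rng.gauss(0, 1)            # m(Y)
        d = -malus if x == 0 else 0.0      # d(X): malus applied to group X = 0
        luck = rng.gauss(0, 0.5)           # epsilon
        s = merit + d + luck
        out.append((x, s > threshold))
    return out

def yes_rate(samples, group):
    """Share of YES decisions within one group."""
    picked = [c for x, c in samples if x == group]
    return sum(picked) / len(picked)

rng = random.Random(1)
fair = simulate(10000, malus=0.0, threshold=0.5, rng=rng)
biased = simulate(10000, malus=1.0, threshold=0.5, rng=rng)

ratio_fair = yes_rate(fair, 0) / yes_rate(fair, 1)
ratio_biased = yes_rate(biased, 0) / yes_rate(biased, 1)
print(ratio_fair, ratio_biased)
```

Without the malus, the YES rates of both groups are about equal; with it, the ratio of YES rates drops far below 1 although merit is distributed identically in both groups.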
30. Training a predictor for C
$C(Y, Z)$
$Y$: information about Y (unprotected attributes)
$Z$: additional information we give to the algorithm
$\mathbf{Z} \propto X + \varepsilon_{\gamma}$
we can predict the value of X from Z with fidelity
31. A Simulation
• Generate 10,000 samples of C with disparate impact
• Train a classifier (e.g. Support-Vector-Machine) on
the training data
• Provide it with (noisy) information about X
• Measure the algorithm-based discrimination on the test data
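The steps above can be sketched roughly as follows. Note the substitutions: a small logistic regression trained by gradient descent stands in for the Support Vector Machine, the sample size is reduced to keep the run short, and the malus, threshold and proxy noise are assumptions. One classifier sees only the merit, the other also sees a noisy proxy Z for X.

```python
import math
import random

def make_data(n, rng, malus=1.0, threshold=0.5, proxy_noise=0.3):
    """Rows of (merit, z, x, c) from a decision model with a malus for X = 0."""
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)
        merit = rng.gauss(0, 1)
        s = merit - (malus if x == 0 else 0.0) + rng.gauss(0, 0.5)
        z = x + rng.gauss(0, proxy_noise)       # noisy information about X
        rows.append((merit, z, x, 1 if s > threshold else 0))
    return rows

def train_logistic(features, labels, epochs=200, lr=1.0):
    """Full-batch gradient descent on the logistic loss; bias is the last weight."""
    k = len(features[0])
    w = [0.0] * (k + 1)
    n = len(features)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for f, c in zip(features, labels):
            t = sum(wi * fi for wi, fi in zip(w, f)) + w[k]
            err = 1.0 / (1.0 + math.exp(-t)) - c
            for j in range(k):
                grad[j] += err * f[j]
            grad[k] += err
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w

def predict(w, f):
    return sum(wi * fi for wi, fi in zip(w, f)) + w[-1] > 0

def impact_ratio(w, rows, use_proxy):
    """Predicted YES rate for X = 0 divided by the rate for X = 1."""
    yes = {0: 0, 1: 0}
    total = {0: 0, 1: 0}
    for merit, z, x, _ in rows:
        f = (merit, z) if use_proxy else (merit,)
        yes[x] += predict(w, f)
        total[x] += 1
    return (yes[0] / total[0]) / (yes[1] / total[1])

rng = random.Random(0)
train, test = make_data(2000, rng), make_data(4000, rng)
labels = [r[3] for r in train]
w_blind = train_logistic([(r[0],) for r in train], labels)
w_leaky = train_logistic([(r[0], r[1]) for r in train], labels)
ratio_blind = impact_ratio(w_blind, test, use_proxy=False)
ratio_leaky = impact_ratio(w_leaky, test, use_proxy=True)
print(ratio_blind, ratio_leaky)
```

The classifier that only sees the merit treats both groups about equally, while the one that also receives the proxy reproduces the disparate impact present in its training labels.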
38. Why give that information to the
algorithm?
We don't! But it leaks through anyway...
[Diagram: X leaking into Z]
39. But can it be done?
Discrimination through information
leakage is possible, but how likely is it in
practice?
Let's try!
We use publicly available data to predict
the gender of GitHub users (protected
attribute X).
40. Basic Information
Manually classify users as men/women (by looking at
profile pictures, names) -> 5,000 training samples with
small error
Use the GitHub API to retrieve information about users
(followers, repositories, stargazers, contributions, ...)
We only use data that is easy to get and likely to be used in
a real-world setting for classification
We only use a limited dataset (proof of concept, not
44. Commit Message Analysis
Use the commit messages (as obtained from the event
data) to predict gender by training a Support Vector
Machine (SVM) classifier on the word frequency data.
[Word cloud: lol, emoji, wtf, seriously, rtfm, dude, fuck, git]
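A sketch of how "word frequency data" might be derived from commit messages before it is handed to a classifier. The tokenization rule, the function names and the example messages are assumptions made for illustration; the talk does not describe its preprocessing.

```python
import re
from collections import Counter

def word_frequencies(messages):
    """Relative frequency of each lowercase word across a user's commit messages."""
    words = []
    for msg in messages:
        words.extend(re.findall(r"[a-z']+", msg.lower()))
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def to_vector(freqs, vocabulary):
    """Fixed-length feature vector that a classifier such as an SVM could consume."""
    return [freqs.get(w, 0.0) for w in vocabulary]

# Made-up messages and vocabulary, for illustration only.
messages = ["Fix typo in README", "fix build, seriously", "Add tests"]
vocab = ["fix", "add", "seriously", "lol"]
vector = to_vector(word_frequencies(messages), vocab)
print(vector)
```

Each user becomes one such vector, and the labelled vectors form the training set.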
45. Predictive Power of Model
[Chart: 50 % baseline fidelity, 15 %, 35 % error]
30 % information leakage
(with a very simple data set)
46. Takeaways
Algorithms will readily "learn"
discrimination from us if we provide
them with contaminated training
data.
Information leakage of protected
attributes can happen easily.
47. How we can fix this
Harder than you might think! We need to know X to
measure disparate impact and remove it
Incorporate penalty for discrimination into target
function
Remove information about X from dataset by
performing a suitable transformation (reduces
fidelity of model)
see e.g. "Certifying and Removing Disparate Impact", M. Feldman et al.
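As a very simplified stand-in for the kind of transformation mentioned above (this is not the repair procedure from Feldman et al.), one can shift a feature so that every group has the same mean. The sketch below assumes one numeric feature Z and a known binary X, and measures how well the best single-threshold rule can still guess X from Z.

```python
import random

def remove_group_means(values, groups):
    """Shift each group's values so that every group has the overall mean."""
    overall = sum(values) / len(values)
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] + overall for v, g in zip(values, groups)]

def best_threshold_accuracy(values, groups):
    """Accuracy of the best single-threshold rule that guesses X from the feature."""
    pairs = sorted(zip(values, groups))
    n = len(pairs)
    ones_total = sum(g for _, g in pairs)
    ones_below = zeros_below = 0
    best = max(ones_total, n - ones_total) / n      # constant guess
    for _, g in pairs:
        if g == 1:
            ones_below += 1
        else:
            zeros_below += 1
        # Guess X = 1 above the cut and X = 0 below it, or the other way round.
        acc = (zeros_below + ones_total - ones_below) / n
        best = max(best, acc, 1 - acc)
    return best

rng = random.Random(0)
groups = [rng.randint(0, 1) for _ in range(2000)]
z = [g + rng.gauss(0, 0.5) for g in groups]          # feature that leaks X

before = best_threshold_accuracy(z, groups)
after = best_threshold_accuracy(remove_group_means(z, groups), groups)
print(before, after)
```

After the shift, guessing X from Z is barely better than chance in this toy setting. The transformation itself needs X, which matches the point that X must be known in order to remove its influence.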
49. What is de-anonymization?
Use data recorded about individuals / entities
to identify those same individuals / entities in
another set of data (exactly or with high
likelihood).
Deanonymization becomes an increasing risk as datasets
about individual entities become larger and more detailed.
50. "Buckets of Truth"
N boolean attributes per entity - on average M < N of them
are set
$P_{\mathrm{col.}} = P(M_1^{1} = M_1^{2}, \cdots, M_N^{1} = M_N^{2})$
fun with deanonymization: http://en.akinato
51. Examples
uniform distribution: $P_{\mathrm{col.}} = (1 - 2p(1 - p))^N$
long-tailed distribution: $P_{\mathrm{col.}} = \,?$
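For the uniform case, the collision formula can be checked numerically. The sketch assumes that each of the N attributes is set independently with the same probability p (for example p = M/N); two entities then agree on one attribute with probability p² + (1 − p)² = 1 − 2p(1 − p). The values of p and N are arbitrary.

```python
import random

def collision_probability(p, n):
    """P_col = (1 - 2p(1-p))^N for N independent attributes, each set with probability p."""
    return (1 - 2 * p * (1 - p)) ** n

def simulated_collision_rate(p, n, trials, rng):
    """Fraction of random entity pairs whose N boolean attributes all match."""
    hits = 0
    for _ in range(trials):
        a = [rng.random() < p for _ in range(n)]
        b = [rng.random() < p for _ in range(n)]
        hits += a == b
    return hits / trials

rng = random.Random(0)
p, n = 0.1, 10
exact = collision_probability(p, n)
estimate = simulated_collision_rate(p, n, trials=20000, rng=rng)
print(exact, estimate)
```

Because the collision probability shrinks exponentially with N, every additional recorded attribute makes an individual's fingerprint more likely to be unique.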
61. Testing De-Anonymization
Use 75 % of the trajectories as prior data set
Predict the user ID belonging to the remaining
25 %
Measure average success probability and
identification rank (i.e. at which position is the
correct user)
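A toy version of this test, with several assumptions: the trajectories are synthetic (each hypothetical user mostly visits five favourite buckets), the 75 % / 25 % split is applied to each user's visits, and fingerprints are compared by histogram overlap. None of these details come from the talk.

```python
import random
from collections import Counter

def fingerprint(visits, n_buckets):
    """Relative visit frequency per bucket."""
    counts = Counter(visits)
    return [counts.get(b, 0) / len(visits) for b in range(n_buckets)]

def similarity(f, g):
    """Overlap of two frequency vectors (histogram intersection)."""
    return sum(min(a, b) for a, b in zip(f, g))

rng = random.Random(0)
n_users, n_buckets = 50, 100
prior, held_out = {}, {}
for user in range(n_users):
    favourites = rng.sample(range(n_buckets), 5)
    visits = [rng.choice(favourites) if rng.random() < 0.8
              else rng.randrange(n_buckets) for _ in range(200)]
    cut = int(0.75 * len(visits))
    prior[user] = fingerprint(visits[:cut], n_buckets)     # 75 % prior data
    held_out[user] = fingerprint(visits[cut:], n_buckets)  # 25 % to identify

ranks = []
for user, probe in held_out.items():
    # Order all known users by similarity to the anonymous probe.
    ordered = sorted(prior, key=lambda u: similarity(probe, prior[u]), reverse=True)
    ranks.append(ordered.index(user) + 1)

success_rate = sum(r == 1 for r in ranks) / n_users
mean_rank = sum(ranks) / n_users
print(success_rate, mean_rank)
```

`success_rate` is the share of users identified at the first position, and `mean_rank` is the average position of the correct user in the ordered candidate list.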
64. Possible Improvements
Use Temporal / Sequence Information
Use speed of movement / mode of transportation
Improve choice of buckets for fingerprinting
Interesting Review Article: "Life in the network: the coming age of computational social science." D. Laze
65. Summary
The more data we have, the more difficult it is
to keep algorithms from directly learning and
using object identities instead of attributes.
Our data follows us around!
67. As Data Scientists / Analysts /
Programmers
Consume data responsibly: Don't include everything
under the sun just because it increases fidelity by a
slim margin
Check for disparate impact and remove it from the
input data
Test anonymization safety by using machine learning
68. As Citizens / Hackers / Users
Do not blindly trust decisions made by algorithms
Test them if possible (using different input values)
Reverse-engineer them (using e.g. active learning)
Fight back with data: Collect and analyze
algorithm-based decisions using collaborative
approaches
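Testing an algorithm with different input values can be as simple as changing one field of a record and checking whether the decision changes. In the sketch below, `toy_scorer` is a hypothetical stand-in for an opaque decision service; the field names and values are invented.

```python
def decision_flips(black_box, record, field, alternatives):
    """Return the alternative values of `field` that change the decision for `record`."""
    baseline = black_box(record)
    flips = []
    for value in alternatives:
        probe = dict(record, **{field: value})   # copy with one field changed
        if black_box(probe) != baseline:
            flips.append(value)
    return flips

# Hypothetical black box standing in for an opaque scoring service.
def toy_scorer(r):
    return r["income"] > 30000 and r["postcode"] != "12345"

record = {"income": 40000, "postcode": "12345"}
flips = decision_flips(toy_scorer, record, "postcode", ["54321", "12345"])
print(flips)
```

If a field that should be irrelevant flips the decision, that is a hint that the algorithm uses it, directly or as a proxy.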
69. As a Society
Create better regulations for algorithms and their
use
Force companies / organizations to open up black
boxes
Make access to data easier, also for small
organizations
76. Case Study: Click Rate Optimization
Simple but common use case for big data: Collaborative
filtering
• Users have an opinion on a given topic A (between 0 and 1)
• They are more likely to like articles that confirm their
opinion
• Our algorithm knows nothing about A, just tries to
optimize click rate
• User opinion may change over time according to the
content they are exposed to (2 % change per exposure)
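A rough simulation of this setup, with simplifications: instead of collaborative filtering, a per-user epsilon-greedy rule chooses between two article stances (0 or 1) based only on observed likes, and "2 % change per exposure" is read as moving the opinion 2 % of the way towards the stance shown. Both are assumptions, as are the exploration rate and the starting opinions.

```python
import random

def simulate_user(opinion, steps, rng, explore=0.1, shift=0.02):
    """Show one user articles of stance 0 or 1, optimizing the observed like rate."""
    shown = [0, 0]
    liked = [0, 0]
    for _ in range(steps):
        if 0 in shown or rng.random() < explore:
            stance = rng.randrange(2)            # explore
        else:
            # Exploit: pick the stance with the higher like rate so far.
            stance = 0 if liked[0] / shown[0] >= liked[1] / shown[1] else 1
        # Users are more likely to like articles that confirm their opinion.
        p_like = 1 - abs(opinion - stance)
        shown[stance] += 1
        liked[stance] += rng.random() < p_like
        # Exposure moves the opinion 2 % of the way towards the article's stance.
        opinion += shift * (stance - opinion)
    return opinion

rng = random.Random(0)
start = [rng.uniform(0.2, 0.8) for _ in range(200)]
end = [simulate_user(a, 1000, rng) for a in start]

polarized_start = sum(abs(a - 0.5) > 0.3 for a in start) / len(start)
polarized_end = sum(abs(a - 0.5) > 0.3 for a in end) / len(end)
print(polarized_start, polarized_end)
```

The selection rule never sees the opinion itself, yet in this toy model most users drift towards one of the two extremes, because the articles that earn likes are the ones that reinforce the opinion a user already leans towards.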
81. Clustering users into groups
Similarity measure: # Articles that both users like or dislike
Clustering: K-Means (minimize distance within clusters, maximize distance betw
82. Like Rate vs. Articles Viewed
[Plot: like rate vs. articles viewed, with click-rate optimization]