Introduction to machine learning - Ray Poynter - NewMR webinar 2019

An Introduction to Machine Learning for Insight Professionals
Ray Poynter, NewMR. Here are the ten things you need to know about machine learning and its application to the gathering and utilisation of insights.

1. An Introduction to Machine Learning for Insight Professionals: 10 things you need to know. June 2019. Ray Poynter, NewMR
2. #NewMR Sponsors, June 2019: Communication, Gold, Silver
3. 10 Tips
   1. Supervised machine learning needs a training set
   2. The model is a black box
   3. Brand new inputs confuse
   4. When the world changes, models can fail
   5. The inputs need to define the outputs
   6. The training set has to be big enough
   7. ML can learn and embody bias
   8. Unsupervised machine learning is not like supervised ML
   9. Intelligent interviews apply algorithms and/or lists
   10. There are five key links between ML/AI and MR
4. 1 – Supervised machine learning needs a training set
   [Diagram: Training inputs and outputs -> Training algorithm -> Trained algorithm; Live inputs -> Trained algorithm -> Outputs]
5. Supervised Learning Example
   Training data: 1 -> Odd, 2 -> Even, 1 -> Odd, 3 -> Odd, 2 -> Even
   The learning algorithm discovers there are two outcomes, Odd and Even.
   • 1 and 3 result in Odd
   • 2 results in Even
6. Supervised Learning Example
   Training data: 1 -> Odd, 2 -> Even, 1 -> Odd, 3 -> Odd, 2 -> Even, 1 -> Odd, 4 -> Even, 1 -> Even, 1 -> Odd
   There are still two outcomes, Odd and Even.
   • 3 results in Odd
   • 2 and 4 result in Even
   • 1 usually results in Odd
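The "1 usually results in Odd" idea can be sketched as a model that simply counts the labels seen for each input. This is a minimal illustration, not the method used in the webinar; the function names `train` and `predict` are assumptions made for the example.

```python
from collections import Counter, defaultdict

# Training pairs from the slide, including the one case where 1 is labelled Even.
training = [(1, "Odd"), (2, "Even"), (1, "Odd"), (3, "Odd"), (2, "Even"),
            (1, "Odd"), (4, "Even"), (1, "Even"), (1, "Odd")]

def train(pairs):
    """Count how often each label was seen for each input value."""
    counts = defaultdict(Counter)
    for value, label in pairs:
        counts[value][label] += 1
    return counts

def predict(model, value):
    """Return (most common label, its share of the cases), or None if unseen."""
    if value not in model:
        return None
    label, n = model[value].most_common(1)[0]
    return label, n / sum(model[value].values())

model = train(training)
print(predict(model, 1))  # ('Odd', 0.8): four of the five 1s were labelled Odd
print(predict(model, 4))  # ('Even', 1.0)
```

The model never learns what "odd" means; it only reproduces what the training set said, conflicting label included.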
7. Supervised Learning Example
   Training data: 1 -> Odd, 2 -> Even, 1 -> Odd, 3 -> Odd, 2 -> Even, 1 -> Odd, 4 -> Even, 1 -> Even, 1 -> Odd, 14 -> Even, 13 -> Odd, 24 -> Even, 301 -> Odd
   There are still two outcomes, Odd and Even.
   Two possible models (at least):
   A. 1, 3, 13 & 301 are probably Odd; 2, 4, 14 & 24 are Even
   B. Numbers ending in 1 or 3 are probably Odd; numbers ending in 2 or 4 are Even
8. 2 – Supervised Learning: You usually don’t know how the model works
   Training data: 1 -> Odd, 2 -> Even, 1 -> Odd, 3 -> Odd, 2 -> Even, 1 -> Odd, 4 -> Even, 1 -> Even, 1 -> Odd, 14 -> Even, 13 -> Odd, 24 -> Even, 301 -> Odd
   You do not know if this training produced a model like A or B, or something different.
   A. 1, 3, 13 & 301 are probably Odd; 2, 4, 14 & 24 are Even
   B. Numbers ending in 1 or 3 are probably Odd; numbers ending in 2 or 4 are Even
   C. ?
9. 3 – Supervised Learning: Brand new inputs confuse the model
   Training data: 1 -> Odd, 2 -> Even, 1 -> Odd, 3 -> Odd, 2 -> Even, 1 -> Odd, 4 -> Even, 1 -> Even, 1 -> Odd, 14 -> Even, 13 -> Odd, 24 -> Even, 301 -> Odd
   If the next input is a 7 the system will either:
   1. Tell you that it can’t allocate it
   2. Give you the wrong answer
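Both failure modes for the input 7 can be reproduced with two toy models trained on the slide's data. This is only a sketch: `lookup_predict` (a model of type A) and `nearest_predict` (a nearest-neighbour rule, chosen here purely for illustration) are hypothetical names and are not taken from the presentation.

```python
# Training pairs from the slide.
training = [(1, "Odd"), (2, "Even"), (1, "Odd"), (3, "Odd"), (2, "Even"),
            (1, "Odd"), (4, "Even"), (1, "Even"), (1, "Odd"),
            (14, "Even"), (13, "Odd"), (24, "Even"), (301, "Odd")]

def lookup_predict(value):
    """Model A: answer only for values seen in training, by majority label."""
    labels = [label for seen, label in training if seen == value]
    if not labels:
        return None  # the model cannot allocate this input
    return max(set(labels), key=labels.count)

def nearest_predict(value):
    """Illustrative alternative: copy the label of the closest training value."""
    _, label = min(training, key=lambda pair: abs(pair[0] - value))
    return label

print(lookup_predict(7))   # None: 7 never appeared in the training set
print(nearest_predict(7))  # Even: the closest training value is 4, so the answer is wrong
```

Neither model has learned the concept of odd and even, so a genuinely new input is either refused or guessed from whatever happens to be nearby.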
10. 4 – Supervised Learning: When the world changes, the model can fail
    Training data (Utopia Island ferry passengers): Sunny 100, Cloudy 50, Rainy 10, Sunny 120, Sunny 110, Rainy 15, Cloudy 45, Rainy 10
    Observed: Sunny 30, Sunny 25, Cloudy 15, Rainy 8
    The model learns how to predict the number of passengers from the weather (perhaps also using day of the week and season). After the bridge is opened, a new model has to be trained.
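The ferry example can be sketched with a model that predicts the average passenger count for each type of weather. This is an assumption made for illustration (the slide does not say which algorithm is used), and the names `train` and `mean_abs_error` are hypothetical.

```python
from statistics import mean

# Passenger counts from the slide: training data, then counts observed afterwards.
before = [("Sunny", 100), ("Cloudy", 50), ("Rainy", 10), ("Sunny", 120),
          ("Sunny", 110), ("Rainy", 15), ("Cloudy", 45), ("Rainy", 10)]
after = [("Sunny", 30), ("Sunny", 25), ("Cloudy", 15), ("Rainy", 8)]

def train(rows):
    """Predict passengers as the average seen for each type of weather."""
    by_weather = {}
    for weather, passengers in rows:
        by_weather.setdefault(weather, []).append(passengers)
    return {weather: mean(counts) for weather, counts in by_weather.items()}

def mean_abs_error(model, rows):
    """Average gap between the model's prediction and the actual count."""
    return mean(abs(model[weather] - passengers) for weather, passengers in rows)

model = train(before)
error_before = mean_abs_error(model, before)
error_after = mean_abs_error(model, after)
print(f"Average error on the training data: {error_before:.1f}")
print(f"Average error on the later data: {error_after:.1f}")
```

The model is unchanged between the two measurements; its error grows only because the relationship between weather and passenger numbers has changed.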
11. 5 – Supervised Learning: The outputs have to be defined by the inputs
    Training data (prefers Coke, Pepsi, Neither):
      Male, 33 years, Single -> Pepsi
      Female, 24 years, Married -> Coke
      Female, 55 years, Single -> Coke
      Female, 33 years, Divorced -> Neither
      Male, 33 years, Divorced -> Neither
    There are too many other factors which impact Coke, Pepsi, Neither preference.
12. 6 – Supervised Learning: The training set has to be big enough
    Training data (Utopia Island ferry passengers):
      I liked the boat ride, the scenery was great -> Positive
      The fumes were awful -> Negative
      I had toothache so I did not enjoy it -> Neutral
      Abcdh -> NA
      The cliffs were enormous -> Positive
      The queues were enormous -> Negative
    You are likely to use a two-step approach:
    1. Use an algorithm that can interpret open-ended text
    2. Train it with a training set of 1,000 to 5,000 cases coded by a human
    The more complex the relationship, the bigger the training data set needs to be.
13. 6 – Supervised Learning: The training set has to be big enough
    Produces training sets:
    • Coding open-ends, with training sets of, say, 1,000 to 5,000
    • Coding open-ends for tracking projects
    • Linking stimuli to predicted share, after 1,000+ tests
    • Linking surveys to bad response patterns
    More problematic:
    • Turning data into reports: need 1,000+ similar data & output reports
    • Turning research briefs into surveys: 1,000+ similar cases
    • Linking survey responses to market outcomes: need the market outcomes for 1,000+ cases, for a specific country and category
    In market research there will be many situations where a training set can’t be created, either because the number of cases is too small, or because the outcomes aren’t known in sufficient detail.
14. 7 – Supervised Learning: ML can learn and embody bias
    Training data (tech company):
      Male, Average -> Interview
      Male, Poor -> Reject
      Female, Excellent -> Interview
      Female, Poor -> Reject
      Female, Average -> Reject
      Male, Excellent -> Interview
      Male, Average -> Interview
      Female, Average -> Reject
    In 2018 it was reported that Amazon had scrapped its AI hiring system because it was biased against women. The system had learned what the company did in the past and built it into the model.
    The model:
      Male, Excellent -> Interview
      Male, Average -> Interview
      Female, Excellent -> Interview
      ELSE -> Reject
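How a model picks up this bias can be sketched with a learner that copies the most common past decision for each combination of gender and rating. This is an illustration built on the slide's toy data only; it is not a description of Amazon's actual system, and the name `train` is hypothetical.

```python
from collections import Counter, defaultdict

# Past hiring decisions from the slide: (gender, rating, decision).
history = [("Male", "Average", "Interview"), ("Male", "Poor", "Reject"),
           ("Female", "Excellent", "Interview"), ("Female", "Poor", "Reject"),
           ("Female", "Average", "Reject"), ("Male", "Excellent", "Interview"),
           ("Male", "Average", "Interview"), ("Female", "Average", "Reject")]

def train(rows):
    """Learn the most common past decision for each (gender, rating) pair."""
    votes = defaultdict(Counter)
    for gender, rating, decision in rows:
        votes[(gender, rating)][decision] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

model = train(history)
print(model[("Male", "Average")])    # Interview
print(model[("Female", "Average")])  # Reject
```

Nothing in the code mentions bias. The different treatment of equally rated candidates comes entirely from the decisions in the training data.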
15. 8 – Unsupervised Machine Learning: Is not like supervised machine learning
    • No training set, just real data
    • Finds patterns in the data
    • Directed and evaluated by the user
16. 8 – Unsupervised Learning: Is not like supervised learning
    • No training set, just real data
    • Finds patterns in the data
17. 8 – Unsupervised Learning: Is not like supervised learning
    K-means clustering is a simple example of unsupervised machine learning.
18. 8 – Unsupervised Learning: Is not like supervised learning
    When you do cluster analysis you specify:
    1. The variables
    2. The algorithm
    3. The number of clusters
    4. The criteria for clustering
    5. And you check them with a human
    With topic modelling of text, you specify:
    1. The text
    2. The algorithm (including NLP, bag of words, stemming etc)
    3. The number of topics
    4. The criterion for clustering
    5. And you check them with a human
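The choices listed for cluster analysis can be seen in a very small k-means sketch. The data values, the one-dimensional setting, the evenly spaced starting centres and the function name `k_means` are all assumptions made for illustration; real projects would normally use a statistics package.

```python
from statistics import mean

def k_means(values, k, iterations=20):
    """Tiny one-dimensional k-means. Returns (centres, clusters)."""
    ordered = sorted(values)
    # Assumption: start from k points spread evenly through the sorted data.
    centres = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Step 1: give each value to its nearest centre (the clustering criterion).
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centres[i]))
            clusters[nearest].append(value)
        # Step 2: move each centre to the mean of its cluster (keep it if empty).
        new_centres = [mean(c) if c else centres[i] for i, c in enumerate(clusters)]
        if new_centres == centres:
            break  # nothing moved, so the solution is stable
        centres = new_centres
    return centres, clusters

# Hypothetical variable: ages of ten respondents. There is no training set and no labels.
ages = [18, 21, 22, 25, 41, 44, 47, 63, 66, 70]
centres, clusters = k_means(ages, k=3)  # the user chooses the number of clusters
print(clusters)
```

The algorithm finds the groups, but the user chose the variable, the value of k and the distance rule, and still has to judge whether the resulting clusters make sense.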
19. 9 – Intelligent Interviews: Apply algorithms and/or lookup lists
    Algorithm example:
      Do you prefer A or B? (winner = W1)
      Do you prefer C or D? (winner = W2)
      Do you prefer W1 or W2?
    Lookup list example:
      What did you like about the advert?
      [IF response includes fruit*, apple*, orange* THEN Probe]
    Voice interview, e.g. via Alexa:
      What did you like about the advert?
      [interpret voice response; IF response includes fruit*, apple*, orange* THEN Probe]
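Both interview rules can be sketched in a few lines. This is a hedged illustration: the function names `needs_probe` and `knockout` are hypothetical, the wildcard `fruit*` is assumed to mean "any word starting with fruit", and `prefer` stands in for asking the respondent a question.

```python
import re

# Lookup list from the slide: fruit*, apple*, orange*
PROBE_PREFIXES = ("fruit", "apple", "orange")

def needs_probe(response):
    """True if any word in the response starts with one of the listed prefixes."""
    words = re.findall(r"[a-z']+", response.lower())
    return any(word.startswith(PROBE_PREFIXES) for word in words)

def knockout(options, prefer):
    """A vs B, then C vs D, then the two winners against each other."""
    w1 = prefer(options[0], options[1])
    w2 = prefer(options[2], options[3])
    return prefer(w1, w2)

print(needs_probe("I loved the juicy oranges"))  # True, so the interview probes
print(needs_probe("The music was catchy"))       # False, so no probe

# Stand-in respondent who always picks the later letter.
print(knockout(["A", "B", "C", "D"], prefer=lambda a, b: max(a, b)))  # D
```

Nothing here is learned from data. The interview looks intelligent because someone wrote the list and the rule in advance, which is the point of the slide.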
20. 10 – Five key links between ML/AI & Market Research
    1. Supervised machine learning: common tasks where the outcome is known and defined
    2. Unsupervised machine learning: suggesting possible solutions and patterns in the data
    3. Expert systems: where we can define what we do, for example project design, data cleaning, project management
    4. Adaptive/intelligent surveys: where an algorithm can guide adaptations
    5. Leveraging systems built with AI: text analysis, automatic voice-to-text, translation etc
21. Summary of Machine Learning & MR
    • Supervised machine learning needs a training set, so it is mostly about things that produce repetition
    • Unsupervised machine learning is mostly about investigating data
    • Expert systems relate to things where we can define either the answers (e.g. lookup lists) or the processes (e.g. pick the most popular two options and probe)
    • Often the main use of AI is to use something that has been created by AI (e.g. voice-to-text transcription), rather than to create new AI solutions