Do you know if Alexa is lying to you? - Leeds Digital 2018
1. LEEDS / LONDON / GIBRALTAR
WWW.HOMEAGENCY.CO.UK
2. Do you know if Alexa is lying to you?
Neill Horie
Head of Artificial Intelligence Optimisation
https://bit.ly/lyingalexa
3. Flaws/experiments
For the last year we have been continually experimenting with the AI
agents, asking them a variety of questions to see how they fare.
4.
5. This is an apple. You probably recognise it, even if, like me, you don’t like fruit. If you ask the AI agents what an apple is, they will tell you.
6. “apple”: the round fruit of a tree of the rose family, which typically has thin green or red skin and crisp flesh.
Google will give this answer even if you ask “what is apple”, expecting the US company, because Google prioritises quick answers/SCRBs over the Knowledge Graph. Definitions always win.
7.
8.
9. Similarly, if you were Mars the company, you’d want Google to tell customers about you when they ask “what is mars”. Based on the previous slide, you might think it’ll talk about the planet or the Roman god of war, though.
10. But you’d be wrong! Definitions always win.
“mar”: impair the quality or appearance of; spoil.
Google has actually changed this now, but for most of the last year, this was the response.
If we come up with a new brand or product, we need to consider how assistants will describe us – if at all.
11.
12. If you ask about the Last King of Prussia, only Cortana and Google answer.
Cortana’s answer is based on a web result, and only Google understands that the
last entry in the Wikipedia series of Kings of Prussia is therefore the last
King of Prussia.
13.
14. Having gone through a few of my favourite examples, here are some more top-level stats.
Take these stats with a pinch of salt: they’re based on our eclectic set of questions, which we’ve been asking for a year so we can keep track of what’s changing, but they should give you a flavour of things.
17. a) 46% b) 66%
c) 76% d) 96%
Amazon Alexa answers 66% of our questions correctly. I don’t count search
results as answers, as they’re not precise enough, and obviously wrong answers
aren’t correct either. The figure is this low because Alexa says it doesn’t
know 25% of the time.
18.
19.
20. Cortana answers correctly 76% of the time, but actually responds about
98% of the time. Often those responses are just search results, though.
23. a) 38% b) 48%
c) 78% d) 98%
Google answers correctly 78% of the time, and when it does answer, those
answers are 97% correct, so it’s mainly the questions it doesn’t know at all
that let it down. This used to be worse, but they’ve fixed more of the
situations like “what is mars”.
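As a quick sanity check of how those two Google figures fit together, here’s a back-of-the-envelope calculation (my own arithmetic, not something from the talk): the overall correct rate is the rate at which Google attempts an answer, multiplied by the precision of those attempted answers.

```python
# Back-of-the-envelope check of the Google figures above (my own
# arithmetic, not part of the talk).
overall_correct = 0.78   # correct answers / all questions asked
precision = 0.97         # correct answers / attempted answers

# overall_correct = answer_rate * precision, so:
answer_rate = overall_correct / precision
print(f"Estimated answer rate: {answer_rate:.0%}")  # roughly 80%
```

The same identity fits the Alexa figures: 66% overall correct at roughly 90% precision implies Alexa attempts an answer about three-quarters of the time, which squares with its 25% “don’t know” rate.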
26. a) 34% b) 54%
c) 64% d) 84%
Siri only answers correctly 34% of the time, which is awful. Almost half of
the time it responds with a search result, and almost half of its actual
answers are wrong, which is a big problem.
27.
28. For a little while in 2016/2017, if you asked Google who the King of the
United States was, it would tell you it was Barack Obama. This came from a
Breitbart source used as a quick answer.
29.
30. Let’s talk about these sources. Wikipedia is an obvious one: lots of
information comes from it, either directly in Google’s quick answers or
indirectly via the Bing and Google Knowledge Graphs.
31.
32. This is Wolfram Alpha, which is used as the knowledge base behind a lot of
Siri’s answers. We can query Wolfram Alpha’s website ourselves and verify much
of it, but it’s still a source most people don’t think of.
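Since Wolfram Alpha can be queried directly, here’s a minimal sketch of building a request URL for its public v2 Query API. This is my own illustration, not something from the talk: the endpoint and parameters are real, but `DEMO-APP-ID` is a placeholder for an app ID you’d register yourself, and `build_query_url` is a hypothetical helper name.

```python
# Sketch of querying Wolfram Alpha programmatically. The v2 Query API
# endpoint is real; "DEMO-APP-ID" is a placeholder -- you need your own
# app ID from the Wolfram Alpha developer portal.
from urllib.parse import urlencode

APP_ID = "DEMO-APP-ID"  # placeholder app ID

def build_query_url(question: str) -> str:
    """Build a Wolfram Alpha v2 Query API URL for a plain-text question."""
    params = urlencode({
        "appid": APP_ID,
        "input": question,
        "format": "plaintext",  # ask for plain-text result pods only
    })
    return "https://api.wolframalpha.com/v2/query?" + params

print(build_query_url("who was the first king of england"))
```

Fetching that URL (with a valid app ID) returns an XML document of result “pods” you can check against what Siri says.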
33.
34. This is Evi, a knowledge company bought by Amazon a few years ago. In
hindsight it’s a little embarrassing that we had to work out for ourselves
that this was where Alexa got some of its answers, but it is. Unfortunately,
you can no longer query it directly: the answers are opaque and impossible
to interrogate.
35.
36. This leads to interesting issues, where Alexa claims that King Ecgbert (of
Wessex) was the first King of England…
38. Image Source: Odeja from Wikipedia
…when actually it was this chap, Æthelstan.
These might seem like obscure examples, but if the assistants can be wrong
about even these basic facts, don’t assume they will understand your website
and product straight away.
39.
40. Similarly, if you asked who the next King of England will be, Google
sometimes used to tell you it would be Prince William, because Queen
Elizabeth said so (based on a quick answer derived from a YouTube comment).
41.
42. On a different note, if you ask whether Brexit was a good thing, you’ll
probably get told yes, because the people who think it was are the ones who
phrase the question that way rather than asking the opposite. As marketers,
we need to think carefully about our phrasing, almost like old-school SEO.
45. KNOW QUESTIONS
ANSWER QUESTIONS
BUILD ASSISTANTS
We are the gatekeepers of content. From all of our work on SEO, we should know what content our users need, what they’re trying to find out, and what AI should tell them. Make sure all of it is consistent and accurate.
46.
47.
48. Anecdotally, users often trust what they’re told rather than what they see in search results. If an AI assistant tells the user the wrong thing about your brand, the problems are obvious.
50. Appendix
Artificial Intelligence Optimisation
https://bit.ly/europeanswallow - Presentation
https://bit.ly/aiogoldrush - Blog post
AI Assistant Experiments
http://bit.ly/aioexperiments - Blog post
Can Google Tell Us The Truth?
http://bit.ly/nonfacts - Article
Alexa Versus Google
https://bit.ly/alexavgoogle - Video
Alexa Versus Accents
https://bit.ly/alexavaccents - Video
Editor's Notes
This is an apple. You probably all recognise it. I barely do; as people know, I don’t like fruit much. If you ask the AI agents what “an apple” is, obviously, they can tell you.
But if you ask Google what “apple” is, not, “an apple”, it still tells you this, as it prioritises the quick answers / SCRBs, especially nouns, so much.
This is what it looks like in the wild.
Now you might think if I asked about Mars, it’d tell me about the Roman God of War, or the planet, before the chocolate bar or company.
If we ask about the Last King of Prussia, only Cortana and Google can answer (and until recently, only Google). This is likely due to them not being able to interpret the series – Google pulls this from the last entry in the “Kings of Prussia” series in Wikipedia, so seems to understand that the last entry in a series is the “last”.
Now I’ve given you a few examples, I’m going to give you some more top-level stats of how well different AI agents currently answer our selection of questions. We have a list of questions we’ve been asking for a year which we use to compare, so it’s not a rigorously scientific set, but it should hopefully give you a flavour. The next stats are of how well agents are able to answer our questions: so not getting it wrong, and not just giving a search result. Incidentally, this section was going to be called pub quiz, and this is the image I was shown when I searched for “pub”, and it seemed so fitting that I had to keep it.
Doesn’t know ~25% of the time
This stat is so bad because it simply answers that it doesn’t know ~25% of the time. It’s correct when it does answer, though, at roughly 90%.
Cortana responds with something almost all of the time, but they’re normally just search results, which may or may not be right when you dig into it.
Worth noting, though, that when Google does answer, it’s right 97% of the time – it’s only weird ones like “mar” where it’s wrong, and Google has got better over time.
Yes, it’s really this bad: half of the time it takes you to search results with no further info, and when it does actually answer, it’s wrong a further half of the time!
King of the United States
The United Kingdom has at least one king.
King Ecgbert – King of Wessex
King Æthelstan, actual first King of England
Who will be the next King of England? Prince William, because Queen Elizabeth says so! (YouTube)