3. What Makes a Good Model?
Team
Grant
dammnit I'm lit.
&dammnit I kn0
ders b0ut2be kiLLer
traFFic! & ya d0nt
even kn0 h0w haPPy I
am dats its back2sch00l
Or, Twitter Sentiment Analysis:
using models to classify tweets
so you don’t have to
4. a good model is
1. Valuable
2. Accurate
3. Sophisticated
4. Agile
11. of the 23% the model got wrong…
model error 41%
neutral 30%
human error 15%
other 13%
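The percentages on this slide are shares of the misclassified 23%, not of all tweets. A minimal sketch of the arithmetic, using only the figures shown on the slide (the variable names are ours), converts each category into a share of the whole test set:

```python
# Figures taken from the slide: 23% of tweets were misclassified,
# and the misclassified tweets break down into four categories.
error_rate = 0.23
breakdown = {
    "model error": 0.41,
    "neutral": 0.30,
    "human error": 0.15,
    "other": 0.13,
}

# Convert each category from "share of errors" to "share of all tweets".
share_of_all = {name: error_rate * share for name, share in breakdown.items()}

for name, share in share_of_all.items():
    print(f"{name}: {share:.1%} of all tweets")
```

The four categories sum to 99%, which is consistent with rounding on the slide. Read this way, genuine model error accounts for roughly 9% of all tweets.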
You ever have those days
where you feel like you = FAIL.
Yeah. It's one of those days.
Model + / Human -
UP is intense! i cried
and laughed
Model - / Human +
Sorry, typo -
Environmentalism.
Model - / Human +
@Zee It's good, but buggy
like a motherfucker.
Model + / Human -
I really hate twitter... i don't
know what i'm doing here
Model - / Human +
so tierd could drop
DEAD x
Model - / Human +
ActiveRecord::HasManyThroughSourceAssoc
iationMacroError: Invalid source reflection
macro :has_one for has_many ->
http://bit.ly/135UWH
Model + / Human -
@Dichenlachman I like that you
abbreviated bathrooms to b'throoms when
b'throoms is the same no. of letters as
bathrooms... Bathrooms
Model - / Human +
15. 1 raw tweet
MADD-E. it huuuurt like a MOTHERFUCKER fml
& i’m not that short & yr tall & i
did grow some balls & date night tonight!1!
http://bit.ly/Nos9D
2
MADD-E. it hurt like a MOTHERFUCKER fuck
my life & I am not that short & yr tall & i did grow
some balls & date night tonight!1!
http://bit.ly/Nos9D
3
expand contractions
social media lexicon
corrected XML
repeat replace
spellcheck
remove punctuation
remove numbers
all lowercase
made it hurt like a motherfucker fuck my life & i
am not that short & your tall & i did grow some
balls & date night tonight htp bit ly/nos
4 uni-grams
bi-grams { made, it, made it, … }
5 vectorize [ 0 0 1 0 0 … 0 0 1 0 0 1 ]
6 model
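The cleaning steps named on this slide can be sketched in a few lines of Python. This is an illustration under assumptions, not the team's code: the two lookup tables are tiny hypothetical stand-ins for a real contraction list and social media lexicon, the repeat rule (collapse any run of three or more identical characters to one) is our guess at "repeat replace", and the XML-correction and spellcheck steps are left out.

```python
import re

# Hypothetical, deliberately tiny lookup tables for illustration.
CONTRACTIONS = {"i'm": "i am", "don't": "do not"}
LEXICON = {"fml": "fuck my life"}


def normalize(tweet: str) -> str:
    """Apply a subset of the slide's cleaning steps, in the slide's order."""
    words = []
    for word in tweet.replace("’", "'").split():
        word = CONTRACTIONS.get(word.lower(), word)  # expand contractions
        word = LEXICON.get(word.lower(), word)       # social media lexicon
        words.append(word)
    text = " ".join(words)
    text = re.sub(r"(.)\1{2,}", r"\1", text)  # repeat replace: huuuurt -> hurt
    text = re.sub(r"[^A-Za-z\s]", " ", text)  # remove punctuation and numbers
    return " ".join(text.lower().split())     # all lowercase, tidy whitespace


print(normalize("it huuuurt & i'm not that short!1!"))
# prints: it hurt i am not that short
```

Because spellcheck is skipped, tokens such as "yr" pass through unchanged here, whereas the slide's final text shows "your".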
16. why didn’t we do other cool NLP stuff?
[Chart: accuracy (axis from 0.74 to 0.78) for each way of processing tweets: what we did, english only, remove Twitter symbols, remove stopwords, stem]
27. 1. genetically diverse
2. ensemble can handle more libraries / classifiers
3. modular design
a) NLP
b) feature detection
c) models
4. sequential checks
5. quick enough to classify the firehose
6. easily incorporate new cases for re-training
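One way to read the ensemble and modular design points is that each classifier sits behind the same small interface, so new ones can be added without touching the rest. The sketch below assumes simple majority voting, which the slides do not confirm, and uses hypothetical keyword classifiers as placeholders for real models.

```python
from collections import Counter
from typing import Callable, List, Set

# Every classifier is just a function from tweet text to "+" or "-".
Classifier = Callable[[str], str]


def keyword_classifier(positive: Set[str], negative: Set[str]) -> Classifier:
    """Build a toy classifier from two word lists (placeholder for a real model)."""
    def classify(text: str) -> str:
        words = set(text.lower().split())
        score = len(words & positive) - len(words & negative)
        return "+" if score >= 0 else "-"
    return classify


def ensemble_vote(classifiers: List[Classifier], text: str) -> str:
    """Return the label chosen by most classifiers (use an odd number to avoid ties)."""
    votes = Counter(clf(text) for clf in classifiers)
    return votes.most_common(1)[0][0]


classifiers = [
    keyword_classifier({"good", "love"}, {"hate"}),
    keyword_classifier({"happy"}, {"hate", "tired"}),
    keyword_classifier({"like"}, {"hate", "buggy"}),
]
print(ensemble_vote(classifiers, "i really hate twitter"))  # prints: -
```

Adding a classifier from another library would only mean wrapping it in a function with the same signature and appending it to the list.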
Editor's Notes
Stas
Stas
What it says.
Stas
Order is important.
Stas
Stas
Stas
Qahir
Qahir
Qahir
Data went through Google’s pre-trained sentiment detector
It was terrible
Then a model was trained using Google’s algorithms
Better, but not as good as our model
Qahir
Confusion matrix
Reiterating results
Rounding errors
Symmetry indicates a lack of bias
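As we read it, the symmetry point means the two off-diagonal cells of the confusion matrix are about equal: positives get mislabelled as negative about as often as the reverse. A minimal sketch of that check follows; the labels are made up for illustration and are not the team's data.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels=("+", "-")):
    """Rows are actual labels, columns are predicted labels."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Made-up labels for illustration only.
actual = ["+", "+", "+", "-", "-", "-", "+", "-"]
predicted = ["+", "+", "-", "-", "-", "+", "+", "-"]

matrix = confusion_matrix(actual, predicted)
false_negatives = matrix[0][1]  # actual +, predicted -
false_positives = matrix[1][0]  # actual -, predicted +

print(matrix)                              # prints: [[3, 1], [1, 3]]
print(false_negatives == false_positives)  # prints: True
```

A model biased toward one class would show one off-diagonal cell clearly larger than the other.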
Chad
[slide can be modified to be all percentages if required]
We got our hands dirty and looked at the 23% of tweets the model got wrong
41% due to model error
[first tweet] makes sense; strong negative sentiment
[second tweet] the model failed
30% were neutral
[first tweet] depending on your political views…
[second tweet] contains two distinct sentiments.
15% were grad student error
[first tweet] clear negative sentiment, labelled positive
[second tweet] clear negative sentiment again, labelled positive
13% were for other reasons
[first tweet] some sort of lookup error
[second tweet] uh…
Chad
Chad
Chad
Chad
Read raw tweet
Expanded contractions, like I’m to I am
Social media lexicon dealt with acronyms like fml
Corrected XML extraction errors
Dealt with repeated characters.
Then spellcheck
Removed punctuation and numbers
Normalized case
Created one and two word features
Finally vectorised features into ones and zeroes
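The last two steps in these notes, building one- and two-word features and turning them into ones and zeroes, can be sketched as follows. The function names are ours and the vocabulary is built from two toy strings; this is an illustration of binary n-gram vectorization, not the team's implementation.

```python
def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(text):
    """Set of uni-grams and bi-grams found in an already-cleaned tweet."""
    tokens = text.split()
    return set(ngrams(tokens, 1)) | set(ngrams(tokens, 2))


def build_vocabulary(texts):
    """Map every feature seen in the training texts to a fixed column index."""
    vocab = sorted(set().union(*(features(t) for t in texts)))
    return {term: idx for idx, term in enumerate(vocab)}


def vectorize(text, vocabulary):
    """1 if the feature occurs in the tweet, else 0, in vocabulary order."""
    present = features(text)
    return [1 if term in present else 0 for term in vocabulary]


vocabulary = build_vocabulary(["made it hurt", "date night tonight"])
vector = vectorize("made it", vocabulary)
print(len(vocabulary), sum(vector))  # prints: 10 3
```

Features that never appeared in the training texts are simply ignored at vectorization time, so every tweet maps to a vector of the same length.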