11. Gendered pronouns
female +1: she, her, hers
male +1: he, his, him
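The +1 scoring above can be sketched as a simple token count. This is a minimal illustration, not the presenter's code: the function name `pronoun_scores` and the regex tokenization are assumptions; only the two pronoun lists come from the slide.

```python
import re

# Pronoun lists as given on the slide; each occurrence adds +1 to its side.
FEMALE_PRONOUNS = {"she", "her", "hers"}
MALE_PRONOUNS = {"he", "his", "him"}

def pronoun_scores(text):
    """Return (female_score, male_score) for a piece of text."""
    # Lowercase and split on runs of letters (an assumed tokenization).
    tokens = re.findall(r"[a-z]+", text.lower())
    female = sum(1 for tok in tokens if tok in FEMALE_PRONOUNS)
    male = sum(1 for tok in tokens if tok in MALE_PRONOUNS)
    return female, male

# Example: three female pronouns (she, her, her) and two male (him, he).
scores = pronoun_scores("She gave him her book. He thanked her.")
```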
Naive Bayes classifier
Proper names
NLTK entities bigram parser (‘people’)
Social Security database of names
Naive Bayes classifier for other names
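One way to read the proper-name steps above is as a lookup with a fallback: names found in the database get their recorded gender, and the classifier handles the rest. The sketch below assumes that reading. `gender_for_name`, `toy_names`, and `toy_fallback` are hypothetical stand-ins, and the entity-extraction step (NLTK) is left out.

```python
def gender_for_name(name, known_names, fallback):
    """Return a gender label for a first name.

    known_names: dict of lowercase name -> label
                 (stand-in for the Social Security names database).
    fallback:    callable for names missing from the dict
                 (stand-in for the Naive Bayes classifier).
    """
    key = name.strip().lower()
    if key in known_names:
        return known_names[key]
    return fallback(key)

# Toy data and a toy fallback rule, invented for illustration only.
toy_names = {"mary": "female", "john": "male"}

def toy_fallback(name):
    return "female" if name.endswith("a") else "male"
```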
12. Social Security database: vast majority of proper names
Three features:
● Last letter of name
● Last two letters of name
● Is last letter a vowel (aeiouy)
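The three features listed above could be extracted as follows. This is a sketch: the function name `name_features` and the dictionary keys are assumptions; the vowel set (aeiouy) is taken from the slide.

```python
VOWELS = set("aeiouy")  # the slide counts "y" as a vowel

def name_features(name):
    """Map a first name to the three features listed on the slide."""
    n = name.strip().lower()
    return {
        "last_letter": n[-1:],            # e.g. "a" for "maria"
        "last_two": n[-2:],               # e.g. "ia" for "maria"
        "ends_in_vowel": n[-1:] in VOWELS,
    }
```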
Trained on 80% of names from the Social Security names database
Validated on 20% holdout sample
Accuracy = 80%
Naive Bayes classifier
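The train-and-validate procedure on this slide can be sketched in plain Python. Everything here is an assumption made for illustration: the hand-rolled Naive Bayes with add-one smoothing, the function names, and the seeded shuffle. The slide does not say which implementation was used, and this sketch will not reproduce the reported 80% accuracy without the real data.

```python
import math
import random
from collections import Counter, defaultdict

VOWELS = set("aeiouy")

def name_features(name):
    n = name.strip().lower()
    return {"last_letter": n[-1:], "last_two": n[-2:],
            "ends_in_vowel": n[-1:] in VOWELS}

def train_naive_bayes(examples):
    """examples: list of (name, label) pairs. Returns the count tables."""
    label_counts = Counter()
    value_counts = defaultdict(Counter)  # (label, feature) -> value counts
    feature_values = defaultdict(set)    # feature -> distinct values seen
    for name, label in examples:
        label_counts[label] += 1
        for feat, val in name_features(name).items():
            value_counts[(label, feat)][val] += 1
            feature_values[feat].add(val)
    return label_counts, value_counts, feature_values

def predict(model, name):
    """Return the label with the highest log-probability for this name."""
    label_counts, value_counts, feature_values = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        for feat, val in name_features(name).items():
            # Add-one smoothing; the extra +1 reserves mass for unseen values.
            num = value_counts[(label, feat)][val] + 1
            den = count + len(feature_values[feat]) + 1
            score += math.log(num / den)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def split_and_score(labeled_names, train_fraction=0.8, seed=0):
    """Shuffle, train on the first 80%, and score accuracy on the holdout."""
    data = list(labeled_names)
    random.Random(seed).shuffle(data)
    cut = int(len(data) * train_fraction)
    train, holdout = data[:cut], data[cut:]
    model = train_naive_bayes(train)
    correct = sum(predict(model, n) == y for n, y in holdout)
    return model, correct / len(holdout)

# Toy labeled names, invented for illustration only.
toy_data = [("maria", "female"), ("anna", "female"), ("julia", "female"),
            ("john", "male"), ("mark", "male"), ("peter", "male")]
toy_model = train_naive_bayes(toy_data)
```

With real data, `split_and_score` would take the full list of (name, label) pairs from the names database and return both the trained model and the holdout accuracy.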