Words, words, words
Reading Shakespeare with Python
Prologue
Motivation
How can we use Python to supplement our
reading of Shakespeare?
How can we get Python to read for us?
Act I
Why Shakespeare?
Polonius: What do you read, my lord?
Hamlet: Words, words, words.
P: What is the matter, my lord?
H: Between who?
P: I mean, the matter that you read, my lord.
--II.2.184
Why Shakespeare?
(Also the XML)
(thank you, https://github.com/severdia/PlayShakespeare.com-XML !!!)
Shakespeare XML
Challenges
• Language, especially English, is messy
• Texts are usually unstructured
• Pronunciation is not standard
• Reading is pretty hard!
Humans and Computers
Humans are good at:
• Nuance
• Ambiguity
• Close reading
Computers are good at:
• Counting
• Repetitive tasks
• Making graphs
Act II
(leveraging metadata)
Who is the main character in _______?
Who is the main character in Hamlet?
[Chart: number of lines, by character]
Who is the main character in King Lear?
[Chart: number of lines, by character]
Who is the main character in Macbeth?
[Chart: number of lines, by character]
Who is the main character in Othello?
[Chart: number of lines, by character]
Iago and Othello, Detail
[Chart: number of lines]
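A minimal sketch of how line counts per character might be pulled from marked-up text. The tag and attribute names below (play, speech, speaker, line) are placeholders chosen for illustration and are not the actual PlayShakespeare.com-XML schema; the function name is also hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical markup: these tag names are placeholders for illustration,
# not the real PlayShakespeare.com-XML schema.
SAMPLE = """
<play>
  <speech speaker="POLONIUS"><line>What do you read, my lord?</line></speech>
  <speech speaker="HAMLET"><line>Words, words, words.</line></speech>
  <speech speaker="POLONIUS"><line>What is the matter, my lord?</line></speech>
  <speech speaker="HAMLET"><line>Between who?</line></speech>
  <speech speaker="POLONIUS">
    <line>I mean, the matter that you</line>
    <line>read, my lord.</line>
  </speech>
</play>
"""


def count_lines_by_speaker(xml_text):
    """Return a Counter mapping each speaker to the number of lines spoken."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for speech in root.iter("speech"):
        counts[speech.get("speaker")] += len(speech.findall("line"))
    return counts


line_counts = count_lines_by_speaker(SAMPLE)
# Speakers sorted from most lines to fewest
for speaker, n_lines in line_counts.most_common():
    print(speaker, n_lines)
```

With real play files, the speaker with the highest count would be one candidate answer to "who is the main character?".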
Obligatory Social Network
Act III
First steps with natural language processing (NLP)
What are Shakespeare’s most interesting rhymes?
Shakespeare’s Sonnets
• A sonnet is a 14-line poem
• A sonnet can follow many different rhyme schemes; Shakespeare was unusual in using essentially the same one throughout
• This is a huge win for us, since we can “hard
code” his rhyme scheme in our analysis
Sonnet 18
Shall I compare thee to a summer’s day? (a)
Thou art more lovely and more temperate: (b)
Rough winds do shake the darling buds of May, (a)
And summer’s lease hath all too short a date; (b)
Sometime too hot the eye of heaven shines, (c)
And often is his gold complexion dimm'd; (d)
And every fair from fair sometime declines, (c)
By chance or nature’s changing course untrimm'd; (d)
But thy eternal summer shall not fade, (e)
Nor lose possession of that fair thou ow’st; (f)
Nor shall death brag thou wander’st in his shade, (e)
When in eternal lines to time thou grow’st: (f)
So long as men can breathe or eyes can see, (g)
So long lives this, and this gives life to thee. (g)
http://www.poetryfoundation.org/poem/174354
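One way the "hard-coded" rhyme scheme could be applied, as a sketch: pair up the last words of the lines that the abab cdcd efef gg scheme says should rhyme. The helper names are illustrative, and the last-word cleanup is a simplifying assumption.

```python
import string

# The abab cdcd efef gg scheme, as pairs of 0-based line indices.
RHYME_PAIRS = [(0, 2), (1, 3), (4, 6), (5, 7), (8, 10), (9, 11), (12, 13)]

SONNET_18 = """Shall I compare thee to a summer's day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer's lease hath all too short a date;
Sometime too hot the eye of heaven shines,
And often is his gold complexion dimm'd;
And every fair from fair sometime declines,
By chance or nature's changing course untrimm'd;
But thy eternal summer shall not fade,
Nor lose possession of that fair thou ow'st;
Nor shall death brag thou wander'st in his shade,
When in eternal lines to time thou grow'st:
So long as men can breathe or eyes can see,
So long lives this, and this gives life to thee.""".splitlines()


def last_word(line):
    """Return the final word of a line, lowercased, without edge punctuation."""
    return line.split()[-1].strip(string.punctuation).lower()


def rhyme_pairs(lines):
    """Pair the line-ending words that the fixed scheme says should rhyme."""
    return [(last_word(lines[i]), last_word(lines[j])) for i, j in RHYME_PAIRS]


sonnet_18_rhymes = rhyme_pairs(SONNET_18)
for first, second in sonnet_18_rhymes:
    print(first, "/", second)
```

Because the scheme is fixed, no pronunciation data is needed: position alone decides which words count as a rhyme, including imperfect ones such as "temperate" and "date".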
Rhyme Distribution
Frequency Distribution
• Most common rhymes
• nltk.FreqDist
Conditional Frequency Distribution
• Given a word, what is the frequency distribution of the words that rhyme with it?
• nltk.ConditionalFreqDist
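A small sketch of both distributions using NLTK. The rhyme pairs below are made up for illustration; in the analysis they would presumably come from every sonnet. The variable names are assumptions.

```python
import nltk

# Hypothetical rhyme pairs for illustration only.
rhyme_pairs = [("me", "thee"), ("thee", "me"), ("thee", "usury"), ("day", "may")]

# Frequency distribution: how often each unordered pair of words rhymes.
pair_freq = nltk.FreqDist(tuple(sorted(pair)) for pair in rhyme_pairs)

# Conditional frequency distribution: for each word (the condition),
# the distribution of words it rhymes with. Pairs are added in both
# directions so that either word can be looked up.
both_directions = rhyme_pairs + [(b, a) for a, b in rhyme_pairs]
rhymes_with = nltk.ConditionalFreqDist(both_directions)

print(pair_freq.most_common(1))
print(rhymes_with["thee"].most_common())
```

Each value in a ConditionalFreqDist is itself a FreqDist, so the same methods (such as most_common) work at both levels.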
Interesting Rhymes?
1) “Boring” rhymes: “me” and “thee”
2) “Lopsided” rhymes: “thee” and “usury”
Act IV
Classifiers 101
Writing code that reads
Our Classifier
Can we write code to tell if a given speech is
from a tragedy or comedy?
Classifiers: overview
● Requires labeled text
○ (in this case, speeches labeled by genre)
○ [(<speech>, <genre>), ...]
● Requires “training”
● Predicts labels of text
Classifiers: ingredients
● Classifier
● Vectorizer, or Feature Extractor
● Classifiers only interact with features, not
the text itself
Vectorizers (or Feature Extractors)
● A vectorizer, or feature extractor, transforms a text into
quantifiable information about the text.
● Theoretically, these features could be anything, e.g.:
○ How many capital letters does the text contain?
○ Does the text end with an exclamation point?
● In practice, a common model is “Bag of Words”.
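The two example features above could be computed by a hand-written extractor along these lines. This is an illustrative sketch; the function name and feature keys are assumptions, not part of any library.

```python
def extract_features(text):
    """Turn a text into a small dictionary of numeric features."""
    return {
        # How many capital letters does the text contain?
        "capital_letters": sum(1 for ch in text if ch.isupper()),
        # Does the text end with an exclamation point? (1 = yes, 0 = no)
        "ends_with_exclamation": int(text.rstrip().endswith("!")),
    }


features = extract_features("Hello, Will!")
print(features)
```

A classifier would only ever see the resulting numbers, never the original string.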
Bag of Words
Bag of Words is a kind of feature extraction where:
● The set of features is the set of all words in the texts you’re analyzing
● A single text is represented by how many times each word appears in it
Bag of Words: Simple Example
Two texts:
● “Hello, Will!”
● “Hello, Globe!”
Bag: [“Hello”, “Will”, “Globe”]

                   “Hello”  “Will”  “Globe”
“Hello, Will!”        1        1        0
“Hello, Globe!”       1        0        1

“Hello, Will!” → “A text that contains one instance of the word ‘Hello’, contains one instance of the word ‘Will’, and does not contain the word ‘Globe’.”
(Less readable for us, more readable for computers!)
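The same example can be worked through in plain Python. This is a from-scratch sketch for illustration, not how a library vectorizer is implemented; the tokenizing rule and function names are assumptions.

```python
import re


def tokenize(text):
    """Split a text into words, dropping punctuation."""
    return re.findall(r"[A-Za-z']+", text)


def build_bag(texts):
    """Collect every distinct word across all texts, in first-seen order."""
    bag = []
    for text in texts:
        for word in tokenize(text):
            if word not in bag:
                bag.append(word)
    return bag


def vectorize(text, bag):
    """Represent a text as the count of each bag word it contains."""
    words = tokenize(text)
    return [words.count(word) for word in bag]


texts = ["Hello, Will!", "Hello, Globe!"]
bag = build_bag(texts)
vectors = [vectorize(text, bag) for text in texts]

print(bag)
for text, vector in zip(texts, vectors):
    print(text, vector)
```

Each text becomes a row of counts with one column per word in the bag, matching the table on the slide.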
Live Vectorizer:
Why are these called “Vectorizers”?
text_1 = "words, words, words"
text_2 = "words, words, birds"
[Plot: text_1 and text_2 plotted against two axes, “# times ‘words’ is used” and “# times ‘birds’ is used”]
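A sketch of the two texts run through scikit-learn's CountVectorizer, to show the coordinates behind the plot. Note that CountVectorizer lowercases text and orders its columns alphabetically by default, so the column order may differ from a hand-drawn axis order.

```python
from sklearn.feature_extraction.text import CountVectorizer

text_1 = "words, words, words"
text_2 = "words, words, birds"

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform([text_1, text_2]).toarray()

# vocabulary_ maps each word to its column index; sort words by that index.
columns = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)

print(columns)
for name, row in zip(["text_1", "text_2"], counts):
    print(name, row.tolist())
```

Each text is now a point (or vector) in a space with one dimension per word, which is where the name "vectorizer" comes from.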
Act V
Putting it all Together
Classifier Workflow
Classification: Steps
1) Split pre-labeled text into training and testing
sets
2) Vectorize text (extract features)
3) Train classifier
4) Test classifier
Text → Features → Labels
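Step 1 might look like the sketch below, using scikit-learn's train_test_split. The speeches are invented placeholders rather than real corpus data, and the 25% test size and fixed random_state are arbitrary assumptions.

```python
from sklearn.model_selection import train_test_split

# Invented placeholder data in the [(<speech>, <genre>), ...] shape.
labeled_speeches = [
    ("placeholder tragic speech one", "tragedy"),
    ("placeholder tragic speech two", "tragedy"),
    ("placeholder tragic speech three", "tragedy"),
    ("placeholder tragic speech four", "tragedy"),
    ("placeholder comic speech one", "comedy"),
    ("placeholder comic speech two", "comedy"),
    ("placeholder comic speech three", "comedy"),
    ("placeholder comic speech four", "comedy"),
]

speeches = [speech for speech, _ in labeled_speeches]
labels = [genre for _, genre in labeled_speeches]

# Hold out a quarter of the speeches for testing; stratify keeps the
# genre proportions similar in both sets.
train_speeches, test_speeches, train_labels, test_labels = train_test_split(
    speeches, labels, test_size=0.25, random_state=0, stratify=labels
)

print(len(train_speeches), "training speeches")
print(len(test_speeches), "testing speeches")
```

The classifier should never see the test set during training, otherwise the test score says little about how it handles new text.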
Training
Classifier Training
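from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Learn the vocabulary from the training speeches, then count words per speech
vectorizer = CountVectorizer()
vectorizer.fit(train_speeches)
train_features = vectorizer.transform(train_speeches)

# Train a naive Bayes classifier on the word counts and their genre labels
classifier = MultinomialNB()
classifier.fit(train_features, train_labels)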
Testing
Classifier Testing
test_speech = test_speeches[0]
print(test_speech)

Farewell, Andronicus, my noble father,
The woefull'st man that ever liv'd in Rome.
Farewell, proud Rome, till Lucius come again;
He loves his pledges dearer than his life.
...
(From Titus Andronicus, III.1.288-300)

# Predict the genre of this one speech and compare it with its true label
test_label = test_labels[0]
test_features = vectorizer.transform([test_speech])
prediction = classifier.predict(test_features)[0]
print(prediction)
# → tragedy
print(test_label)
# → tragedy

# Mean accuracy over the whole test set
test_features = vectorizer.transform(test_speeches)
print(classifier.score(test_features, test_labels))
# → 0.75427682737169521
Critiques
• "Bag of Words" assumes a correlation
between word use and label. This
correlation is stronger in some cases
than in others.
• Beware of highly disproportionate training data: if one label dominates, a classifier can score well just by always predicting it.
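One quick check for disproportionate data, sketched with the standard library: count the labels and compute the accuracy of always guessing the most common one. The 75/25 split below is invented for illustration; the actual genre balance of the training data is not given here.

```python
from collections import Counter

# Hypothetical labels for illustration: three tragedies for every comedy.
train_labels = ["tragedy"] * 75 + ["comedy"] * 25

label_counts = Counter(train_labels)
majority_label, majority_count = label_counts.most_common(1)[0]

# Accuracy of a "classifier" that ignores the text and always
# predicts the most common label.
baseline_accuracy = majority_count / len(train_labels)

print(label_counts)
print(majority_label, baseline_accuracy)
```

A classifier's score is only meaningful relative to this baseline: on data skewed like the example above, an accuracy near 0.75 would be no better than guessing.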
Epilogue
adampalay@gmail.com
@adampalay
www.adampalay.com
Thank you!
