Lecture09

Knowledge Representation
in
Digital Humanities
Antonio Jiménez Mavillard
Department of Modern Languages and Literatures
Western University

Lecture 9
Knowledge Representation in Digital Humanities
* Contents:
1. Why this lecture?
2. Discussion
3. Chapter 9
4. Assignment
5. Bibliography

Why this lecture?
* This lecture...
· teaches some NLP techniques that can
be applied to real problems
· presents another example of how DH brings
together various disciplines (Linguistics,
Artificial Intelligence, Information
Science, Statistics...) to solve problems

Last assignment discussion
* Time to...
· consolidate ideas and
concepts dealt with in the readings

Chapter 9
Natural Language Processing
in Python
1. Preliminary theory
2. Word tagging and categorization
3. Text classification
4. Text information extraction

Chapter 9
1 Preliminary theory
1.1 Linguistics
1.2 Statistics
1.3 Artificial Intelligence

Chapter 9
2 Word tagging and categorization
2.1 Tagger
2.2 Automatic tagging
2.3 n-gram tagging

Chapter 9
3 Text classification
3.1 Supervised classification
3.2 Document classification

Chapter 9
4 Text information extraction
4.1 Information extraction
4.2 Entity recognition
4.3 Relation extraction

Preliminary theory

Linguistics
* Lexical categories
· nouns: people, places, things, concepts
· verbs: actions
· adjectives: describe nouns
· adverbs: modify adjectives and verbs
· ...

Linguistics
* These word classes are also known as
parts of speech
* They arise from simple analysis of the
distribution of words in text

Statistics
* Frequency distribution
· Arrangement of the values that one or
more variables take in a sample

Statistics
· Example: vocabulary in a text
+ how many times each word appears in
the text?
+ it is a “distribution” since it tells us
how the total number of word tokens
in the text is distributed across the
vocabulary items
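
The vocabulary example can be sketched in plain Python 3. This is only an illustration: `collections.Counter` stands in for `nltk.FreqDist`, and the sample sentence is invented.

```python
from collections import Counter

# Toy text (invented); any tokenized text would do.
text = "the cat sat on the mat and the dog sat too"
tokens = text.split()

# Count how many times each vocabulary item occurs.
freq_dist = Counter(tokens)

# The counts add up to the total number of word tokens.
total_tokens = sum(freq_dist.values())

# Show the three most frequent vocabulary items with their counts.
for word, count in freq_dist.most_common(3):
    print(word, count)
```

Summing the counts gives back the number of tokens, which is why the table of counts is called a distribution of the tokens over the vocabulary.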

Statistics
* Conditional frequency distribution
· A collection of frequency distributions,
each one for a different condition

Statistics
+ when the texts of a corpus are
divided into several categories we can
maintain separate frequency
distributions for each category
+ the condition will often be the
category of the text
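
A minimal sketch of the same idea in plain Python 3, assuming a `defaultdict` of `Counter` objects as a stand-in for `nltk.ConditionalFreqDist`; the categories and words are made up.

```python
from collections import Counter, defaultdict

# (condition, word) pairs; here the condition is the text category.
pairs = [
    ("news", "the"), ("news", "market"), ("news", "the"),
    ("romance", "the"), ("romance", "love"), ("romance", "love"),
]

# One separate frequency distribution per condition.
cond_freq_dist = defaultdict(Counter)
for condition, word in pairs:
    cond_freq_dist[condition][word] += 1

print(cond_freq_dist["news"]["the"])
print(cond_freq_dist["romance"].most_common(1))
```

Each condition is looked up first, and the result is an ordinary frequency distribution for that category.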

Artificial Intelligence
* Supervised vs unsupervised learning
· Supervised learning:
+ Possible results are known
+ Data is labeled
· Unsupervised learning:
+ Results are unknown
+ Data is clustered

* Decision trees
· Flowchart that selects labels for input
values
· Formed by decision and leaf nodes
· Decision nodes: check feature values
· Leaf nodes: assign labels

* Decision trees
· Example: “Going out?”
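
The flowchart idea can be written as nested conditions. The figure for this example is not reproduced here, so the features `weather` and `have_money` and the labels below are hypothetical, chosen only to show decision nodes and leaf nodes.

```python
def going_out(features):
    """Walk a tiny hand-written decision tree and return a label."""
    # Decision node: check the (hypothetical) 'weather' feature.
    if features["weather"] == "rainy":
        return "stay in"        # leaf node
    # Decision node: check the (hypothetical) 'have_money' feature.
    if features["have_money"]:
        return "go out"         # leaf node
    return "stay in"            # leaf node


print(going_out({"weather": "sunny", "have_money": True}))
```

A learned decision tree has the same shape; the difference is that the features to check and their order are chosen from training data rather than by hand.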

* Naive Bayes classifiers
1. Begins by calculating the prior
probability of each label, determined by
checking the frequency of each label in
the training set

2. The contribution from each feature is
combined with this prior probability, to
arrive at a likelihood estimate for each
label
3. The label whose likelihood estimate is
the highest is then assigned to the input
value

· Example: document classification
Prior probability: the starting point is close to the “Automotive” label
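
The three steps can be sketched by hand in plain Python 3. This is a simplified illustration, not NLTK's implementation: the training data is invented, and add-one smoothing is an assumption made here so that unseen feature values do not zero out a label.

```python
import math
from collections import Counter, defaultdict

# Toy training set of (features, label) pairs; all values are invented.
train = [
    ({"last_letter": "a"}, "female"),
    ({"last_letter": "a"}, "female"),
    ({"last_letter": "e"}, "female"),
    ({"last_letter": "k"}, "male"),
    ({"last_letter": "o"}, "male"),
]

# Step 1: prior probability of each label = its relative frequency.
label_counts = Counter(label for _, label in train)
priors = {label: n / len(train) for label, n in label_counts.items()}

# Count how often each (feature, value) pair occurs with each label.
feature_counts = defaultdict(Counter)
values = defaultdict(set)
for features, label in train:
    for name, value in features.items():
        feature_counts[label][(name, value)] += 1
        values[name].add(value)


def classify(features):
    scores = {}
    for label, prior in priors.items():
        # Step 2: combine the prior with each feature's contribution
        # (log probabilities, add-one smoothing).
        score = math.log(prior)
        for name, value in features.items():
            count = feature_counts[label][(name, value)]
            total = label_counts[label] + len(values[name])
            score += math.log((count + 1) / total)
        scores[label] = score
    # Step 3: assign the label with the highest estimate.
    return max(scores, key=scores.get)


print(classify({"last_letter": "a"}))
print(classify({"last_letter": "k"}))
```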

References
“Frequency Distribution.” Wikipedia, the free encyclopedia 7 Apr. 2014. Wikipedia. Web. 8 Apr. 2014.
Mitchell, Tom M. “Chapter 3: Decision Tree Learning.” Machine Learning. New York: McGraw-Hill, 1997. Print.
Mitchell, Tom M. “Chapter 6: Bayesian Learning.” Machine Learning. New York: McGraw-Hill, 1997. Print.
“Part of Speech.” Wikipedia, the free encyclopedia 5 Apr. 2014. Wikipedia. Web. 8 Apr. 2014.
Steven Bird, Ewan Klein, and Edward Loper. “Conditional Frequency Distributions.” Natural Language Processing with
Python. O’Reilly Media, 2009. 504. shop.oreilly.com. Web. 8 Mar. 2014.
Steven Bird, Ewan Klein, and Edward Loper. “Frequency Distributions.” Natural Language Processing with Python. O’Reilly
Media, 2009. 504. shop.oreilly.com. Web. 8 Mar. 2014.

Word tagging and categorization

Tagger
* Processes a sequence of words, and
attaches a part of speech tag to each
word
* Procedure:
1. Tokenization
2. Tagging

Tagger
* Example 1:
In [1]: import nltk
In [2]: text = 'And now for something completely different'
In [3]: tokens = nltk.word_tokenize(text)
In [4]: nltk.pos_tag(tokens)
Out[4]:
[('And', 'CC'),
('now', 'RB'),
('for', 'IN'),
('something', 'NN'),
('completely', 'RB'),
('different', 'JJ')]

Tagger
* Example 2:
In [1]: text = 'They refuse to permit us to obtain the refuse permit'
In [2]: tokens = nltk.word_tokenize(text)
In [3]: nltk.pos_tag(tokens)
Out[3]:
[('They', 'PRP'),
('refuse', 'VBP'),
('to', 'TO'),
('permit', 'VB'),
('us', 'PRP'),
('to', 'TO'),
('obtain', 'VB'),
('the', 'DT'),
('refuse', 'NN'),
('permit', 'NN')]

Automatic tagging
* The tag of a word depends on the word
itself and its context within a sentence
* Working with data at the level of tagged
sentences rather than tagged words

Automatic tagging
* Loading data
· Example: tagged and non-tagged
sentences of “news” category
In [1]: import nltk
   ...: from nltk.corpus import brown
In [2]: brown_tagged_sents = brown.tagged_sents(categories='news')
In [3]: brown_sents = brown.sents(categories='news')

Automatic tagging
* Default tagger
· Choose the most likely tag
In [4]: tags = [tag for (word, tag) in brown.tagged_words(categories='news')]
   ...: nltk.FreqDist(tags).max()
Out[4]: 'NN'

Automatic tagging
* Default tagger
· Assign the most likely tag to each token
In [5]: text = 'I do not like green eggs and ham, I do not like them Sam I am!'
In [6]: tokens = nltk.word_tokenize(text)
In [7]: default_tagger = nltk.DefaultTagger('NN')

Automatic tagging
* Default tagger
In [8]: default_tagger.tag(tokens)
Out[8]:
[('I', 'NN'),
('do', 'NN'),
('not', 'NN'),
('like', 'NN'),
('green', 'NN'),
('eggs', 'NN'),
('and', 'NN'),
('ham', 'NN'),
(',', 'NN'),

Automatic tagging
* Default tagger
...
('I', 'NN'),
('do', 'NN'),
('not', 'NN'),
('like', 'NN'),
('them', 'NN'),
('Sam', 'NN'),
('I', 'NN'),
('am', 'NN'),
('!', 'NN')]

Automatic tagging
* Default tagger
· This method performs rather poorly
· Unknown words will be nouns (as it
happens, most new words are nouns)
In [9]: default_tagger.evaluate(brown_tagged_sents)
Out[9]: 0.13089484257215028

Automatic tagging
* Regular expression tagger
· Assigns tags to tokens on the basis of
matching patterns
In [10]: patterns = [
   ...: (r'.*ing$', 'VBG'),               # gerunds
   ...: (r'.*ed$', 'VBD'),                # simple past
   ...: (r'.*es$', 'VBZ'),                # 3rd sing present
   ...: (r'.*ould$', 'MD'),               # modals
   ...: (r'.*\'s$', 'NN$'),               # possessive nouns
   ...: (r'.*s$', 'NNS'),                 # plural nouns
   ...: (r'^-?[0-9]+(.[0-9]+)?$', 'CD'),  # cardinal numbers
   ...: (r'.*', 'NN'),                    # nouns (default)
   ...: ]
In [11]: regexp_tagger = nltk.RegexpTagger(patterns)

Automatic tagging
In [12]: regexp_tagger.tag(brown_sents[3])
Out[12]:
[('``', 'NN'),
('Only', 'NN'),
('a', 'NN'),
('relative', 'NN'),
('handful', 'NN'),
('of', 'NN'),
('such', 'NN'),
('reports', 'NNS'),
('was', 'NNS'),
('received', 'VBD'),
...]

Automatic tagging
· This method is correct about a fifth of
the time
· The final regular expression «.*» is a
catch-all that tags everything as a noun
In [13]: regexp_tagger.evaluate(brown_tagged_sents)
Out[13]: 0.20326391789486245
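
How pattern order decides the tag can be seen in a minimal sketch that uses only the standard `re` module. It is an illustration of the idea with a shortened pattern list, not NLTK's `RegexpTagger` code.

```python
import re

# Shortened pattern list; order matters because the first match wins.
patterns = [
    (r".*ing$", "VBG"),                # gerunds
    (r".*ed$", "VBD"),                 # simple past
    (r".*s$", "NNS"),                  # plural nouns
    (r"^-?[0-9]+(\.[0-9]+)?$", "CD"),  # cardinal numbers
    (r".*", "NN"),                     # catch-all: nouns
]


def regexp_tag(tokens):
    tagged = []
    for token in tokens:
        # Try the patterns in order and stop at the first match.
        for pattern, tag in patterns:
            if re.match(pattern, token):
                tagged.append((token, tag))
                break
    return tagged


print(regexp_tag(["reports", "was", "received", "in", "1998"]))
```

Note that "was" is tagged as a plural noun simply because it ends in "s", the same mistake visible in the tagger output above.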

Automatic tagging
* Lookup tagger
· Problem: a lot of high-frequency words
do not have the NN tag

Automatic tagging
* Lookup tagger
· Solution:
+ Find the hundred most frequent words
and store their most likely tag
+ Use this information as model for a
lookup tagger (NLTK UnigramTagger)
+ Tag everything else as a noun

Automatic tagging
* Lookup tagger
In [14]: fd = nltk.FreqDist(brown.words(categories='news'))
In [15]: # count how many times each word occurs with each tag
    ...: cfd = nltk.ConditionalFreqDist(brown.tagged_words(categories='news'))
In [16]: most_freq_words = fd.keys()[:100]
In [17]: # for each word, keep its most frequent tag
    ...: likely_tags = dict((word, cfd[word].max()) for word in most_freq_words)
In [18]: baseline_tagger = nltk.UnigramTagger(model=likely_tags,
    ...:     backoff=nltk.DefaultTagger('NN'))
In [19]: baseline_tagger.evaluate(brown_tagged_sents)
Out[19]: 0.5817769556656125
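
The lookup idea itself needs very little machinery. Below is a minimal sketch in plain Python 3, with a tiny hand-made tagged corpus and a model size of 2 instead of 100; both are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Tiny hand-made tagged corpus (Brown-style tags); purely illustrative.
tagged_words = [
    ("the", "AT"), ("dog", "NN"), ("barks", "VBZ"),
    ("the", "AT"), ("cat", "NN"), ("the", "AT"),
    ("can", "MD"), ("can", "NN"), ("can", "MD"),
]

# Word frequencies, and tag frequencies for each word.
word_freq = Counter(word for word, _ in tagged_words)
tags_per_word = defaultdict(Counter)
for word, tag in tagged_words:
    tags_per_word[word][tag] += 1

# Model: the most likely tag of each of the N most frequent words.
N = 2
likely_tags = {
    word: tags_per_word[word].most_common(1)[0][0]
    for word, _ in word_freq.most_common(N)
}


def lookup_tag(tokens, default="NN"):
    # Words outside the model fall back to the default tag.
    return [(tok, likely_tags.get(tok, default)) for tok in tokens]


print(lookup_tag(["the", "can", "barks"]))
```

Here "barks" is outside the model, so it is tagged as a noun; a larger model would cover more words.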

Automatic tagging
* Lookup tagger
· The tagger accuracy increases as the
model size grows

n-gram tagging
* Unigram tagger
· Like the lookup tagger, it assigns the
most likely tag to each token
· Unlike the default tagger, it must be
trained before it can be used
· Training: initialize the tagger with
tagged sentence data as a parameter

n-gram tagging
* Unigram tagger
· Split the data into:
+ Training data (90%)
+ Testing data (10%)

n-gram tagging
* Unigram tagger
In [20]: size = int(len(brown_tagged_sents) * 0.9)
In [21]: train_sents = brown_tagged_sents[:size]
In [22]: test_sents = brown_tagged_sents[size:]
In [23]: unigram_tagger = nltk.UnigramTagger(train_sents)

n-gram tagging
* Unigram tagger
In [24]: unigram_tagger.tag(brown_sents[2007])
Out[24]:
[('Various', 'JJ'),
('of', 'IN'),
('the', 'AT'),
('apartments', 'NNS'),
('are', 'BER'),
('of', 'IN'),
('the', 'AT'),
('terrace', 'NN'),
('type', 'NN'),
(',', ','),
...

n-gram tagging
* Unigram tagger
('being', 'BEG'),
('on', 'IN'),
('the', 'AT'),
('ground', 'NN'),
('floor', 'NN'),
('so', 'QL'),
('that', 'CS'),
('entrance', 'NN'),
('is', 'BEZ'),
('direct', 'JJ'),
('.', '.')]
In [21]: unigram_tagger.evaluate(test_sents)
Out[21]: 0.8110236220472441

n-gram tagging
* An n-gram tagger picks the tag that is
most likely in the given context
* Unigram (1-gram) tagger
· Context:
+ current token in isolation

n-gram tagging
* Bigram (2-gram) tagger
· Context:
+ current token
+ POS tag of the 1 preceding token
* Trigram (3-gram) tagger
· Context:
+ current token
+ POS tag of the 2 preceding tokens

n-gram tagging
* n-gram tagger
· Context:
+ current token
+ POS tag of the n-1 preceding tokens
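
The context for any n can be made concrete with a short helper. `ngram_contexts` is a hypothetical function written for this illustration; it is not part of NLTK.

```python
def ngram_contexts(tagged_sent, n):
    """Return ((previous_tags, token), tag) pairs for an n-gram tagger.

    The context is the current token plus the tags of the n-1
    preceding tokens (fewer at the start of the sentence).
    """
    contexts = []
    for i, (token, tag) in enumerate(tagged_sent):
        start = max(0, i - n + 1)
        prev_tags = tuple(t for _, t in tagged_sent[start:i])
        contexts.append(((prev_tags, token), tag))
    return contexts


sent = [("the", "AT"), ("ground", "NN"), ("floor", "NN")]
print(ngram_contexts(sent, 1))  # unigram: token only
print(ngram_contexts(sent, 2))  # bigram: token + 1 preceding tag
print(ngram_contexts(sent, 3))  # trigram: token + 2 preceding tags
```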

n-gram tagging
* n-gram tagger
· Example: bigram
In [22]: bigram_tagger = nltk.BigramTagger(train_sents)
In [23]: bigram_tagger.evaluate(train_sents)
Out[23]: 0.7853094861965731
In [24]: bigram_tagger.evaluate(test_sents)
Out[24]: 0.10216286255357321

n-gram tagging
* n-gram tagger
· Example: bigram
+ Problem: it manages to tag words in
sentences of training data but
- it is unable to tag a new word
(assigns None)

n-gram tagging
* n-gram tagger
· Example: bigram
+ Problem: it manages to tag words in
sentences of training data but
- it cannot tag the following word
(even if it is not new) because it
never saw it during training with
a None tag on the previous word

n-gram tagging
* n-gram tagger
· Example: bigram
+ Name: sparse data
+ Reason: specific contexts with no
default tagger
+ Solution: trade-off between accuracy
and coverage

n-gram tagging
* Combining taggers
· Trade-off between accuracy and
coverage

n-gram tagging
* Combining taggers
1. Try tagging with the n-gram tagger
2. If unable, try the (n-1)-gram tagger
3. If unable, try the (n-2)-gram tagger
...

n-gram tagging
* Combining taggers
...
n-2. If unable, try the trigram tagger
n-1. If unable, try the bigram tagger
n. If unable, try the unigram tagger
n+1. If unable, use the default tagger

n-gram tagging
* Combining taggers
· Example:
In [25]: t0 = nltk.DefaultTagger('NN')
In [26]: t1 = nltk.UnigramTagger(train_sents, backoff=t0)
In [27]: t2 = nltk.BigramTagger(train_sents, backoff=t1)
In [28]: t2.evaluate(test_sents)
Out[28]: 0.8447124489185687
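
The backoff chain can also be sketched without NLTK. This is a simplified illustration in plain Python 3, assuming a three-sentence invented training set; NLTK's own n-gram taggers differ in detail.

```python
from collections import Counter, defaultdict

# Invented training sentences (Brown-style tags).
train_sents = [
    [("the", "AT"), ("permit", "NN"), ("expired", "VBD")],
    [("they", "PPSS"), ("permit", "VB"), ("it", "PPO")],
    [("to", "TO"), ("permit", "VB"), ("entry", "NN")],
]

# Count tags per context for a unigram and a bigram model.
unigram = defaultdict(Counter)
bigram = defaultdict(Counter)
for sent in train_sents:
    prev_tag = None  # None marks the start of a sentence
    for word, tag in sent:
        unigram[word][tag] += 1
        bigram[(prev_tag, word)][tag] += 1
        prev_tag = tag


def tag_sentence(tokens, default="NN"):
    tagged, prev_tag = [], None
    for word in tokens:
        if (prev_tag, word) in bigram:    # 1. try the bigram model
            tag = bigram[(prev_tag, word)].most_common(1)[0][0]
        elif word in unigram:             # 2. back off to the unigram model
            tag = unigram[word].most_common(1)[0][0]
        else:                             # 3. back off to the default tag
            tag = default
        tagged.append((word, tag))
        prev_tag = tag
    return tagged


print(tag_sentence(["the", "permit", "arrived"]))
print(tag_sentence(["we", "permit"]))
```

In the second call "we" is unknown and gets the default tag; the bigram context for "permit" was then never seen in training, so the unigram model decides.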

n-gram tagging
* Exercise 1
· Build a tagger by combining
a trigram, a bigram, a unigram
and a regular expression tagger (in the
default case)
· Use it to tag a sentence
· Evaluate its performance

n-gram tagging
* Exercise 1 (solution)
import nltk
import re
from nltk.corpus import brown

n-gram tagging
patterns = [
    (r'.*ing$', 'VBG'),
    (r'.*ed$', 'VBD'),
    (r'.*es$', 'VBZ'),
    (r'.*ould$', 'MD'),
    (r".*'s$", 'NN$'),
    (r'.*s$', 'NNS'),
    (r'^-?[0-9]+(.[0-9]+)?$', 'CD'),
    (r'.*', 'NN')
]

n-gram tagging
brown_tagged_sents = brown.tagged_sents(categories='news')
size = int(len(brown_tagged_sents) * 0.9)
train_sents = brown_tagged_sents[:size]
test_sents = brown_tagged_sents[size:]

n-gram tagging
t0 = nltk.RegexpTagger(patterns)
t1 = nltk.UnigramTagger(train_sents, backoff=t0)
t2 = nltk.BigramTagger(train_sents, backoff=t1)
t3 = nltk.TrigramTagger(train_sents, backoff=t2)

n-gram tagging
brown_sents = brown.sents(categories='news')
sent = brown_sents[2007]
t3.tag(sent)
t3.evaluate(test_sents)

References
Steven Bird, Ewan Klein, and Edward Loper. “Chapter 5: Categorizing and Tagging Words.” Natural Language Processing
with Python. O’Reilly Media, 2009. 504. shop.oreilly.com. Web. 8 Mar. 2014.

Text classification

Supervised classification
* Idea

* Process
1. Features
2. Encode
3. Feature extractor

* The process involves important skills:
· Abstraction
· Modelling
· Programming

* Features
· Abstraction: decide the relevant
information of the data set
* Encode
· Modelling: choose a sound representation
(data structure)

* Feature extractor
· Programming: program a function that
extracts the features in the chosen
representation

* Applications:
· Deciding the lexical category of words:
POS tagging
· Deciding the topic of a document from
a list of topics (“sports”, “technology”,
etc.): document classification

Document classification
* Example 1: gender identification
(solved by Naive Bayesian Classifier)
· Evidence
+ Names ending in a, e, i => female
+ Names ending in k, o, r, s, t => male
· Features: last letter
· Encode: dictionary
· Feature extractor: “name => {last letter}”

· Data
In [1]: import nltk
   ...: from nltk.corpus import names
In [2]: import random
In [3]: all_names = (
   ...:     [(name, 'male') for name in names.words('male.txt')] +
   ...:     [(name, 'female') for name in names.words('female.txt')])
In [4]: random.shuffle(all_names)

· Feature extractor
In [5]: def gender_features(word):
   ...:     return {'last_letter': word[-1]}
# Example
In [6]: gender_features('Shrek')
Out[6]: {'last_letter': 'k'}

· Classification
In [7]: featuresets = [(gender_features(n), g) for (n,g) in all_names]
In [8]: train_set = featuresets[500:]
In [9]: test_set = featuresets[:500]
In [10]: classifier = nltk.NaiveBayesClassifier.train(train_set)
In [11]: nltk.classify.accuracy(classifier, test_set)
Out[11]: 0.778
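
Accuracy is the fraction of test items that receive the correct label. The sketch below shows the computation in plain Python 3. The hand-written rule is only a stand-in for a trained classifier, and the test names are invented.

```python
def gender_features(name):
    # Feature extractor: encode the last letter as a dictionary.
    return {"last_letter": name[-1].lower()}


def rule_classifier(features):
    # Hand-written stand-in for a trained classifier, based on the
    # evidence above: names ending in a, e or i are labelled female.
    return "female" if features["last_letter"] in "aei" else "male"


def accuracy(classify, labeled_names):
    # Fraction of names whose predicted label equals the true label.
    correct = sum(1 for name, gender in labeled_names
                  if classify(gender_features(name)) == gender)
    return correct / len(labeled_names)


# Invented test set; 'Ruth' is deliberately a case the rule gets wrong.
test_names = [("Anna", "female"), ("Frank", "male"), ("Marie", "female"),
              ("Ruth", "female"), ("Otto", "male")]

print(accuracy(rule_classifier, test_names))  # 4 of 5 names are correct
```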

* Example 2: POS tagging
(solved by Decision Tree Classifier)
· Results: POS tag
· Features: Suffixes

· Data
In [1]: import nltk
   ...: from nltk.corpus import brown
In [2]: suffix_fdist = nltk.FreqDist()
In [3]: for word in brown.words():
            word = word.lower()
            suffix_fdist.inc(word[-1:])
In [4]: common_suffixes = suffix_fdist.keys()[:100]

In [5]: def pos_features(word):
            features = {}
            for suffix in common_suffixes:
                features['endswith(%s)' % suffix] = word.lower().endswith(suffix)
            return features

· Classification
In [6]: tagged_words = brown.tagged_words(categories='news')
In [7]: featuresets = [(pos_features(n), g) for (n,g) in tagged_words]
In [8]: size = int(len(featuresets) * 0.1)
In [9]: train_set, test_set = featuresets[size:], featuresets[:size]

· Classification
In [10]: classifier = nltk.DecisionTreeClassifier.train(train_set)
In [11]: classifier.classify(pos_features('cats'))
Out[11]: 'NNS'
In [12]: nltk.classify.accuracy(classifier, test_set)
Out[12]: 0.62705121829935351

* Example 3: document classification
(solved by Naive Bayesian Classifier)
· Corpus: Movie Reviews Corpus
· Results: Positive or negative review
· Features: Indicate whether or not the
2000 most frequent words are present in
each review

· Data
In [1]: import nltk
   ...: from nltk.corpus import movie_reviews
In [2]: import random
In [3]: documents = [
            (list(movie_reviews.words(fileid)), category)
            for category in movie_reviews.categories()
            for fileid in movie_reviews.fileids(category)]
In [4]: random.shuffle(documents)

In [5]: all_words = nltk.FreqDist(
            w.lower() for w in movie_reviews.words())
In [6]: word_features = all_words.keys()[:2000]
In [7]: def document_features(document):
            document_words = set(document)
            features = {}
            for word in word_features:
                features['contains(%s)' % word] = (word in document_words)
            return features

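· Classification
In [8]: featuresets = [(document_features(d), c) for (d,c) in documents]
   ...: train_set = featuresets[100:]
In [9]: test_set = featuresets[:100]
In [10]: classifier = nltk.NaiveBayesClassifier.train(train_set)
In [11]: nltk.classify.accuracy(classifier, test_set)
Out[11]: 0.84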

· 5 most informative features
In [12]: classifier.show_most_informative_features(5)
Most Informative Features
   contains(outstanding) = True    pos : neg = 10.7 : 1.0
         contains(mulan) = True    pos : neg =  9.0 : 1.0
        contains(seagal) = True    neg : pos =  8.2 : 1.0
   contains(wonderfully) = True    pos : neg =  6.4 : 1.0
         contains(damon) = True    pos : neg =  6.4 : 1.0

* Exercise 2
· “Reuters-21578 benchmark corpus /
ApteMod version” is a collection of 10,788
documents from the Reuters financial
newswire service

* Exercise 2
· Train a naive Bayes classifier with
ApteMod corpus
· Use it to classify a document
· Evaluate its performance

import nltk
import random
from nltk.corpus import reuters

documents = [(list(reuters.words(fileid)), category)
for category in reuters.categories()
for fileid in reuters.fileids(category)]
random.shuffle(documents)

all_words = nltk.FreqDist(w.lower() for w in
reuters.words())
word_features = all_words.keys()[:2000]
def document_features(document):
    document_words = set(document)
    features = {}
    for word in word_features:
        features['contains(%s)' % word] = (word in document_words)
    return features

featuresets = [(document_features(d), c) for (d,c) in
documents]
size = int(len(featuresets) * 0.9)
train_set = featuresets[:size]
test_set = featuresets[size:]
classifier = nltk.NaiveBayesClassifier.train(train_set)

document = reuters.words('test/14826')
classifier.classify(document_features(document))
nltk.classify.accuracy(classifier, test_set)

References
Steven Bird, Ewan Klein, and Edward Loper. “Chapter 6: Learning to Classify Text.” Natural Language Processing with
Python. O’Reilly Media, 2009. 504. shop.oreilly.com. Web. 8 Mar. 2014.

Text information extraction

Information extraction
* Definition:
· Convert unstructured natural-language
data into structured, tabular data
· Get information from the tabulated data

Information extraction
* Architecture:

Entity recognition
* Chunking
· Segments and labels multitoken sequences
· Selects a subset of the tokens (chunks)
· Chunks do not overlap in the source text

Entity recognition
* Chunking
· Entities are mostly nouns
· Let us search for the noun phrase chunks
(NP-chunks)
· Grammar: set of rules that indicate how
sentences should be chunked

Entity recognition
* NP-chunker
In [1]: import nltk, re, pprint
In [2]: grammar = r"""
# chunk optional determiner/possessive, adjectives and nouns
NP: {<DT|PP\$>?<JJ>*<NN>}
# chunk sequences of proper nouns
{<NNP>+}
"""
In [3]: cp = nltk.RegexpParser(grammar)

Entity recognition
* NP-chunker
In [4]: sentence1 = [("the", "DT"), ("little", "JJ"),
("yellow", "JJ"), ("dog", "NN"), ("barked", "VBD"), ("at",
"IN"), ("the", "DT"), ("cat", "NN")]
In [5]: sentence2 = [("Rapunzel", "NNP"), ("let", "VBD"),
("down", "RP"), ("her", "PP$"), ("long", "JJ"), ("golden",
"JJ"), ("hair", "NN")]
In [6]: result1 = cp.parse(sentence1)
In [7]: result2 = cp.parse(sentence2)

Entity recognition
* NP-chunker
In [8]: print result1
(S
  (NP the/DT little/JJ yellow/JJ dog/NN)
  barked/VBD
  at/IN
  (NP the/DT cat/NN))
In [9]: print result2
(S
  (NP Rapunzel/NNP)
  let/VBD
  down/RP
  (NP her/PP$ long/JJ golden/JJ hair/NN))
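
What the first rule of the grammar does can be imitated with the standard `re` module. This is a rough sketch in plain Python 3, not how `nltk.RegexpParser` is implemented; `np_chunk` is a hypothetical helper, and it only handles the determiner/possessive, adjectives and noun rule.

```python
import re


def np_chunk(tagged_sent):
    """Group tokens matching (DT|PP$)? JJ* NN into non-overlapping chunks."""
    # Encode the tag sequence as one string, e.g. "<DT><JJ><NN>".
    tags = ["<%s>" % tag for _, tag in tagged_sent]
    tag_string = "".join(tags)
    # Character offset at which each token's tag starts.
    starts, pos = [], 0
    for t in tags:
        starts.append(pos)
        pos += len(t)
    pattern = re.compile(r"(?:<DT>|<PP\$>)?(?:<JJ>)*<NN>")
    result, i = [], 0
    for m in pattern.finditer(tag_string):
        first = starts.index(m.start())
        last = first + m.group().count("<")
        result.extend(tagged_sent[i:first])             # tokens outside chunks
        result.append(("NP", tagged_sent[first:last]))  # one NP chunk
        i = last
    result.extend(tagged_sent[i:])
    return result


sentence1 = [("the", "DT"), ("little", "JJ"), ("yellow", "JJ"),
             ("dog", "NN"), ("barked", "VBD"), ("at", "IN"),
             ("the", "DT"), ("cat", "NN")]
for item in np_chunk(sentence1):
    print(item)
```

Because `finditer` returns non-overlapping matches, the chunks cannot overlap, which is the property stated for chunking above.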

Entity recognition
* NP-chunker
In [10]: result1.draw()

Entity recognition
* Chunking text corpora
In [11]: from nltk.corpus import brown
    ...: nps = []
    ...: for sent in brown.tagged_sents():
    ...:     tree = cp.parse(sent)
    ...:     for subtree in tree.subtrees():
    ...:         if subtree.node == 'NP':
    ...:             nps.append(subtree)
In [12]: for np in nps[:10]:
    ...:     print np
(NP investigation/NN)
(NP widespread/JJ interest/NN)
(NP this/DT city/NN)
(NP new/JJ multimilliondollar/JJ airport/NN)
(NP his/PP$ wife/NN)
(NP His/PP$ political/JJ career/NN)
...

Entity recognition
* Named entities
· Are definite noun phrases
· Refer to specific types of individuals:

Entity recognition
* Named entity recognition
· Task well suited to classifier-based
approach for noun phrase chunking

Entity recognition
* Named entity recognition
· Example:
In [1]: sent = nltk.corpus.treebank.tagged_sents()[22]
In [2]: print nltk.ne_chunk(sent)
(S
  The/DT
  (GPE U.S./NNP)
  is/VBZ
  one/CD
  ...
  according/VBG
  to/TO
  (PERSON Brooke/NNP T./NNP Mossman/NNP)
  ...)

Relation extraction
* Extraction of the relations that exist
between the recognized named entities
* Approach: initially look for all triples of
the form (X, α, Y)
· X and Y are named entities of specific
types
· α is the relation

Relation extraction
* Example:
In [1]: import nltk
In [2]: import re
In [3]: IN = re.compile(r'.*\bin\b(?!\b.+ing)')
In [4]: for doc in nltk.corpus.ieer.parsed_docs('NYT_19980315'):
   ...:     for rel in nltk.sem.extract_rels('ORG', 'LOC', doc,
   ...:                                      corpus='ieer', pattern=IN):
   ...:         print nltk.sem.relextract.show_raw_rtuple(rel)

Relation extraction
* Example:
[ORG: 'WHYY'] 'in' [LOC: 'Philadelphia']
[ORG: 'McGlashan &AMP; Sarrail'] 'firm in' [LOC: 'San Mateo']
[ORG: 'Freedom Forum'] 'in' [LOC: 'Arlington']
[ORG: 'Brookings Institution'] ', the research group in' [LOC: 'Washington']
[ORG: 'Idealab'] ', a self-described business incubator based in' [LOC: 'Los Angeles']
[ORG: 'Open Text'] ', based in' [LOC: 'Waterloo']
[ORG: 'WGBH'] 'in' [LOC: 'Boston']
[ORG: 'Bastille Opera'] 'in' [LOC: 'Paris']
[ORG: 'Omnicom'] 'in' [LOC: 'New York']
[ORG: 'DDB Needham'] 'in' [LOC: 'New York']
[ORG: 'Kaplan Thaler Group'] 'in' [LOC: 'New York']
[ORG: 'BBDO South'] 'in' [LOC: 'Atlanta']
[ORG: 'Georgia-Pacific'] 'in' [LOC: 'Atlanta']
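
The search for (X, α, Y) triples can be sketched in plain Python 3. The document format and the `find_triples` helper are assumptions made for this illustration (they are not NLTK's data structures), and the person name is invented. The regular expression looks for the word "in" and rejects cases where it is followed by a word ending in "ing".

```python
import re

# A document as alternating plain strings and (type, text) named entities.
doc = [
    "Listeners of", ("ORG", "WHYY"), "in", ("LOC", "Philadelphia"),
    "heard that", ("ORG", "Open Text"), ", based in", ("LOC", "Waterloo"),
    "hired", ("PER", "Jane Doe"), "last year.",
]

IN = re.compile(r".*\bin\b(?!\b.+ing)")


def find_triples(subj_type, obj_type, doc, pattern):
    triples = []
    # Slide a window of three items over the document.
    for x, alpha, y in zip(doc, doc[1:], doc[2:]):
        if (isinstance(x, tuple) and isinstance(y, tuple)
                and isinstance(alpha, str)
                and x[0] == subj_type and y[0] == obj_type
                and pattern.match(alpha)):
            triples.append((x[1], alpha, y[1]))
    return triples


for triple in find_triples("ORG", "LOC", doc, IN):
    print(triple)
```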

Relation extraction
* Exercise 3
· From the corpus ieer, extract
all the relations of type “people were
born in a location”

Relation extraction
import nltk
import os
import re
BORN = re.compile(r'.*\bborn\b')
# corpus files, skipping the README
files = filter(lambda x: x != 'README',
               os.listdir('nltk_data/corpora/ieer'))
for f in files:
    for doc in nltk.corpus.ieer.parsed_docs(f):
        for rel in nltk.sem.extract_rels('PER', 'LOC', doc,
                                         corpus='ieer', pattern=BORN):
            print nltk.sem.relextract.show_raw_rtuple(rel)

References
Steven Bird, Ewan Klein, and Edward Loper. “Chapter 7: Extracting Information from Text.” Natural Language Processing
with Python. O’Reilly Media, 2009. 504. shop.oreilly.com. Web. 8 Mar. 2014.

Assignment
* Assignment 9
· Readings
+ Supervised classification (Natural
Language Processing with Python)
+ Decision Tree Learning (Machine
Learning)

References
Mitchell, Tom M. “Chapter 3: Decision Tree Learning.” Machine Learning. New York: McGraw-Hill, 1997. Print.
Steven Bird, Ewan Klein, and Edward Loper. “Chapter 6: Learning to Classify Text - Supervised Classification.” Natural
Language Processing with Python. O’Reilly Media, 2009. 504. shop.oreilly.com. Web. 8 Mar. 2014.

Bibliography
“Frequency Distribution.” Wikipedia, the free encyclopedia 7 Apr. 2014. Wikipedia. Web. 8 Apr. 2014.
Mitchell, Tom M. Machine Learning. New York: McGraw-Hill, 1997. Print.
“Part of Speech.” Wikipedia, the free encyclopedia 5 Apr. 2014. Wikipedia. Web. 8 Apr. 2014.
Steven Bird, Ewan Klein, and Edward Loper. Natural Language Processing with Python. O’Reilly Media, 2009. 504.
shop.oreilly.com. Web. 8 Mar. 2014.
