2. Contents
Data aggregation / transformation / refinement
Train corpora
Tokenization process
Attributes and instances
Vector space modeling / Bag of words
Feature selection
Machine learning algorithms
Classifiers – binary / multi-class, multi-label / problem transformation methods
Learning evaluation
Quality Management workflow / “Human in the loop” supervised learning
3. Data transformation and refinement
Original:
A <b>multilayer perceptron</b> (MLP) is a class of <a href="/wiki/Feedforward_neural_network" title="Feedforward neural
network">feedforward</a> <a href="/wiki/Artificial_neural_network" title="Artificial neural network">artificial neural network</a>. An
MLP consists of at least three layers of nodes. Except for the input nodes, each node is a neuron that uses a nonlinear <a
href="/wiki/Activation_function" title="Activation function">activation function</a>. MLP utilizes a <a href="/wiki/Supervised_learning"
title="Supervised learning">supervised learning</a> technique called <a href="/wiki/Backpropagation"
title="Backpropagation">backpropagation</a> for training.<sup id="cite_ref-1" class="reference"><a href="#cite_note-
1">[1]</a></sup><sup id="cite_ref-2" class="reference"><a href="#cite_note-2">[2]</a></sup> Its multiple layers and non-linear
activation distinguish MLP from a linear <a href="/wiki/Perceptron" title="Perceptron">perceptron</a>. It can distinguish data that is
not <a href="/wiki/Linear_separability" title="Linear separability">linearly separable</a>.<sup id="cite_ref-Cybenko1989_3-0"
class="reference"><a href="#cite_note-Cybenko1989-3">[3]</a></sup></p>
<p>Multilayer perceptrons are sometimes colloquially referred to as "vanilla" neural networks, especially when they have a single
hidden layer.<sup id="cite_ref-4" class="reference">
Stripped:
A multilayer perceptron (MLP) is a class of feedforward artificial neural network. An MLP consists of at least three layers of nodes.
Except for the input nodes, each node is a neuron that uses a nonlinear activation function. MLP utilizes a supervised learning
technique called backpropagation for training.[1][2] Its multiple layers and non-linear activation distinguish MLP from a linear
perceptron. It can distinguish data that is not linearly separable.[3]
Multilayer perceptrons are sometimes colloquially referred to as "vanilla" neural networks, especially when they have a single hidden
layer.
Stemmed:
multilay perceptron class feedforward artifici neural network n consist least three layer node xcept input node each node neuron us
nonlinear activ function util supervis learn techniqu call backpropag train ts multipl layer non linear activ distinguish from linear
perceptron t can distinguish data linearli separ ultilay perceptron sometim colloqui refer vanilla neural network especi when have singl
hidden layer
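The transformation above can be sketched in a few lines of Python. The regexes and the crude suffix-stripping stemmer below are illustrative stand-ins, not the exact tools used for this example — a real pipeline would use an HTML parser and a Porter/Snowball stemmer.

```python
import re
from html import unescape

def strip_html(text):
    """Remove tags and citation markers, collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)   # drop tags
    text = re.sub(r"\[\d+\]", " ", text)   # drop citation markers like [1]
    return re.sub(r"\s+", " ", unescape(text)).strip()

def naive_stem(token):
    """Crude suffix stripping, for illustration only."""
    for suffix in ("ation", "ness", "ing", "ion", "ly", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[:-len(suffix)]
    return token

raw = 'An MLP consists of at least three <a href="/wiki/Node">layers</a> of nodes.[1]'
stripped = strip_html(raw)
stemmed = " ".join(naive_stem(t) for t in re.findall(r"[a-z]+", stripped.lower()))
print(stripped)   # An MLP consists of at least three layers of nodes.
print(stemmed)
```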
5. Vector spaces
Vectors are abstract mathematical objects with particular properties, which in some cases can be visualized as arrows.
Vector spaces are collections of vectors well characterized by their dimension, which specifies the number of independent directions in the space.
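In bag-of-words modeling each vocabulary term is one independent direction, so a document becomes a vector whose dimension is the vocabulary size. A minimal sketch, with toy documents and term counts as coordinates:

```python
# Each document becomes a vector in a space whose dimensions are vocabulary terms.
docs = [
    "each node is a neuron",
    "each node uses a nonlinear activation function",
]
vocab = sorted({w for d in docs for w in d.split()})          # one axis per term
vectors = [[d.split().count(w) for w in vocab] for d in docs]  # term counts
print(len(vocab))   # dimension of the space
print(vectors)
```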
7. Tokenizers
Example sentence
Except for the input nodes, each node is a neuron that uses a nonlinear activation function.
Word tokens
except | input | nodes | each | node | neuron | uses | nonlinear | activation | function
Ngrams
except input | except input nodes | except input nodes each | input nodes |
input nodes each | input nodes each node | nodes each | nodes each node | ...
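The n-gram expansion shown above can be generated like this (a sketch; the 2-to-4-gram range is assumed from the example):

```python
def ngrams(tokens, min_n=2, max_n=4):
    """All contiguous n-grams with n in [min_n, max_n], in positional order."""
    out = []
    for i in range(len(tokens)):
        for n in range(min_n, max_n + 1):
            if i + n <= len(tokens):
                out.append(" ".join(tokens[i:i + n]))
    return out

tokens = ["except", "input", "nodes", "each", "node"]
grams = ngrams(tokens)
print(grams)
```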
8. Attributes
@attribute except input {0,1}
@attribute except input nodes {0,1}
@attribute except input nodes each {0,1}
@attribute input nodes {0,1}
@attribute input nodes each {0,1}
@attribute input nodes each node {0,1}
@attribute nodes each {0,1}
@attribute nodes each node {0,1}
@attribute nodes each node neuron {0,1}
@attribute sentiment_class {1,2,3}
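A header like the one above can be generated from the n-gram vocabulary. This sketch targets Weka's ARFF format; note that Weka actually requires attribute names containing spaces to be quoted, which the slide omits.

```python
def arff_header(ngram_vocab, relation="sentiment"):
    """Emit a Weka ARFF header: one binary presence attribute per n-gram,
    plus the class attribute. Names with spaces are quoted, as Weka requires."""
    lines = ["@relation " + relation, ""]
    for ng in ngram_vocab:
        lines.append(f"@attribute '{ng}' {{0,1}}")
    lines.append("@attribute sentiment_class {1,2,3}")
    return "\n".join(lines)

header = arff_header(["except input", "input nodes", "nodes each"])
print(header)
```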
10. Feature selection
F-score and Supported Sequential Forward Search (F_SSFS)
Original feature variables
Calculate F-score for each feature
Sort features by F-score
Select top K F-score features
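The F-score step can be sketched as follows. The formula is the standard F-score for a binary split (positive vs. negative class); the toy feature values are invented for illustration.

```python
def f_score(pos, neg):
    """F-score of one feature from its values on positive and negative samples:
    between-class separation over within-class variance."""
    mean_all = (sum(pos) + sum(neg)) / (len(pos) + len(neg))
    mean_p = sum(pos) / len(pos)
    mean_n = sum(neg) / len(neg)
    var_p = sum((x - mean_p) ** 2 for x in pos) / (len(pos) - 1)
    var_n = sum((x - mean_n) ** 2 for x in neg) / (len(neg) - 1)
    return ((mean_p - mean_all) ** 2 + (mean_n - mean_all) ** 2) / (var_p + var_n)

# Calculate, sort, select top K (here K = 1); values are toy presence counts.
features = {"good": ([1, 1, 1, 0], [0, 0, 1, 0]),
            "the":  ([1, 1, 0, 1], [1, 0, 1, 1])}
scores = {name: f_score(p, n) for name, (p, n) in features.items()}
top_k = sorted(scores, key=scores.get, reverse=True)[:1]
print(top_k)
```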
11. Classification tasks
● Binary classification discriminates which of two predefined categories a document belongs to. Typical scenarios are tasks such as “spam versus ham” filtering of undesired messages in a mailbox, or filtering out which documents in a large data set aren't relevant to a specific topic of interest.
● Multiclass classification means a classification task with more than two classes; e.g., classifying a set of images of fruits which may be oranges, apples, or pears. Multiclass classification makes the assumption that each sample is assigned to one and only one label: a fruit can be either an apple or a pear but not both at the same time.
● Multilabel classification assigns to each sample a set of target labels. This can be thought of as predicting properties of a data point that are not mutually exclusive, such as the topics that are relevant for a document. A text might be about any of religion, politics, finance or education at the same time, or none of these.
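One common problem transformation method, binary relevance, turns a multilabel task into one independent binary task per label. A toy sketch — the per-label "learner" here is a trivial word-overlap rule standing in for a real classifier such as an SVM:

```python
def train_binary(docs, labels, label):
    """Toy per-label model: remember words seen only in positive documents.
    Stand-in for any real binary learner (e.g. an SVM per label)."""
    pos_words, neg_words = set(), set()
    for doc, doc_labels in zip(docs, labels):
        (pos_words if label in doc_labels else neg_words).update(doc.split())
    return pos_words - neg_words

def predict(models, doc):
    """A label fires if the document shares any cue word with that label's model."""
    words = set(doc.split())
    return {label for label, cues in models.items() if words & cues}

docs = ["tax vote in parliament", "market crash hits banks", "church and state"]
labels = [{"politics"}, {"finance"}, {"politics", "religion"}]
models = {l: train_binary(docs, labels, l)
          for l in {"politics", "finance", "religion"}}
print(predict(models, "parliament debates tax"))
print(predict(models, "church and state"))   # labels are not mutually exclusive
```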
12. Classification algorithms
● SVM (Support Vector Machine) is a supervised learning algorithm that uses a given set of training examples, each belonging to one of the classification categories. An SVM model is a representation of the examples as points in space in which the examples of the separate categories are divided by a clear gap.
● SMO (Sequential Minimal Optimization) is a faster way to train an SVM, with better scaling properties on difficult SVM problems: it divides them into the smallest possible sub-problems, which are solved analytically.
● CRF (Conditional Random Fields) is the algorithm at the basis of NER. A CRF is a statistical sequence-modeling method that takes context into account and predicts a sequence of labels for a sequence of input samples.
● RBF NN (Radial Basis Function Neural Network) is a special neural network architecture in which the activation function in the hidden layer is nonlinear (a radial basis function), in contrast to the linear activation function in the output layer. The output is usually a scalar function of real-valued input vectors.
● MLP (Multilayer Perceptron) is a feedforward artificial neural network trained with backpropagation. An MLP consists of at least three layers of nodes; except for the input nodes, each node is a neuron that uses a nonlinear activation function.
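The MLP description can be made concrete with a forward pass. The weights below are hand-picked for illustration (not trained) to compute XOR, the classic function a linear perceptron cannot represent:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(x, hidden_layer, out_weights, out_bias):
    """Input layer -> one hidden layer with nonlinear activation -> output node:
    the minimal 'at least three layers' MLP."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in hidden_layer]
    return sigmoid(sum(w * h for w, h in zip(out_weights, hidden)) + out_bias)

# Hand-picked weights computing XOR.
hidden_layer = [([20.0, 20.0], -10.0),    # ~OR(x1, x2)
                ([-20.0, -20.0], 30.0)]   # ~NAND(x1, x2)
out_weights, out_bias = [20.0, 20.0], -30.0   # ~AND of the two hidden units

outputs = [round(mlp_forward(x, hidden_layer, out_weights, out_bias))
           for x in ([0, 0], [0, 1], [1, 0], [1, 1])]
print(outputs)   # [0, 1, 1, 0]
```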
16. Named entities recognition
● Person
● Organization
● Geographical location
● Money amounts
● Percentages
● Date
● Time
17. Named entities recognition
Dreux LOC
contre O
un O
socialiste O
des O
plus O
sectaires O
Viollette PER
le O
second O
contre O
Cha- O
ronnat O
autre O
pilier O
du O
Bloc O
à O
Troyes LOC
Notre O
excellent O
ami O
, O
M. O
Louis PER
Latapie PER
18. Named entities recognition
trainFile = ../classifiers/french-model.tsv
serializeTo = ../classifiers/french-model.ser.gz
# structure of your training file; this tells the classifier that
# the word is in column 0 and the correct answer is in column 1
map=word=0,answer=1
# This specifies the order of the CRF: order 1 means that features
# apply at most to a class pair of previous class and current class
# or current class and next class.
maxLeft=1
# these are the features we'd like to train with
# some are discussed below, the rest can be
# understood by looking at NERFeatureFactory
useClassFeature=true
useWord=true
useNGrams=true
noMidNGrams=true
maxNGramLeng=6
usePrev=true
useNext=true
useDisjunctive=true
useSequences=true
usePrevSequences=true
# the last 4 properties deal with word shape features
useTypeSeqs=true
useTypeSeqs2=true
useTypeySequences=true
wordShape=chris2useLC
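This is a Stanford NER (CRFClassifier) properties file. Assuming the standard Stanford NER distribution, training and tagging would look roughly like this — the jar name, properties file name, and sample file are placeholders to adjust to your install:

```shell
# Train the CRF from the properties file; writes ../classifiers/french-model.ser.gz
java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -prop french-model.prop

# Tag new text with the serialized model
java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier \
     -loadClassifier ../classifiers/french-model.ser.gz -textFile sample.txt
```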