Applications of artificial intelligence in the analysis of media web content.
Deyan Peychev (Identrics)
Contents
Data aggregation / transformation / refinement
Train corpora
Tokenization process
Attributes and instances
Vector space modeling / Bag of words
Feature selection
Machine learning algorithms
Classifiers – binary / multi-class, multi-label / problem transformation methods
Learning evaluation
Quality Management workflow / “Human in the loop” supervised learning
Data transformation and refinement
Original:
A <b>multilayer perceptron</b> (MLP) is a class of <a href="/wiki/Feedforward_neural_network" title="Feedforward neural
network">feedforward</a> <a href="/wiki/Artificial_neural_network" title="Artificial neural network">artificial neural network</a>. An
MLP consists of at least three layers of nodes. Except for the input nodes, each node is a neuron that uses a nonlinear <a
href="/wiki/Activation_function" title="Activation function">activation function</a>. MLP utilizes a <a href="/wiki/Supervised_learning"
title="Supervised learning">supervised learning</a> technique called <a href="/wiki/Backpropagation"
title="Backpropagation">backpropagation</a> for training.<sup id="cite_ref-1" class="reference"><a href="#cite_note-
1">[1]</a></sup><sup id="cite_ref-2" class="reference"><a href="#cite_note-2">[2]</a></sup> Its multiple layers and non-linear
activation distinguish MLP from a linear <a href="/wiki/Perceptron" title="Perceptron">perceptron</a>. It can distinguish data that is
not <a href="/wiki/Linear_separability" title="Linear separability">linearly separable</a>.<sup id="cite_ref-Cybenko1989_3-0"
class="reference"><a href="#cite_note-Cybenko1989-3">[3]</a></sup></p>
<p>Multilayer perceptrons are sometimes colloquially referred to as "vanilla" neural networks, especially when they have a single
hidden layer.<sup id="cite_ref-4" class="reference">
Stripped:
A multilayer perceptron (MLP) is a class of feedforward artificial neural network. An MLP consists of at least three layers of nodes.
Except for the input nodes, each node is a neuron that uses a nonlinear activation function. MLP utilizes a supervised learning
technique called backpropagation for training.[1][2] Its multiple layers and non-linear activation distinguish MLP from a linear
perceptron. It can distinguish data that is not linearly separable.[3]
Multilayer perceptrons are sometimes colloquially referred to as "vanilla" neural networks, especially when they have a single hidden
layer.
Stemmed:
multilay perceptron class feedforward artifici neural network n consist least three layer node xcept input node each node neuron us
nonlinear activ function util supervis learn techniqu call backpropag train ts multipl layer non linear activ distinguish from linear
perceptron t can distinguish data linearli separ ultilay perceptron sometim colloqui refer vanilla neural network especi when have singl
hidden layer
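A minimal sketch of this kind of refinement pipeline: strip the HTML markup, then lowercase and stem the tokens. The regex-based tag stripping and the toy suffix rules are illustrative assumptions, not the actual stemmer behind the output above.

import java.util.Arrays;
import java.util.stream.Collectors;

public class Refine {
    // Drop HTML tags and collapse whitespace ("Original" -> "Stripped").
    static String strip(String html) {
        return html.replaceAll("<[^>]+>", " ")
                   .replaceAll("\\s+", " ").trim();
    }

    // Toy suffix stripping standing in for a Porter-style stemmer.
    static String stem(String token) {
        return token.replaceAll("(ation|izes?|ized|ing|ly|es|s)$", "");
    }

    public static void main(String[] args) {
        String html = "<p>An MLP <b>utilizes</b> a nonlinear "
                + "<a href=\"/wiki/Activation_function\">activation function</a>.</p>";
        String stripped = strip(html);
        String stemmed = Arrays.stream(stripped.toLowerCase().split("\\W+"))
                .filter(t -> !t.isEmpty())
                .map(Refine::stem)
                .collect(Collectors.joining(" "));
        System.out.println("Stripped: " + stripped); // An MLP utilizes a nonlinear activation function .
        System.out.println("Stemmed:  " + stemmed);  // an mlp util a nonlinear activ function
    }
}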
Training corpus
Vector spaces
Vectors are abstract mathematical objects with particular properties, which in some cases can be visualized as arrows. Vector spaces are collections of vectors well characterized by their dimension, which specifies the number of independent directions in the space.
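In the bag-of-words model each document becomes a vector in a space whose dimension is the vocabulary size, and topical closeness becomes an angle between vectors. A minimal sketch, with toy sentences standing in for documents:

import java.util.HashMap;
import java.util.Map;

public class BagOfWords {
    // A document as a sparse term-count vector.
    static Map<String, Integer> vector(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String tok : text.toLowerCase().split("\\W+"))
            counts.merge(tok, 1, Integer::sum);
        return counts;
    }

    // Cosine of the angle between two document vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Integer> e : a.entrySet())
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        for (int v : a.values()) na += (double) v * v;
        for (int v : b.values()) nb += (double) v * v;
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        Map<String, Integer> d1 = vector("each node is a neuron");
        Map<String, Integer> d2 = vector("each input node is a node");
        System.out.printf("cosine similarity: %.3f%n", cosine(d1, d2));
    }
}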
Topic-based vector space model
Tokenizers
Example sentence
Except for the input nodes, each node is a neuron that uses a nonlinear activation function.
Word tokens
except | input | nodes | each | node | neuron | uses | nonlinear | activation | function
Ngrams
except input | except input nodes | except input nodes each | input nodes |
input nodes each | input nodes each node | nodes each | nodes each node | ...
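A minimal sketch reproducing the example above: stopwords are dropped to get the word tokens, then contiguous n-grams of sizes 2 to 4 are emitted. The five-word stopword list is an assumption for illustration.

import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class Tokens {
    public static void main(String[] args) {
        String sentence = "Except for the input nodes, each node is a neuron "
                        + "that uses a nonlinear activation function.";
        Set<String> stop = Set.of("for", "the", "is", "a", "that");

        // Word tokens: lowercase, split on non-word characters, drop stopwords.
        List<String> tokens = new ArrayList<>();
        for (String t : sentence.toLowerCase().split("\\W+"))
            if (!t.isEmpty() && !stop.contains(t)) tokens.add(t);
        System.out.println(String.join(" | ", tokens));

        // N-grams of sizes 2..4, grouped by starting position as on the slide.
        List<String> ngrams = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++)
            for (int n = 2; n <= 4 && i + n <= tokens.size(); n++)
                ngrams.add(String.join(" ", tokens.subList(i, i + n)));
        System.out.println(String.join(" | ", ngrams));
    }
}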
Attributes
@attribute 'except input' {0,1}
@attribute 'except input nodes' {0,1}
@attribute 'except input nodes each' {0,1}
@attribute 'input nodes' {0,1}
@attribute 'input nodes each' {0,1}
@attribute 'input nodes each node' {0,1}
@attribute 'nodes each' {0,1}
@attribute 'nodes each node' {0,1}
@attribute 'nodes each node neuron' {0,1}
@attribute sentiment_class {1,2,3}
Instances
{3 1,9 1,16 1,18 1,28 1,31 1,37 1,39 1,44 1,59 1,72 1,76 1,79 1,93 1,131 1,137 1,152 1,156 1,170 1,177 1,184 1,199 1,204 1,206 1,212 1,231 1,233 1,240 1,244 1,261 1,268 1,273 1,279 1,282 1,295 1,305 1,319 1,328 1,337 1,353 1,365 1,366 1,368 1,372 1,375 1,388 1,390 1,403 1,412 1,417 1,419 1,421 1,426 1,431 1,439 1,443 1,460 1,471 1,487 1,500 1,501 1,503 1,513 1,528 1,548 1,552 1,596 1,606 1,607 1,608 1,619 1,622 1,641 1,645 1,665 1,671 1,673 1,676 1,681 1,741 1,762 1}
{5 1,8 1,15 1,18 1,21 1,23 1,25 1,28 1,31 1,37 1,39 1,40 1,43 1,44 1,46 1,47 1,54 1,55 1,57 1,59 1,67 1,73 1,74 1,76 1,85 1,87 1,88 1,131 1,133 1,137 1,160 1,170 1,175 1,176 1,179 1,185 1,187 1,191 1,199 1,204 1,206 1,214 1,218 1,221 1,242 1,244 1,256 1,274 1,279 1,282 1,289 1,295 1,297 1,298 1,299 1,300 1,301 1,310 1,314 1,319 1,325 1,327 1,328 1,329 1,330 1,344 1,353 1,360 1,364 1,368 1,370 1,371 1,374 1,375 1,376 1,380 1,386 1,398 1,403 1,414 1,426 1,429 1,433 1,436 1,445 1,460 1,462 1,464 1,472 1,475 1,478 1,486 1,499 1,501 1,503 1,513 1,521 1,527 1,529 1,530 1,531 1,532 1,534 1,549 1,552 1,554 1,560 1,572 1,575 1,580 1,596 1,602 1,604 1,606 1,628 1,653 1,655 1,660 1,662 1,670 1,688 1,701 1,714 1,729 1,731 1,733 1,743 1,757 1,762 2}
{12 1,39 1,59 1,60 1,76 1,79 1,81 1,88 1,137 1,175 1,228 1,235 1,241 1,256 1,270 1,281 1,282 1,318 1,330 1,375 1,398 1,408 1,455 1,459 1,467 1,474 1,487 1,501 1,507 1,513 1,518 1,522 1,537 1,560 1,588 1,590 1,596 1,606 1,622 1,636 1,658 1,664 1,690 1,698 1,716 1,721 1,762 3}
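Each sparse row above stores only the non-zero attributes as index/value pairs; the final pair (index 762) is the sentiment_class label. A minimal WEKA sketch for loading such a file, assuming it is saved under the hypothetical name sentiment.arff:

import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class InspectArff {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("sentiment.arff");
        data.setClassIndex(data.numAttributes() - 1); // sentiment_class is last
        System.out.println("attributes: " + data.numAttributes());
        System.out.println("instances:  " + data.numInstances());
        System.out.println("classes:    " + data.numClasses());
    }
}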
Feature selection
F-score and Supported Sequential Forward Search (F_SSFS)
Pipeline: original feature variables → calculate the F-score of each feature → sort the features by F-score → select the top-K features by F-score (sketched below).
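A minimal sketch of the F-score step under its usual definition (between-class separation over within-class variance) for a binary-labeled feature matrix; the toy data and the {0,1} class coding are illustrative assumptions:

public class FScore {
    // x: instances x features, y: labels in {0, 1}, f: feature index.
    static double fScore(double[][] x, int[] y, int f) {
        double sumPos = 0, sumNeg = 0, sumAll = 0;
        int nPos = 0, nNeg = 0;
        for (int k = 0; k < x.length; k++) {
            sumAll += x[k][f];
            if (y[k] == 1) { sumPos += x[k][f]; nPos++; }
            else           { sumNeg += x[k][f]; nNeg++; }
        }
        double mAll = sumAll / x.length, mPos = sumPos / nPos, mNeg = sumNeg / nNeg;
        double varPos = 0, varNeg = 0;
        for (int k = 0; k < x.length; k++) {
            if (y[k] == 1) varPos += Math.pow(x[k][f] - mPos, 2);
            else           varNeg += Math.pow(x[k][f] - mNeg, 2);
        }
        // Numerator: how far each class mean sits from the overall mean.
        double num = Math.pow(mPos - mAll, 2) + Math.pow(mNeg - mAll, 2);
        // Denominator: within-class variances.
        double den = varPos / (nPos - 1) + varNeg / (nNeg - 1);
        return num / den;
    }

    public static void main(String[] args) {
        double[][] x = {{0.9, 0}, {1.1, 1}, {0.1, 0}, {-0.1, 1}};
        int[] y = {1, 1, 0, 0};
        System.out.println("feature 0: " + fScore(x, y, 0)); // 12.5, separates the classes
        System.out.println("feature 1: " + fScore(x, y, 1)); // 0.0, pure noise
    }
}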
Classification tasks
● Binary classification discriminates which of two predefined categories a document belongs to. Typical scenarios are filtering tasks such as “spam versus ham”, which weeds undesired messages out of a mailbox, or filtering out the documents in a large data set that aren't relevant to a specific topic of interest.
● Multiclass classification means a classification task with more than two classes; e.g., classifying a set of images of fruits which may be oranges, apples, or pears. Multiclass classification makes the assumption that each sample is assigned to one and only one label: a fruit can be either an apple or a pear, but not both at the same time.
● Multilabel classification assigns to each sample a set of target labels. This can be thought of as predicting properties of a data point that are not mutually exclusive, such as the topics that are relevant for a document. A text might be about any of religion, politics, finance or education at the same time, or none of these (see the problem-transformation sketch after this list).
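One common problem-transformation method, binary relevance, reduces a multilabel task to independent binary tasks, one per label. A minimal sketch; the keyword predicates here stand in for real trained binary classifiers such as SVMs:

import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

public class BinaryRelevance {
    public static void main(String[] args) {
        List<String> labels = List.of("religion", "politics", "finance", "education");

        // One yes/no "model" per label; a trivial keyword test stands in
        // for a classifier trained on that label's positive/negative examples.
        Map<String, Predicate<String>> models = new LinkedHashMap<>();
        for (String label : labels)
            models.put(label, text -> text.toLowerCase().contains(label));

        String doc = "The politics of education funding dominates the finance pages.";

        // Multilabel prediction = union of the positive binary decisions.
        Set<String> predicted = new LinkedHashSet<>();
        models.forEach((label, model) -> { if (model.test(doc)) predicted.add(label); });
        System.out.println(predicted); // [politics, finance, education]
    }
}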
Classification algorithms
● SVM (Support Vector Machine) is a supervised learning algorithm that uses a given set of training examples, each belonging to one of the classification categories. An SVM model represents the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap in that space.
● SMO (Sequential Minimal Optimization) is a faster way to train an SVM, with better scaling properties on difficult SVM problems: it divides the large optimization problem into the smallest possible sub-problems, which are solved analytically (a WEKA training sketch follows this list).
● CRF (Conditional Random Fields) is the algorithm at the basis of NER. A CRF is a statistical sequence-modeling method that takes context into account and predicts a sequence of labels for a sequence of input samples.
● RBF NN (Radial Basis Function Neural Network) is a special neural network architecture in which the activation function of the hidden layer is non-linear, in contrast to the linear activation function of the output layer. The output is usually a scalar function of the real-valued input vectors.
● MLP (Multilayer Perceptron) is a feedforward artificial neural network trained with backpropagation. An MLP consists of at least three layers of nodes. Except for the input nodes, each node is a neuron that uses a nonlinear activation function.
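A minimal WEKA sketch of training an SVM with the SMO implementation on the sparse ARFF from earlier; sentiment.arff is the same assumed file name:

import weka.classifiers.functions.SMO;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class TrainSmo {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("sentiment.arff");
        data.setClassIndex(data.numAttributes() - 1); // sentiment_class is last
        SMO smo = new SMO();                          // SVM trained via SMO
        smo.buildClassifier(data);
        System.out.println(smo);                      // dumps the learned model
    }
}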
Evaluation
Cross-validation
● Every instance takes part in both training and evaluation
● Instances are separated into folds; each fold is held out for testing once
Test data set
● 60/40 ratio of training to test data
Both strategies are sketched in WEKA below.
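A sketch of both evaluation strategies with WEKA's Evaluation class, again assuming the hypothetical sentiment.arff:

import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.functions.SMO;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class Evaluate {
    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read("sentiment.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // Cross-validation: instances are separated into folds, each fold
        // serving once as test data while the rest trains the model.
        Evaluation cv = new Evaluation(data);
        cv.crossValidateModel(new SMO(), data, 10, new Random(1));
        System.out.println(cv.toSummaryString("=== 10-fold CV ===", false));

        // Hold-out: 60/40 ratio of train to test data.
        data.randomize(new Random(1));
        int trainSize = (int) Math.round(data.numInstances() * 0.6);
        Instances train = new Instances(data, 0, trainSize);
        Instances test = new Instances(data, trainSize, data.numInstances() - trainSize);
        SMO smo = new SMO();
        smo.buildClassifier(train);
        Evaluation holdout = new Evaluation(train);
        holdout.evaluateModel(smo, test);
        System.out.println(holdout.toSummaryString("=== 60/40 hold-out ===", false));
    }
}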
Named entity recognition
● Person
● Organization
● Geographical location
● Money amounts
● Percentages
● Date
● Time
Named entity recognition
Dreux LOC
contre O
un O
socialiste O
des O
plus O
sectaires O
Viollette PER
le O
second O
contre O
Cha- O
ronnat O
autre O
pilier O
du O
Bloc O
à O
Troyes LOC
Notre O
excellent O
ami O
, O
M. O
Louis PER
Latapie PER
Named entity recognition
trainFile = ../classifiers/french-model.tsv
serializeTo = ../classifiers/french-model.ser.gz
# structure of your training file; this tells the classifier that
# the word is in column 0 and the correct answer is in column 1
map=word=0,answer=1
# This specifies the order of the CRF: order 1 means that features
# apply at most to a class pair of previous class and current class
# or current class and next class.
maxLeft=1
# these are the features we'd like to train with
# some are discussed below, the rest can be
# understood by looking at NERFeatureFactory
useClassFeature=true
useWord=true
useNGrams=true
noMidNGrams=true
maxNGramLeng=6
usePrev=true
useNext=true
useDisjunctive=true
useSequences=true
usePrevSequences=true
# the last 4 properties deal with word shape features
useTypeSeqs=true
useTypeSeqs2=true
useTypeySequences=true
wordShape=chris2useLC
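A minimal sketch of driving this training setup programmatically with Stanford NER's CRFClassifier; french-model.prop is an assumed name for the properties file above, and the sample sentence is illustrative:

import java.util.Properties;
import edu.stanford.nlp.ie.crf.CRFClassifier;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.util.StringUtils;

public class TrainNer {
    public static void main(String[] args) throws Exception {
        // Train from the .prop file (trainFile, feature flags, serializeTo).
        Properties props = StringUtils.propFileToProperties("french-model.prop");
        CRFClassifier<CoreLabel> crf = new CRFClassifier<>(props);
        crf.train();
        crf.serializeClassifier(props.getProperty("serializeTo"));

        // Tag new text; output is token/LABEL pairs as in the example above.
        System.out.println(crf.classifyToString("Louis Latapie arrive à Troyes ."));
    }
}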
Neural Networks
Figures: model of a neuron; a multilayer perceptron with two hidden layers.
Neural Networks
Figures: the sigmoid and hyperbolic tangent activation functions.
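A minimal Deeplearning4j sketch (one of the tools listed below) of a multilayer perceptron with two tanh hidden layers and a softmax output; the layer sizes and the three-class output are illustrative assumptions:

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class Mlp {
    public static void main(String[] args) {
        int numFeatures = 762; // e.g. the n-gram attributes of the sparse ARFF earlier
        int numClasses = 3;    // e.g. the three sentiment classes
        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .updater(new Adam(0.001))
                .list()
                .layer(0, new DenseLayer.Builder().nIn(numFeatures).nOut(64)
                        .activation(Activation.TANH).build())   // hidden layer 1
                .layer(1, new DenseLayer.Builder().nIn(64).nOut(32)
                        .activation(Activation.TANH).build())   // hidden layer 2
                .layer(2, new OutputLayer.Builder(
                            LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                        .nIn(32).nOut(numClasses)
                        .activation(Activation.SOFTMAX).build())
                .build();
        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
        System.out.println(net.summary());
    }
}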
Latent semantic indexing
Human-in-the-loop
Frameworks and tools
WEKA
https://www.cs.waikato.ac.nz/ml/weka/
Deeplearning4j
https://deeplearning4j.org/
StanfordNLP
http://nlp.stanford.edu/