Machine Learning and Automatic Text Classification
What’s Next?
Fabrizio Sebastiani
(Joint work with Giacomo Berardi and Andrea Esuli)
Istituto di Scienza e Tecnologie dell’Informazione
Consiglio Nazionale delle Ricerche
56124 Pisa, Italy
ASC Methods Conference
Winchester, UK – September 6-7, 2013
Prequel: ML for Automated Verbatim Coding
Over the last 10 years we have championed an approach to automatically coding
open-ended answers (“verbatims”) based on “machine learning”:
2003 : D. Giorgetti, I. Prodanof, and F. Sebastiani. Automatic Coding of
Open-ended Questions Using Text Categorization Techniques. Proceedings of
the 4th International Conference of the Association for Survey Computing,
Warwick, UK, pp. 173–184.
2007 : T. Macer, M. Pearson, and F. Sebastiani. Cracking the Code: What
Customers Say, in Their Own Words. In Proceedings of the 50th Annual
Conference of the Market Research Society, Brighton, UK. (Best New
Thinking Award, Shortlisted for Best Paper Award and for ASC/MRS Tech
Effectiveness Award)
2010 : A. Esuli and F. Sebastiani. Machines that learn how to code
open-ended survey data. International Journal of Market Research, 52(6).
(Shortlisted for best 2010 IJMR paper)
Fabrizio Sebastiani (ISTI-CNR, Pisa, IT) ML & ATC: What’s Next? ASC Methods Conference 2 / 41
Prequel: ML for Automated Verbatim Coding (cont’d)
Based on these principles we have built a software system, called VCS
(“Verbatim Coding System”), which has been variously applied to coding
surveys in the social sciences, customer relationship management, and market
research.
VCS is based on a “supervised learning” metaphor: the classifier learns (or:
is trained), from sample manually classified verbatims, the characteristics a
new verbatim should have in order to be attributed a given code.
The human operator who feeds sample manually classified verbatims to the
system plays the role of the “supervisor”.
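The supervised-learning setup described above can be sketched in a few lines. This is only an illustration, not VCS itself: a small multinomial naive Bayes model is swapped in as the learner, and the function names (`train`, `classify`), the whitespace tokenizer and the four training verbatims are all hypothetical. The classifier returns a binary decision plus a positive confidence score.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (an assumption made for brevity).
    return text.lower().split()


def train(coded_verbatims):
    """Learn word statistics from (text, has_code) pairs supplied by the human supervisor."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, has_code in coded_verbatims:
        doc_counts[has_code] += 1
        word_counts[has_code].update(tokenize(text))
    vocabulary = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocabulary


def classify(model, text):
    """Return (decision, confidence); confidence is the absolute log-odds margin."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    log_score = {}
    for label in (True, False):
        # Add-one smoothing on both the prior and the word likelihoods.
        score = math.log((doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocabulary:
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocabulary))
                )
        log_score[label] = score
    margin = log_score[True] - log_score[False]
    return margin > 0, abs(margin)


# Hypothetical manually coded verbatims for the code "rude staff".
training_set = [
    ("the staff was rude and unhelpful", True),
    ("rude cashier at the checkout", True),
    ("prices are too high", False),
    ("great selection of products", False),
]
model = train(training_set)
decision, confidence = classify(model, "the cashier was rude")
other_decision, _ = classify(model, "prices too high")
```

Any learner that outputs a decision and a confidence score could take the place of the naive Bayes model here.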
A Verbatim Coding System based on ML
What I’ll be talking about today
A talk about the role of humans in the verbatim coding process, and about
how to best support their work
I will be looking at scenarios in which
1 automated verbatim coding technology is used ...
2 ... but the level of accuracy that can be obtained from the classifier is not
considered sufficient ...
3 ... with the consequence that one or more human coders are asked to inspect
(and correct where appropriate) a portion of the classification decisions, with
the goal of increasing overall accuracy.
Problem
How can we support / optimize the work of the human coders?
A worked-out example
Contingency table for one code (rows: predicted label; columns: true label):

               true Y    true N
predicted Y    TP = 4    FP = 3
predicted N    FN = 4    TN = 9

$F_1 = \frac{2\,TP}{2\,TP + FP + FN} = 0.53$

A worked-out example (cont’d)
The same table after each successive correction by the human coder:

Correction    TP   FP   FN   TN   F1
(none)         4    3    4    9   0.53
FN → TP        5    3    3    9   0.63
FP → TN        5    2    3   10   0.67
FN → TP        6    2    2   10   0.75
FP → TN        6    1    2   11   0.80
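The arithmetic of the worked example can be replayed with a short script. This is a minimal sketch: the cell counts come from the slides, while the variable names and the order in which corrections are applied are read off the successive tables. Scores are left unrounded, so the second value is 0.625, which the slides report as 0.63.

```python
def f1(tp, fp, fn):
    """F1 computed directly from contingency-table counts."""
    return 2 * tp / (2 * tp + fp + fn)


# Initial contingency table of the worked example.
table = {"TP": 4, "FP": 3, "FN": 4, "TN": 9}

# Each correction moves one verbatim from an error cell to the matching correct cell.
corrections = [("FN", "TP"), ("FP", "TN"), ("FN", "TP"), ("FP", "TN")]

scores = [f1(table["TP"], table["FP"], table["FN"])]
for wrong_cell, right_cell in corrections:
    table[wrong_cell] -= 1
    table[right_cell] += 1
    scores.append(f1(table["TP"], table["FP"], table["FN"]))

for score in scores:
    print(f"{score:.3f}")
```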
What I’ll be talking about (cont’d)
We need methods that
given a desired level of accuracy, minimize the human coders’ effort necessary
to achieve it; alternatively,
given an available amount of human coders’ effort, maximize the accuracy
that can be obtained through it
This can be achieved by ranking the automatically classified verbatims in
such a way that, by starting the inspection from the top of the ranking, the
cost-effectiveness of the human coders’ work is maximized
We call the task of generating such a ranking Semi-Automatic Text
Classification (SATC)
What I’ll be talking about (cont’d)
Previous work has addressed SATC via techniques developed for active
learning
In both cases, the automatically classified verbatims are ranked with the goal
of having the human coder start inspecting/correcting from the top; however
in active learning the goal is providing new training examples
in SATC the goal is increasing the overall accuracy of the classified set
We claim that a ranking generated “à la active learning” is suboptimal for
SATC [1]
[1] G. Berardi, A. Esuli, F. Sebastiani. A Utility-Theoretic Ranking Method for Semi-Automated Text
Classification. Proceedings of the 35th Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval (SIGIR 2012), Portland, US, 2012.
Outline of this talk
1 We discuss how to measure “error reduction” (i.e., the increase in accuracy
deriving from the human coder’s inspection activity)
2 We discuss a method for maximizing the expected error reduction for a fixed
amount of annotation effort
3 We show some promising experimental results
Outline
1 Error Reduction, and How to Measure it
2 Error Reduction, and How to Maximize it
3 Some Experimental Results
Error Reduction, and how to measure it
Assume we have
1 Code c;
2 Classifier h for c;
3 Set of unlabeled verbatims D that we have automatically classified by means
of h, so that every verbatim in D is associated
with a binary decision (Yes or No)
with a confidence score (a positive real number)
4 Measure of accuracy A, ranging on [0,1]
Error Reduction, and how to Measure it (cont’d)
We will assume that A is
$F_1 = \frac{2 \cdot TP}{(2 \cdot TP) + FP + FN}$
but any measure of accuracy based on a contingency table may be used
An amount of error, measured as E = (1 − A), is present in the automatically
classified set D
Human coders inspect-and-correct a portion of D with the goal of reducing
the error present in D
Error Reduction, and how to Measure it (cont’d)
We define error at rank n (noted as E(n)) as the error still present in D after
the coder has inspected the verbatims at the first n rank positions
E(0) is the initial error generated by the automated classifier
E(|D|) is 0
We define error reduction at rank n (noted as ER(n)) to be
$ER(n) = \frac{E(0) - E(n)}{E(0)}$
i.e., the error reduction obtained by the human coder who inspects the verbatims
at the first n rank positions
ER(n) ∈ [0, 1]
ER(n) = 0 indicates no reduction
ER(n) = 1 indicates total elimination of error
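The definitions of E(n) and ER(n) above can be computed as follows once the true labels are known, as in an evaluation setting. This is a sketch under stated assumptions: A is F1, an inspected verbatim always ends up with its correct label, E(0) is greater than 0, and F1 is taken to be 1 when there are no positives at all. The function names and the six-verbatim example are hypothetical.

```python
def f1(tp, fp, fn):
    denominator = 2 * tp + fp + fn
    # Assumption: with no true or predicted positives, accuracy counts as perfect.
    return 1.0 if denominator == 0 else 2 * tp / denominator


def error(predicted, true):
    """E = 1 - F1 for a list of binary decisions against the true labels."""
    tp = sum(p and t for p, t in zip(predicted, true))
    fp = sum(p and not t for p, t in zip(predicted, true))
    fn = sum(t and not p for p, t in zip(predicted, true))
    return 1.0 - f1(tp, fp, fn)


def error_reduction_curve(ranking, predicted, true):
    """Return [ER(0), ER(1), ..., ER(|D|)]; ranking lists indices in inspection order."""
    current = list(predicted)
    e0 = error(current, true)  # assumed > 0, otherwise ER is undefined
    curve = [0.0]
    for index in ranking:
        current[index] = true[index]  # the coder inspects and, if needed, corrects
        curve.append((e0 - error(current, true)) / e0)
    return curve


# Hypothetical set D of six verbatims for one code.
true_labels = [True, True, False, False, True, False]
predicted_labels = [True, False, True, False, False, False]
ranking = [1, 2, 0, 4, 3, 5]
curve = error_reduction_curve(ranking, predicted_labels, true_labels)
```

In this example the curve reaches 1 after four inspections, because the last two ranked verbatims were already correctly classified.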
Error Reduction, and how to Maximize it
Problem
How should we rank the verbatims in D so as to maximize the expected error
reduction?
Error Reduction, and how to Maximize it
Intuition 1: Verbatims that have a higher probability of being misclassified
should be ranked higher
Intuition 2: Verbatims that, if corrected, bring about a higher gain (i.e., a
bigger increase in A) should be ranked higher
This means that verbatims that have a higher utility (= probability × gain)
should be ranked higher
A false positive and a false negative may have different impacts on A!
While in active learning only the probability of misclassification is relevant, in
SATC gains are also relevant
Error Reduction, and how to Maximize it (cont’d)
Given a set Ω of mutually disjoint events, a utility function is defined as
$U(\Omega) = \sum_{\omega \in \Omega} P(\omega)\, G(\omega)$
where
P(ω) is the probability of occurrence of event ω
G(ω) is the gain obtained if event ω occurs
We can thus estimate the utility, for the aims of increasing A, of manually
inspecting a verbatim d as
U(TP, TN, FP, FN) = P(FP) · G(FP) + P(FN) · G(FN)
provided we can estimate
If d is labelled with code c: P(FP) and G(FP)
If d is not labelled with code c: P(FN) and G(FN)
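The utility-based ranking described above might look like the following sketch. It assumes the probabilities of misclassification and the two gains have already been estimated; the class `ClassifiedVerbatim`, the function names and all numbers are hypothetical. Since a verbatim labelled with c can only be a false positive and one not labelled with c can only be a false negative, a single term of the sum applies to each verbatim.

```python
from dataclasses import dataclass


@dataclass
class ClassifiedVerbatim:
    text: str
    predicted_yes: bool      # the classifier's binary decision for code c
    p_misclassified: float   # estimated probability that this decision is wrong


def utility(verbatim, gain_fp, gain_fn):
    """P(FP) * G(FP) if labelled with c, P(FN) * G(FN) otherwise."""
    gain = gain_fp if verbatim.predicted_yes else gain_fn
    return verbatim.p_misclassified * gain


def rank_for_inspection(verbatims, gain_fp, gain_fn):
    """Sort verbatims by decreasing utility of inspecting them."""
    return sorted(verbatims, key=lambda v: utility(v, gain_fp, gain_fn), reverse=True)


# Hypothetical classified verbatims and gains.
classified = [
    ClassifiedVerbatim("A", predicted_yes=True, p_misclassified=0.5),
    ClassifiedVerbatim("B", predicted_yes=False, p_misclassified=0.3),
    ClassifiedVerbatim("C", predicted_yes=True, p_misclassified=0.1),
]
utility_order = [v.text for v in rank_for_inspection(classified, gain_fp=0.04, gain_fn=0.09)]
probability_order = [v.text for v in rank_for_inspection(classified, gain_fp=1.0, gain_fn=1.0)]
```

With these numbers the utility ranking puts B first even though A is more likely to be wrong, because correcting a false negative is assumed to pay more. Setting both gains to 1 reduces the ranking to ordering by probability of misclassification alone.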
Error Reduction, and how to Maximize it (cont’d)
Estimating P(FP) and P(FN) (the probability of misclassification) can be
done by converting the confidence score returned by the classifier into a
probability of correct classification
Tricky: requires probability “calibration” via a generalized sigmoid function to
be optimized via k-fold cross-validation
Gains G(FP) and G(FN) can be defined “differentially”; i.e.,
The gain obtained by correcting a FN is $(A_{FN \to TP} - A)$
The gain obtained by correcting a FP is $(A_{FP \to TN} - A)$
Gains need to be estimated by estimating the contingency table on the
training set via k-fold cross-validation
Key observation: in general, G(FP) ≠ G(FN)
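Both estimates on this slide can be sketched briefly. The assumptions are loud ones: A is F1, the contingency table is taken as given (the slides estimate it via k-fold cross-validation, which is not shown here), and a plain one-parameter logistic function stands in for the “generalized sigmoid”, with a hypothetical parameter `sigma` that would have to be tuned. The example table is the one from the worked example.

```python
import math


def f1(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn)


def gains(tp, fp, fn):
    """Differential gains in F1 from correcting one FP or one FN (needs fp >= 1 and fn >= 1)."""
    base = f1(tp, fp, fn)
    gain_fp = f1(tp, fp - 1, fn) - base        # FP -> TN
    gain_fn = f1(tp + 1, fp, fn - 1) - base    # FN -> TP
    return gain_fp, gain_fn


def probability_of_misclassification(confidence, sigma):
    """Map a positive confidence score to P(correct) with a logistic curve; return 1 - P(correct)."""
    p_correct = 1.0 / (1.0 + math.exp(-sigma * confidence))
    return 1.0 - p_correct


gain_fp, gain_fn = gains(tp=4, fp=3, fn=4)
print(f"G(FP) = {gain_fp:.4f}, G(FN) = {gain_fn:.4f}")

# sigma = 1.0 is an arbitrary illustrative value, not a tuned one.
p_low_confidence = probability_of_misclassification(0.5, sigma=1.0)
p_high_confidence = probability_of_misclassification(2.0, sigma=1.0)
```

On this table, correcting a false negative raises F1 by more than twice as much as correcting a false positive, which is the asymmetry the key observation refers to.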
Some Experimental Results
Dataset:
                 # Codes   # Training   # Test
Reuters-21578    115       9603         3299
Baseline: ranking by probability of misclassification (“à la active learning”),
equivalent to applying our ranking method with G(FP) = G(FN) = 1
[Figure: Error Reduction (ER) as a function of Inspection Length, both axes from 0.0 to 1.0. Learner: MP-Boost; Dataset: Reuters-21578; Type: Macro. Curves: Random, Baseline, Utility-theoretic, Oracle.]
A few side notes
This approach allows the human coder to know, at any stage of the
inspection process, what the estimated accuracy is at that stage; this is obtained by
Estimating accuracy at the beginning of the process, via k-fold cross validation
Updating after each correction is made
This approach lends itself to having more than one coder working in parallel
on the same inspection-and-correction task
Recent research I have not discussed today:
A “dynamic” SATC method in which gains are updated after each correction
is performed
“Microaveraging”- and “macroaveraging”-oriented methods
Concluding Remarks
Take-away message: Semi-automatic text classification needs to be addressed
as a task in its own right
Active learning typically makes use of probabilities of misclassification but does
not make use of gains ⇒ ranking “à la active learning” is suboptimal for SATC
The use of utility theory means that the ranking algorithm is optimized for a
specific accuracy measure ⇒ choose the accuracy measure that best mirrors
your application needs (e.g., Fβ with β > 1), and choose it well!
SATC is important, since in more and more application contexts the accuracy
obtainable via completely automatic text classification is not sufficient; more
and more frequently humans will need to enter the loop
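For reference, the Fβ measure mentioned above can be written in the same contingency-table form as F1. This is the standard definition, not something stated on the slides:

```latex
F_\beta = \frac{(1 + \beta^2)\, TP}{(1 + \beta^2)\, TP + \beta^2\, FN + FP}
```

With β = 1 this reduces to the F1 formula used throughout the talk. With β > 1 each false negative weighs more than each false positive, so recall counts for more than precision.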
Thank you!
Self-organisation of Knowledge in Socio-technical Systems: A Coordination Per...
 
Jonas Schneider, Head of Engineering for Robotics, OpenAI
Jonas Schneider, Head of Engineering for Robotics, OpenAIJonas Schneider, Head of Engineering for Robotics, OpenAI
Jonas Schneider, Head of Engineering for Robotics, OpenAI
 
MediaEval 2018: Fine grained sport action recognition: Application to table t...
MediaEval 2018: Fine grained sport action recognition: Application to table t...MediaEval 2018: Fine grained sport action recognition: Application to table t...
MediaEval 2018: Fine grained sport action recognition: Application to table t...
 

Recently uploaded

一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
AnirbanRoy608946
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
ahzuo
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
sameer shah
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
GetInData
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
Subhajit Sahu
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
slg6lamcq
 

Recently uploaded (20)

一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 

Machine Learning and Automatic Text Classification: What's Next?

  • 1. Machine Learning and Automatic Text Classification What’s Next? Fabrizio Sebastiani (Joint work with Giacomo Berardi and Andrea Esuli) Istituto di Scienza e Tecnologie dell’Informazione Consiglio Nazionale delle Ricerche 56124 Pisa, Italy ASC Methods Conference Winchester, UK – September 6-7, 2013
  • 2. Prequel: ML for Automated Verbatim Coding
      In the last 10 years we have championed an approach to automatically coding open-ended answers (“verbatims”) based on “machine learning”:
      2003: D. Giorgetti, I. Prodanof, and F. Sebastiani. Automatic Coding of Open-ended Questions Using Text Categorization Techniques. Proceedings of the 4th International Conference of the Association for Survey Computing, Warwick, UK, pp. 173–184.
      2007: T. Macer, M. Pearson, and F. Sebastiani. Cracking the Code: What Customers Say, in Their Own Words. In Proceedings of the 50th Annual Conference of the Market Research Society, Brighton, UK. (Best New Thinking Award, Shortlisted for Best Paper Award and for ASC/MRS Tech Effectiveness Award)
      2010: A. Esuli and F. Sebastiani. Machines that learn how to code open-ended survey data. International Journal of Market Research, 52(6). (Shortlisted for best 2010 IJMR paper)
  • 3. Prequel: ML for Automated Verbatim Coding (cont’d)
      Based on these principles we have built a software system, called VCS (“Verbatim Coding System”), which has been variously applied to coding surveys in the social sciences, customer relationship management, and market research.
      VCS is based on a “supervised learning” metaphor: the classifier learns (or: is trained), from sample manually classified verbatims, the characteristics a new verbatim should have in order to be attributed a given code.
      The human operator who feeds sample manually classified verbatims to the system plays the role of the “supervisor”.
  • 4. A Verbatim Coding System based on ML
  • 8. What I’ll be talking about today
      A talk about the role of humans in the verbatim coding process, and about how to best support their work.
      I will be looking at scenarios in which
        1. automated verbatim coding technology is used ...
        2. ... but the level of accuracy that can be obtained from the classifier is not considered sufficient ...
        3. ... with the consequence that one or more human coders are asked to inspect (and correct where appropriate) a portion of the classification decisions, with the goal of increasing overall accuracy.
      Problem: How can we support / optimize the work of the human coders?
  • 12. A worked out example
                       true Y    true N
        predicted Y    TP = 4    FP = 3
        predicted N    FN = 4    TN = 9
      $F_1 = \frac{2TP}{2TP + FP + FN} = 0.53$
  • 14. A worked out example (cont’d)
                       true Y    true N
        predicted Y    TP = 5    FP = 3
        predicted N    FN = 3    TN = 9
      $F_1 = \frac{2TP}{2TP + FP + FN} = 0.63$
  • 15. A worked out example (cont’d)
                       true Y    true N
        predicted Y    TP = 5    FP = 2
        predicted N    FN = 3    TN = 10
      $F_1 = \frac{2TP}{2TP + FP + FN} = 0.67$
  • 16. A worked out example (cont’d)
                       true Y    true N
        predicted Y    TP = 6    FP = 2
        predicted N    FN = 2    TN = 10
      $F_1 = \frac{2TP}{2TP + FP + FN} = 0.75$
  • 17. A worked out example (cont’d)
                       true Y    true N
        predicted Y    TP = 6    FP = 1
        predicted N    FN = 2    TN = 11
      $F_1 = \frac{2TP}{2TP + FP + FN} = 0.80$
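The F1 values in the worked example can be re-derived from the counts on each slide. The following is a minimal sketch, not the authors' code; the function name `f1` and the convention for an empty table are assumptions made for illustration.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN), computed from contingency-table counts."""
    denominator = 2 * tp + fp + fn
    # Assumption: with no positives at all (true or predicted), F1 is taken as 1.
    return 2 * tp / denominator if denominator else 1.0


# (TP, FP, FN, TN) at each step of the worked example in the slides.
steps = [(4, 3, 4, 9), (5, 3, 3, 9), (5, 2, 3, 10), (6, 2, 2, 10), (6, 1, 2, 11)]
scores = [f1(tp, fp, fn) for tp, fp, fn, _tn in steps]

for (tp, fp, fn, tn), score in zip(steps, scores):
    print(f"TP={tp} FP={fp} FN={fn} TN={tn}  F1={score:.3f}")
```

The five scores are 8/15, 10/16, 10/15, 12/16 and 12/15, which agree with the rounded figures on the slides. TN never enters the formula, so turning a false positive into a true negative raises F1 only by shrinking FP.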
  • 18. What I’ll be talking about (cont’d)
      We need methods that
        given a desired level of accuracy, minimize the human coders’ effort necessary to achieve it;
        alternatively, given an available amount of human coders’ effort, maximize the accuracy that can be obtained through it.
      This can be achieved by ranking the automatically classified verbatims in such a way that, by starting the inspection from the top of the ranking, the cost-effectiveness of the human coders’ work is maximized.
      We call the task of generating such a ranking Semi-Automatic Text Classification (SATC).
  • 20. What I’ll be talking about (cont’d)
      Previous work has addressed SATC via techniques developed for active learning.
      In both cases, the automatically classified verbatims are ranked with the goal of having the human coder start inspecting/correcting from the top; however,
        in active learning the goal is providing new training examples;
        in SATC the goal is increasing the overall accuracy of the classified set.
      We claim that a ranking generated “à la active learning” is suboptimal for SATC [1].
      [1] G. Berardi, A. Esuli, F. Sebastiani. A Utility-Theoretic Ranking Method for Semi-Automated Text Classification. Proceedings of the 35th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2012), Portland, US, 2012.
  • 22. A Verbatim Coding System based on ML
  • 24. Outline of this talk
      1. We discuss how to measure “error reduction” (i.e., the increase in accuracy deriving from the human coder’s inspection activity)
      2. We discuss a method for maximizing the expected error reduction for a fixed amount of annotation effort
      3. We show some promising experimental results
  • 25. Outline
      1. Error Reduction, and How to Measure it
      2. Error Reduction, and How to Maximize it
      3. Some Experimental Results
  • 26. Error Reduction, and how to Measure it
      Assume we have
        1. a code c;
        2. a classifier h for c;
        3. a set of unlabeled verbatims D that we have automatically classified by means of h, so that every verbatim in D is associated with a binary decision (Yes or No) and a confidence score (a positive real number);
        4. a measure of accuracy A, ranging on [0,1].
  • 27. Error Reduction, and how to Measure it (cont’d)
      We will assume that A is $F_1 = \frac{2 \cdot TP}{(2 \cdot TP) + FP + FN}$, but any measure of accuracy based on a contingency table may be used.
      An amount of error, measured as E = (1 − A), is present in the automatically classified set D.
      Human coders inspect-and-correct a portion of D with the goal of reducing the error present in D.
  • 29. Error Reduction, and how to Measure it (cont’d)
      We define error at rank n (noted as E(n)) as the error still present in D after the coder has inspected the verbatims at the first n rank positions:
        E(0) is the initial error generated by the automated classifier;
        E(|D|) is 0.
      We define error reduction at rank n (noted as ER(n)) to be $ER(n) = \frac{E(0) - E(n)}{E(0)}$, i.e., the error reduction obtained by the human coder who inspects the verbatims at the first n rank positions:
        ER(n) ∈ [0, 1];
        ER(n) = 0 indicates no reduction;
        ER(n) = 1 indicates total elimination of error.
  • 31. Error Reduction, and how to Measure it (cont’d)
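The definitions of E(n) and ER(n) above can be illustrated with the counts from the worked example. This is a sketch under two assumptions: that A is F1, and that each of the four steps corresponds to one inspected verbatim that turned out to be misclassified. In general, n also counts inspected verbatims that were already correct. The names `f1` and `error_reduction` are hypothetical.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); taken as 1 for an empty table (assumption)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 1.0


def error_reduction(errors: list[float]) -> list[float]:
    """Map a sequence E(0), E(1), ..., E(n) to ER(k) = (E(0) - E(k)) / E(0)."""
    initial = errors[0]
    if initial == 0:
        # Assumption: with no initial error there is nothing to reduce.
        return [0.0] * len(errors)
    return [(initial - e) / initial for e in errors]


# (TP, FP, FN) after 0, 1, 2, 3 and 4 corrections in the worked example.
counts = [(4, 3, 4), (5, 3, 3), (5, 2, 3), (6, 2, 2), (6, 1, 2)]
errors = [1 - f1(*c) for c in counts]  # E = 1 - A
er = error_reduction(errors)

for n, value in enumerate(er):
    print(f"ER({n}) = {value:.3f}")
```

Under these assumptions the error falls from 7/15 to 3/15 over the four corrections, so ER(4) = 4/7.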
  • 32. Outline
      1. Error Reduction, and How to Measure it
      2. Error Reduction, and How to Maximize it
      3. Some Experimental Results
  • 33. Error Reduction, and how to Maximize it
      Problem: How should we rank the verbatims in D so as to maximize the expected error reduction?
  • 40. Error Reduction, and how to Maximize it
      Intuition 1: Verbatims that have a higher probability of being misclassified should be ranked higher.
      Intuition 2: Verbatims that, if corrected, bring about a higher gain (i.e., a bigger increase in A) should be ranked higher.
      This means that verbatims that have a higher utility (= probability × gain) should be ranked higher.
      A false positive and a false negative may have different impacts on A!
      While in active learning only the probability of misclassification is relevant, in SATC gains are also relevant.
  • 41. Error Reduction, and how to Maximize it (cont’d)
      Given a set Ω of mutually disjoint events, a utility function is defined as $U(\Omega) = \sum_{\omega \in \Omega} P(\omega) G(\omega)$, where
        P(ω) is the probability of occurrence of event ω;
        G(ω) is the gain obtained if event ω occurs.
      We can thus estimate the utility, for the aims of increasing A, of manually inspecting a verbatim d as $U(TP, TN, FP, FN) = P(FP) \cdot G(FP) + P(FN) \cdot G(FN)$, provided we can estimate
        if d is labelled with code c: P(FP) and G(FP);
        if d is not labelled with code c: P(FN) and G(FN).
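The utility-based ranking described above can be sketched as follows. This is an illustration only, not the authors' implementation: the `Verbatim` record, the probabilities and the two gain values are all hypothetical, and the probabilities are assumed to be already calibrated.

```python
from typing import NamedTuple


class Verbatim(NamedTuple):
    ident: str
    predicted_yes: bool  # True if the classifier assigned code c
    p_error: float       # estimated probability that this decision is wrong


def utility(v: Verbatim, gain_fp: float, gain_fn: float) -> float:
    """Utility = probability of misclassification times the gain of fixing it.

    A verbatim labelled with c can only be a false positive, and one not
    labelled with c can only be a false negative, so one term applies.
    """
    gain = gain_fp if v.predicted_yes else gain_fn
    return v.p_error * gain


def rank(verbatims: list[Verbatim], gain_fp: float, gain_fn: float) -> list[Verbatim]:
    """Sort verbatims by decreasing utility (ties keep their input order)."""
    return sorted(verbatims, key=lambda v: utility(v, gain_fp, gain_fn), reverse=True)


# Hypothetical data for illustration only.
docs = [
    Verbatim("d1", True, 0.30),
    Verbatim("d2", False, 0.30),
    Verbatim("d3", True, 0.10),
    Verbatim("d4", False, 0.45),
]

utility_order = [v.ident for v in rank(docs, gain_fp=0.02, gain_fn=0.05)]
baseline_order = [v.ident for v in rank(docs, gain_fp=1.0, gain_fn=1.0)]
print("utility-theoretic:", utility_order)
print("baseline (gains = 1):", baseline_order)
```

With equal gains the ranking depends only on the probability of misclassification. With unequal gains, d2 overtakes d1 even though both have the same probability of error.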
  • 43. Error Reduction, and how to Maximize it (cont’d)
      Estimating P(FP) and P(FN) (the probability of misclassification) can be done by converting the confidence score returned by the classifier into a probability of correct classification.
        Tricky: requires probability “calibration” via a generalized sigmoid function to be optimized via k-fold cross-validation.
      Gains G(FP) and G(FN) can be defined “differentially”; i.e.,
        the gain obtained by correcting a FN is $(A^{FN \to TP} - A)$;
        the gain obtained by correcting a FP is $(A^{FP \to TN} - A)$.
        Gains need to be estimated by estimating the contingency table on the training set via k-fold cross-validation.
      Key observation: in general, $G(FP) \neq G(FN)$.
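The “differential” definition of the gains can be made concrete with F1 as the accuracy measure A. In this sketch the contingency-table counts are simply taken as given (the starting table of the worked example); in the method described above they would be estimated on the training set via k-fold cross-validation. The function names are hypothetical.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); taken as 1 for an empty table (assumption)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 1.0


def differential_gains(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (G(FP), G(FN)): the increase in F1 from correcting one error.

    Correcting a FP turns it into a TN (FP decreases by one).
    Correcting a FN turns it into a TP (FN decreases, TP increases by one).
    A gain is 0 when there is no error of that type left to correct.
    """
    accuracy = f1(tp, fp, fn)
    gain_fp = f1(tp, fp - 1, fn) - accuracy if fp > 0 else 0.0
    gain_fn = f1(tp + 1, fp, fn - 1) - accuracy if fn > 0 else 0.0
    return gain_fp, gain_fn


# Starting table of the worked example: TP = 4, FP = 3, FN = 4.
gain_fp, gain_fn = differential_gains(tp=4, fp=3, fn=4)
print(f"G(FP) = {gain_fp:.4f}, G(FN) = {gain_fn:.4f}")
```

For this table, correcting a false negative raises F1 from 8/15 to 10/16, while correcting a false positive raises it only to 8/14. This illustrates the key observation that the two gains generally differ.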
  • 46. Outline
      1. Error Reduction, and How to Measure it
      2. Error Reduction, and How to Maximize it
      3. Some Experimental Results
  • 47. Some Experimental Results
        Dataset          # Codes    # Training    # Test
        Reuters-21578    115        9603          3299
      Baseline: ranking by probability of misclassification (“à la active learning”), equivalent to applying our ranking method with G(FP) = G(FN) = 1.
  • 48. [Figure: Error Reduction (ER) as a function of Inspection Length, both on a 0.0 to 1.0 scale. Learner: MP-Boost; Dataset: Reuters-21578; Type: Macro. Curves: Random, Baseline, Utility-theoretic, Oracle.]
  • 49. A few side notes
      This approach allows the human coder to know, at any stage of the inspection process, what the estimated accuracy is at that stage; this is obtained by
        estimating accuracy at the beginning of the process, via k-fold cross-validation;
        updating after each correction is made.
      This approach lends itself to having more than one coder working in parallel on the same inspection-and-correction task.
      Recent research I have not discussed today:
        a “dynamic” SATC method in which gains are updated after each correction is performed;
        “microaveraging”- and “macroaveraging”-oriented methods.
  • 51. Concluding Remarks
      Take-away message: Semi-automatic text classification needs to be addressed as a task in its own right.
        Active learning typically makes use of probabilities of misclassification but does not make use of gains ⇒ ranking “à la active learning” is suboptimal for SATC.
      The use of utility theory means that the ranking algorithm is optimized for a specific accuracy measure ⇒ choose the accuracy measure that best mirrors your applicative needs (e.g., Fβ with β > 1), and choose it well!
      SATC is important, since in more and more application contexts the accuracy obtainable via completely automatic text classification is not sufficient; more and more frequently humans will need to enter the loop.
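For readers unfamiliar with the Fβ measure mentioned above, the following sketch uses its standard contingency-table form, $F_\beta = \frac{(1+\beta^2) TP}{(1+\beta^2) TP + \beta^2 FN + FP}$. The function name and the example counts (the final table of the worked example) are chosen for illustration only.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta from contingency-table counts; beta = 1 gives F1.

    With beta > 1 false negatives weigh more than false positives.
    """
    b2 = beta * beta
    denominator = (1 + b2) * tp + b2 * fn + fp
    # Assumption: an empty table (no positives, none predicted) scores 1.
    return (1 + b2) * tp / denominator if denominator else 1.0


# Final table of the worked example: TP = 6, FP = 1, FN = 2.
print(f"F1 = {f_beta(6, 1, 2):.3f}")
print(f"F2 = {f_beta(6, 1, 2, beta=2.0):.3f}")
```

Because the gains G(FP) and G(FN) are differences of the chosen measure, switching from F1 to F2 changes the gains and therefore the resulting ranking.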
  • 52. Thank you!