A contribution to ranking and description of
classifications
PhD examination
Germán Sánchez-Hernández
Dept. of Enginyeria de Sistemes, Automàtica i Informàtica Industrial (ESAII)
Universitat Politècnica de Catalunya – BarcelonaTech (UPC)
Av. Diagonal 647, 08034 Barcelona
ESADE Business School, Universitat Ramon Llull (URL)
Avda. Torreblanca 59, 08172 Sant Cugat del Vallès
german.sanchez@esade.edu
Co-Advisors: Juan Carlos Aguado Chao & Núria Agell Jané
September 13th, 2013
Introduction
Literature review
Fuzzy criteria for selecting classifications
NL-based automatic qualitative description of clusters
Application to market segmentation
Conclusions
Outline
1 Introduction
2 Literature review
3 Fuzzy criteria for selecting classifications
4 NL-based automatic qualitative description of clusters
5 Application to market segmentation
6 Conclusions
2 / 51 Germán Sánchez-Hernández Ranking and description of classifications
1 Introduction
Motivation and framework
Objectives
Theoretical background
Motivation and framework
Motivation in Marketing for developing a novel and complete fuzzy
MCDM (Multi-Criteria Decision Making) methodology:
1. Generation of segmentations
Unsupervised learning.
2. Ranking and selection of segmentations
Criteria to assess segmentations.
Aggregation of the assessments.
3. Description of the best segmentation
Most important features.
Natural language.
Objectives
1 Generation of classifications (Theoretical background).
2 Evaluation by fuzzy validation criteria (Chapter 3).
3 Ranking by aggregating the assessments (Chapter 2).
4 NLG (Natural Language Generation) system (Chapter 4).
5 Application (Chapter 5).
Overview
2 Literature review
Criteria for selecting classifications
Aggregation functions
Data-to-text systems
Criteria for selecting classifications
Alternatives for selecting classifications:
[Kukar, 2003; Osei-Bryson, 2010], classical sequential approach.
[Halkidi et al., 2002] translates the problem into a search one.
[Broder et al., 2008] new classification by voting.
Types of validation criteria [Jain et al., 1999; Theodoridis and Koutroumbas,
2008]
Internal criteria: analyse internal structure
Compactness, separability, prediction strength... [Liu et al., 2010].
External criteria: compare with external structure
Rand, Jaccard, Fowlkes-Mallows, Minkowski indexes... [Wu et al., 2009].
Relative criteria: manual pairwise comparisons (avoided)
[Jain et al., 1999].
Summary Validation Criteria
Internal criteria: Compact., Separab., Accuracy, Feat., Goals.
| Paper | Comments | Compact. | Separab. | Accuracy | Feat. | Goals | Extern. criteria | Relat. criteria |
|---|---|---|---|---|---|---|---|---|
| Ramze Rezaee et al. (1998) | One index for fuzzy c-Means | Yes: compact. | Yes: separat. | No | No | No | No | No |
| Cheng et al. (1999) | Subspace clustering | Yes: high density | No | No | No | Yes: cover. & correl. of dim's | No | No |
| Kukar (2003) | Reliability on diagn. | No | No | Yes: reliability | No | No | No | No |
| Halkidi et al. (2001) | Review | Yes: several | Yes: several | No | No | No | Yes: several | Yes: several |
| Choi et al. (2005) | Association rules | No | No | No | No | Yes: R-F-MV | No | No |
| Tibshirani & Walther (2005) | Valid. by prediction strength | Yes: variance | Yes: bias | Yes: prediction strength | No | No | No | No |
| Yatskiv & Gusarova (2005) | Review | No | No | No | No | No | Yes: several | Yes: several |
| Method presented | Review & application | Yes: IC (coherence) | Yes: IC (coherence) | Yes: IA (accuracy) | No | Yes: IU & IB (useful. & balanced) | Yes: ID (dep.) | No |
Summary Validation Criteria (continued)
Internal criteria: Compact., Separab., Accuracy, Feat., Goals.
| Paper | Comments | Compact. | Separab. | Accuracy | Feat. | Goals | Extern. criteria | Relat. criteria |
|---|---|---|---|---|---|---|---|---|
| Bittmann & Gelbard (2009) | Visualisation of hierarch. clustering | Yes: min heterog. | No | No | No | Yes: visualis. | No | No |
| Wang et al. (2009) | Clinical application | Yes: Davies-Bouldin & rel.-free | Yes: Davies-Bouldin & rel.-free | No | No | No | No | No |
| Wu et al. (2009) | External criteria for k-Means | No | No | No | No | No | Yes: several | No |
| Xiong et al. (2009) | k-Means | Yes: Sum of Sq. Errors | Yes: entropy and CV | Yes: F-measure | No | No | No | No |
| Liu et al. (2010) | Internal criteria review | Yes: several | Yes: several | No | No | No | No | No |
| Osei-Bryson (2010) | Review | No | No | Yes: accuracy | Yes: # of imp. vars | Yes: outliers, Max/Min | No | Yes: several |
| Method presented | Review & application | Yes: IC (coherence) | Yes: IC (coherence) | Yes: IA (accuracy) | No | Yes: IU & IB (useful. & balanc.) | Yes: ID (dep.) | No |
Aggregation functions
Two steps in MCDM [Fodor and Roubens, 1994]:
1 Aggregation of the single evaluations.
2 Exploitation by generating a ranking of the alternatives.
Many different families of aggregation functions [Chiclana et al., 2004,
2007; Dubois and Prade, 1985; Fodor and Roubens, 1994; Herrera et al., 2003;
Klir and Folger, 1988; Torra, 1997; Torra and Narukawa, 2007; Xu and Da,
2003; Yager, 1988; Zhou et al., 2008].
OWA operator [Yager, 1988]:
$$W(a_1, \dots, a_n) = \sum_{i=1}^{n} w_i \, a_{\sigma(i)},$$
where $a_{\sigma(i)}$ is the $i$-th largest of the values $a_1, \dots, a_n$.
Weights $w_i$ via a linguistic quantifier [Zadeh, 1983]:
$$w_i = Q\left(\frac{i}{n}\right) - Q\left(\frac{i-1}{n}\right),$$
where $Q$ is a RIM quantifier: $Q(r) = r^{\alpha}$.
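The OWA aggregation with quantifier-guided weights can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function names `rim_weights` and `owa`, the value `alpha=2.0` and the sample scores are all assumptions chosen for the example.

```python
from typing import List, Sequence


def rim_weights(n: int, alpha: float) -> List[float]:
    """Weights w_i = Q(i/n) - Q((i-1)/n) for the RIM quantifier Q(r) = r**alpha."""
    if n < 1 or alpha <= 0:
        raise ValueError("n must be >= 1 and alpha must be > 0")
    return [(i / n) ** alpha - ((i - 1) / n) ** alpha for i in range(1, n + 1)]


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: the i-th weight multiplies the i-th largest value."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values, reverse=True)
    return sum(w * a for w, a in zip(weights, ordered))


# Hypothetical data: four criterion assessments and alpha = 2.
example_weights = rim_weights(4, alpha=2.0)
example_scores = [0.2, 0.9, 0.5, 0.7]
example_owa = owa(example_scores, example_weights)
print(example_weights, example_owa)
```

With `alpha = 1` the weights are uniform and the operator reduces to the arithmetic mean; values of `alpha` above 1 shift weight towards the smallest assessments, and values below 1 towards the largest ones.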
Data-to-text Systems
| Paper | System | Application area | Input data | Users | Rules |
|---|---|---|---|---|---|
| Goldberg et al. (1994) | FoG | Weather forecasting | Time series | Forecasters | Yes |
| Reiter et al. (2005) | Forecasting texts | Weather forecasting | Time series | Forecasters | No |
| Sripada et al. (2003) | SumTime-Mousam | Weather forecasting | Time series | Forecasters | Yes |
| Cawsey et al. (2000) | Piglit | Medicine | Events | Patients | Yes |
| Hallett and Scott (2005) | Summaries of events | Medicine | List of events | Medical staff & patients | Yes |
| Harris (2008) | Narrative Engine: text summaries | Medicine | Events | Medical staff | No |
| Hüske-Kraus (2003b) | Review of applications | Medicine | Raw data | Medical staff | No |
| Hüske-Kraus (2003a) | Suregen-2: routine reports | Medicine | | Medical staff | No |
| Kahn et al. (1991) | Topaz | Medicine | | | Yes |
| Portet et al. (2009) | BT-45: neonatal summaries | Medicine | Raw data from sensors | Medical staff & patients | Yes |
| Reiter et al. (2003) | Stop: personalised reports | Medicine | Manual input | Patients | Yes |
| Method presented (2013) | Description of groups | Generic | Tabular data | Generic | Yes |
Summary Data-to-text systems
| Paper | System | Application area | Input data | Users | Rules |
|---|---|---|---|---|---|
| Ferres et al. (2006) | iGraph | Accessibility | Graphical data | Visually-imp. | Yes |
| Thomas & Sripada (2008) | Atlas.txt | Accessibility | Geo-referenced data | Visually-imp. | No |
| Kukich (1983) | Ana: textual stock market | Financial | Time series | Stock marketers | Yes |
| Hammond and Davis (2005) | Ladder | Image | Sketches | | Yes |
| Herzog and Wazinski (1994) | Vitru | Image | Visual scenes | | No |
| Roy (2002) | Describer | Image | Visual scenes | | Yes |
| Sripada and Gao (2007) | ScubaText | Sports | | Scuba divers | No |
| Iordanskaja et al. (1992) | Summaries | Generic | Statistical data | | Yes |
| McKeown et al. (1994) | PLANDoc | Generic | List of events | | No |
| Method presented (2013) | Description of groups | Generic | Tabular data | Generic | Yes |
3 Fuzzy criteria for selecting classifications
Motivation
First criterion: useful number of classes
Second criterion: balanced classes
Third criterion: coherent classification
Fourth criterion: dependency on external variables
Fifth criterion: accuracy of the predictive model
Motivation
First criterion: useful number of classes
Objective: a sufficient but small enough number of clusters.
Definition
Given a classification C, the index of usefulness is characterised by the
following membership function:
$$I_{U,K_1,K_2}(C) = \begin{cases} f_1(M), & \text{if } 1 \le M < K_1; \\ 1, & \text{if } K_1 \le M \le K_2; \\ f_2(M), & \text{if } K_2 < M \le N, \end{cases} \qquad (1)$$
where $M \in \mathbb{N}$ is the number of classes of C; $K_1, K_2 \in \mathbb{N}$, with $K_1 < K_2$, are two prefixed parameters; $f_1$ is a strictly increasing function and $f_2$ is a strictly decreasing function, verifying $f_1(1) = f_2(N) = 0$ and $f_1(K_1) = f_2(K_2) = 1$.
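The membership function in (1) can be sketched as follows. The definition leaves $f_1$ and $f_2$ open, so the linear defaults below are an assumption made only for illustration; the function name `usefulness_index` and its parameter names are likewise hypothetical.

```python
from typing import Callable, Optional


def usefulness_index(
    m: int,
    k1: int,
    k2: int,
    n: int,
    f1: Optional[Callable[[int], float]] = None,
    f2: Optional[Callable[[int], float]] = None,
) -> float:
    """Index of usefulness for a classification with m classes over n individuals."""
    if not 1 <= k1 < k2 <= n:
        raise ValueError("parameters must satisfy 1 <= K1 < K2 <= N")
    if not 1 <= m <= n:
        raise ValueError("the number of classes must satisfy 1 <= M <= N")
    if f1 is None:
        # Assumed default: linear rise from f1(1) = 0 to f1(K1) = 1.
        f1 = lambda x: (x - 1) / (k1 - 1)
    if f2 is None:
        # Assumed default: linear fall from f2(K2) = 1 to f2(N) = 0.
        f2 = lambda x: (n - x) / (n - k2)
    if m < k1:
        return f1(m)
    if m <= k2:
        return 1.0
    return f2(m)


# K1 = 4 and K2 = 7 with a hypothetical data set of N = 260 individuals.
for classes in (1, 2, 4, 7, 8, 260):
    print(classes, round(usefulness_index(classes, 4, 7, 260), 3))
```

Any other pair of functions can be passed through `f1` and `f2`, provided they meet the boundary conditions stated in the definition.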
First criterion: examples
Examples of functions for K1 = 4 and K2 = 7:
Second criterion: balanced classes
Objective: to avoid (or boost) unbalanced classifications.
Definition
Given a classification C, the index of balanced classes of C is defined as:
$$I_B(C) = \frac{\max_Y(CV_Y) - CV_C}{\max_Y(CV_Y) - \min_Y(CV_Y)}, \qquad (2)$$
where $CV_C$ is the coefficient of variation associated with C, and $\min_Y(CV_Y)$ and $\max_Y(CV_Y)$ are given as per Propositions 3.1 and 3.3, respectively.
If unbalanced classes are required, the complement is used:
$$\bar{I}_B(C) = 1 - I_B(C).$$
Second criterion: example
Example of IB for two classifications:
Let C1 and C2 be the following two different classifications of the same data set
consisting of N = 260 individuals:
Y1 = {90, 80, 90}; Y2 = {110, 30, 20, 90}
CV_Y1 = 0.51; CV_Y2 = 4.43.
min(CV_Y) = 0; max(CV_Y) = 6.18.
I_B(C1) = 0.92; I_B(C2) = 0.28.
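The two index values above follow directly from equation (2) once the coefficients of variation and their bounds are known. The sketch below takes those quantities as inputs, since the propositions that define the bounds are not reproduced here; the function name `balance_index` and the `prefer_unbalanced` flag are hypothetical.

```python
def balance_index(cv: float, cv_min: float, cv_max: float,
                  prefer_unbalanced: bool = False) -> float:
    """Index of balanced classes: 1 for the most balanced case, 0 for the least."""
    if cv_max <= cv_min:
        raise ValueError("cv_max must be greater than cv_min")
    index = (cv_max - cv) / (cv_max - cv_min)
    # When unbalanced classifications are the goal, the complement is used instead.
    return 1.0 - index if prefer_unbalanced else index


# Values taken from the example: CV bounds 0 and 6.18, CV_Y1 = 0.51, CV_Y2 = 4.43.
ib_c1 = balance_index(0.51, 0.0, 6.18)
ib_c2 = balance_index(4.43, 0.0, 6.18)
print(round(ib_c1, 2), round(ib_c2, 2))
```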
Third criterion: coherent classification
Objective: to ensure that the Global Adequacy Degrees (GAD)
are obtained from similar values of Marginal Adequacy Degrees
(MAD).
Definition
The index of coherence of classification is given as follows:
$$I_C(C) = 1 - \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} \left[\max_k(\mu_{ijk}) - \min_k(\mu_{ijk})\right]}{M \cdot N}, \qquad (3)$$
where $N \in \mathbb{N}$ is the number of individuals, $M \in \mathbb{N}$ is the number of classes of classification C, and $\mu_{ijk}$ is the MAD of individual j to class i according to descriptor k.
Third criterion: example
Examples of IC for two classifications
Let’s consider two classifications C1 and C2 consisting of two (A and B) and
three classes (C, D and E), respectively. MADs are shown next:
Class A d1 d2 d3
i1 0.3 0.4 0.6
i2 0.2 0.3 0.2
i3 0.5 0.5 0.3
Class B d1 d2 d3
i1 0.7 0.5 0.5
i2 0.4 0.8 0.5
i3 0.9 0.6 0.7
Class C d1 d2 d3
i1 0.1 0.6 0.4
i2 0.7 0.2 0.3
i3 0.4 0.8 0.3
Class D d1 d2 d3
i1 0.1 0.7 0.3
i2 0.5 0.2 0.7
i3 0.9 0.3 0.4
Class E d1 d2 d3
i1 0.9 0.2 0.7
i2 0.6 0.9 0.6
i3 0.7 0.6 0.1
MADs of C1 more homogeneous → IC(C1) = 0.75 > IC(C2) = 0.47
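The two coherence values can be checked against equation (3) with a short script. The data are the MAD tables above; the function name `coherence_index` and the nested-list layout (class, then individual, then descriptor) are choices made for this sketch.

```python
from typing import Dict, List

# MADs per class: one row per individual (i1..i3), one column per descriptor (d1..d3).
C1: Dict[str, List[List[float]]] = {
    "A": [[0.3, 0.4, 0.6], [0.2, 0.3, 0.2], [0.5, 0.5, 0.3]],
    "B": [[0.7, 0.5, 0.5], [0.4, 0.8, 0.5], [0.9, 0.6, 0.7]],
}
C2: Dict[str, List[List[float]]] = {
    "C": [[0.1, 0.6, 0.4], [0.7, 0.2, 0.3], [0.4, 0.8, 0.3]],
    "D": [[0.1, 0.7, 0.3], [0.5, 0.2, 0.7], [0.9, 0.3, 0.4]],
    "E": [[0.9, 0.2, 0.7], [0.6, 0.9, 0.6], [0.7, 0.6, 0.1]],
}


def coherence_index(mads: Dict[str, List[List[float]]]) -> float:
    """One minus the mean spread (max - min over descriptors) of the MADs."""
    m = len(mads)                          # number of classes
    n = len(next(iter(mads.values())))     # number of individuals
    total_spread = sum(
        max(row) - min(row) for rows in mads.values() for row in rows
    )
    return 1.0 - total_spread / (m * n)


ic_c1 = coherence_index(C1)
ic_c2 = coherence_index(C2)
print(round(ic_c1, 2), round(ic_c2, 2))
```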
Fourth criterion: dependency on external variables
Objective: high relation with an external variable.
Definition
Given a classification C, its index of dependency on a control variable is
defined as:
$$I_D(C) = \frac{\chi^2}{N \cdot \sqrt{M - 1} \cdot \sqrt{S - 1}}, \qquad (4)$$
where N is the number of individuals, M is the number of classes of C and S is
the number of unique values of the control variable, if it is qualitative, or the
number of considered intervals in the discretisation if it is quantitative.
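A minimal sketch of the dependency index follows. It assumes that the numerator of (4) is the Pearson chi-square statistic of the contingency table that crosses the M classes with the S values of the control variable; the function names and both sample tables are hypothetical.

```python
import math
from typing import List


def chi_square(table: List[List[float]]) -> float:
    """Pearson chi-square statistic of a contingency table (rows: classes)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    return statistic


def dependency_index(table: List[List[float]]) -> float:
    """Chi-square normalised by N * sqrt(M - 1) * sqrt(S - 1)."""
    m, s = len(table), len(table[0])
    if m < 2 or s < 2:
        raise ValueError("at least two classes and two control values are needed")
    n = sum(sum(row) for row in table)
    return chi_square(table) / (n * math.sqrt(m - 1) * math.sqrt(s - 1))


# Hypothetical tables: complete dependency and complete independence.
dependent = [[10, 0], [0, 10]]
independent = [[5, 5], [5, 5]]
print(dependency_index(dependent), dependency_index(independent))
```

For a square table, the index reaches 1 when each class maps to exactly one value of the control variable, and it is 0 when classes and control values are independent.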
Fifth criterion: accuracy of the predictive model
Objective: high predictability.
Definition
Given a classification C, its index of accuracy is defined as:
$$I_A(C) = 2 \cdot \frac{\mathrm{precision}(C) \cdot \mathrm{recall}(C)}{\mathrm{precision}(C) + \mathrm{recall}(C)}, \qquad (5)$$
where precision(C) and recall(C) are the weighted averages of the precision and recall of the classes of C, respectively.
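Equation (5) can be illustrated from a confusion matrix. The sketch assumes that the per-class averages are weighted by class size (the number of true members of each class), which the definition does not state explicitly; the function name `accuracy_index` and the sample matrix are hypothetical.

```python
from typing import List


def accuracy_index(confusion: List[List[int]]) -> float:
    """F-measure of the support-weighted precision and recall.

    confusion[t][p] counts individuals of true class t predicted as class p.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    precision = 0.0
    recall = 0.0
    for c in range(k):
        support = sum(confusion[c])                        # true members of class c
        predicted = sum(confusion[t][c] for t in range(k))  # predicted as class c
        hits = confusion[c][c]
        weight = support / total
        precision += weight * (hits / predicted if predicted else 0.0)
        recall += weight * (hits / support if support else 0.0)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical two-class confusion matrix (rows: true class, columns: prediction).
sample_confusion = [[8, 2], [1, 9]]
ia_sample = accuracy_index(sample_confusion)
print(round(ia_sample, 4))
```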
4 NL-based automatic qualitative description of clusters
Motivation
Architecture
Signal Analysis
Data Interpretation
Document Planning
Microplanning and Realisation
Motivation
Interpretation through a natural description.
NLG system based on a four-stage architecture [Reiter, 2007]:
1 Signal analysis: basic patterns.
2 Data interpretation: messages and relations.
3 Document planning: messages selection.
4 Microplanning and realisation: text generation.
Architecture
1. Signal Analysis
Objectives: to select variables and to identify the important
values.
2. Data Interpretation
Objective: to avoid redundant information (type-A rules).
1 Discarding negative messages.
2 Discarding messages obtained from VoIs.
3 Discarding modalities.
4 Discarding variables (same sign).
5 Discarding variables (different sign).
Rule Class Variable Modality Type Sign
A.1 Yes Yes Yes
A.2 Yes Yes Yes Yes
A.3 Yes Yes
A.4 Yes Yes Yes Yes
A.5 Yes Yes Yes
3. Document Planning
Objectives: to identify relations and modifications.
Type-B rules:
1 Merging modalities of an ordinal variable.
2 Merging modalities.
3 Merging variables with the same modalities.
4 Adding single modalities.
5 Merging single messages.
Type-C rules:
1 Use of the semantics of modality “no”.
2 Use of the semantics of modality “yes”.
3 Use of the semantics of variables.
4 Modalities as adjectives.
5 Use of linguistic quantifiers.
4. Microplanning and Realisation
Objectives: final structure and transcription.
1 Microplanning: sorts groups of messages by importance.
2 Realisation: transcription.
5 Application to market segmentation
Case Presentation
Dataset
Obtaining segmentations
Ranking and selecting segmentations
Qualitative description
Case Presentation
Objective: to apply the methodology reviewed and developed to
a real marketing problem in a B2B environment.
Challenge: to understand fluctuations in the orders made by the shops (limited resources).
Objective: to segment the set of retailers.
Actions:
1 Unsupervised learning process.
2 Selection of segmentations.
3 Natural language description.
Dataset
Grifone:
Outdoor sporting
equipment firm
Firm: Textil Seu, SA.
La Seu d’Urgell
(north of Lleida).
More than 25 years.
260 points of sale.
16 variables.
Antiquity
Assistants
Evaluation
DisplayGrifone
Location
Internet
ThermalExhibitor
Specialists
Aesthetics
Communication
Competition
DisplaySize
GrifoneWeight
Maintenance
PromosSensit
Size
Obtaining segmentations
Unsupervised learning technique.
LAMDA (Learning Algorithm for Multivariate Data Analysis)
[Aguado, 1998; Aguado et al., 1999; Aguilar and López de Mántaras, 1982].
Tolerance.
Hybrid connectives (MinMax, Probabilistic Product, Lukasiewicz, Frank t-norms).
Density functions.
PromosSensit as external variable.
→ 566 segmentations (between 1 and 244 classes).
Criteria
1 Usefulness:
K1 = 3 and K2 = 5.
Linear-exponential functions:
$$f_1(M) = \frac{M - 1}{3 - 1}; \qquad f_2(M) = \frac{e^{260 - M} - 1}{e^{260 - 5} - 1}$$
2 Balanced:
Avoid unbalanced classifications →
$$I_B(C) = \frac{\max_Y(CV_Y) - CV_C}{\max_Y(CV_Y) - \min_Y(CV_Y)}.$$
3 Coherence:
Density functions of quantitative variables:
Antiquity: Waissman.
Assistants: Classical.
Evaluation: Gaussian.
4 Dependency:
Chosen variable: PromosSensit.
5 Accuracy:
SVM, 10-fold cross-validation (30 times)
Aggregation of the indexes
OWA operator.
Linguistic quantifier: most of.
RIM function: Q(r) = r^{1/2}.
Computed weights:
(0.447, 0.185, 0.142, 0.120, 0.106).
Results
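Criterion assessments (IU, IB, IC, ID, IA) and aggregated OWA score:
| Rank | Segm. ID | Conn. | Toler. | Classes (M) | IU | IB | IC | ID | IA | OWA |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | #259 | Minmax | 0.439 | 3 | 1 | 0.928 | 0.251 | 0.528 | 0.936 | 0.8423 |
| 2 | #260 | Minmax | 0.469 | 3 | 1 | 0.929 | 0.254 | 0.363 | 0.922 | 0.8207 |
| 3 | #258 | Minmax | 0.422 | 4 | 1 | 0.885 | 0.226 | 0.425 | 0.875 | 0.8103 |
| 4 | #243 | Minmax | 0.290 | 3 | 1 | 0.909 | 0.256 | 0.008 | 0.965 | 0.7868 |
| 5 | #244 | Minmax | 0.304 | 3 | 1 | 0.920 | 0.255 | 0.012 | 0.948 | 0.7855 |
| 6 | #257 | Minmax | 0.411 | 3 | 1 | 0.933 | 0.279 | 0.032 | 0.856 | 0.7786 |
| 7 | #256 | Minmax | 0.400 | 4 | 1 | 0.872 | 0.246 | 0.064 | 0.906 | 0.7751 |
| 8 | #253 | Minmax | 0.359 | 4 | 1 | 0.883 | 0.231 | 0.021 | 0.915 | 0.7722 |
| 9 | #252 | Minmax | 0.352 | 4 | 1 | 0.884 | 0.237 | 0.025 | 0.904 | 0.7712 |
| 10 | #255 | Minmax | 0.393 | 4 | 1 | 0.939 | 0.263 | 0.063 | 0.768 | 0.7686 |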
Segmentation #259:
1 Class 1: 35 shops
2 Class 2: 98 shops
3 Class 3: 127 shops
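As a check on the figures reported for this case, the OWA weights and the scores of the two top-ranked segmentations can be recomputed from the RIM function Q(r) = r^{1/2} and the five criterion assessments. The snippet is only a verification sketch; the variable and function names are hypothetical.

```python
from typing import List, Sequence

N_CRITERIA = 5
ALPHA = 0.5  # RIM quantifier Q(r) = r ** (1/2), "most of"

# Quantifier-guided weights w_i = Q(i/n) - Q((i-1)/n).
case_weights: List[float] = [
    (i / N_CRITERIA) ** ALPHA - ((i - 1) / N_CRITERIA) ** ALPHA
    for i in range(1, N_CRITERIA + 1)
]


def owa_score(assessments: Sequence[float]) -> float:
    """Apply the weights to the assessments sorted from largest to smallest."""
    ordered = sorted(assessments, reverse=True)
    return sum(w * a for w, a in zip(case_weights, ordered))


# Assessments (IU, IB, IC, ID, IA) of segmentations #259 and #260.
score_259 = owa_score([1.0, 0.928, 0.251, 0.528, 0.936])
score_260 = owa_score([1.0, 0.929, 0.254, 0.363, 0.922])

print([round(w, 3) for w in case_weights])
print(round(score_259, 4), round(score_260, 4))
```

Because the largest weight falls on the best assessment, every segmentation with IU = 1 starts from a score of about 0.447, and the ranking is decided by the remaining four indexes.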
1. Signal Analysis
Objectives: to select variables and to identify important values.
Variables Antiquity, Assistants and Evaluation discretised
(CAIM [Kurgan and Cios, 2004]).
Variable
Intervals
P1 P2 P3
Antiquity (years) less than two three more than three
Assistants few many a lot of
Evaluation bad good excellent
Variables Antiquity, Specialists and Internet discarded.
Detection of relevant VoIs and EFs.
Example: tables for variable Competition
Contingency table:
Competition No Weak Strong
1 0 13 21
2 26 38 34
3 24 42 60
Expected frequencies:
Competition No Weak Strong
1 6.6 12.3 15.2
2 19.0 35.3 43.7
3 24.4 45.4 56.2
Values of importance:
Competition No Weak Strong
1 -6.66 0.05 2.25
2 2.59 0.20 -2.15
3 -0.01 -0.26 0.26
Conditional frequencies:
Competition No Weak Strong
1 0.00 0.38 0.62
2 0.27 0.39 0.35
3 0.19 0.33 0.48
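The derived tables can be recomputed from the contingency table alone. The sketch below assumes that a value of importance is the cell's chi-square contribution, (observed - expected)^2 / expected, signed by the direction of the deviation; this reading is inferred from the numbers and is not stated on the slide. With it, the cell for class 1 and modality "No" comes out at about -6.59, where the table above shows -6.66.

```python
from typing import List

# Contingency table for Competition: rows are classes 1-3, columns No/Weak/Strong.
observed: List[List[int]] = [
    [0, 13, 21],
    [26, 38, 34],
    [24, 42, 60],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequencies under independence of class and modality.
expected = [
    [r * c / grand_total for c in col_totals] for r in row_totals
]

# Assumed definition: signed chi-square contribution of each cell.
importance = [
    [
        (1 if o >= e else -1) * (o - e) ** 2 / e
        for o, e in zip(obs_row, exp_row)
    ]
    for obs_row, exp_row in zip(observed, expected)
]

# Conditional frequencies of each modality within a class.
conditional = [
    [o / r for o in obs_row] for obs_row, r in zip(observed, row_totals)
]

for name, table in (("expected", expected), ("importance", importance),
                    ("conditional", conditional)):
    print(name, [[round(v, 2) for v in row] for row in table])
```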
Landmarks.
Example: initial messages for variable Competition
ID Class Variable Modality Type Sign Relev. Value
#1 1 Competition no VoI neg. high 6.6
#30 1 Competition strong VoI pos. normal 2.3
#47 2 Competition no VoI pos. normal 2.6
#48 2 Competition strong VoI neg. normal 2.1
#68 1 Competition no EF neg. - 0.0
#48: The prop. of shops with mod. “strong” in var. “Competition” is low in Class 2.
#68: None of shops have modality “no” in variable “Competition” in Class 1.
→ 67 relevant VoIs (38 highly rel.) and 13 EFs (3 pos., 10 neg.).
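The step from a message record to its initial sentence can be sketched with simple templates. The `Message` fields mirror the columns of the table above; the template wording for cases not shown on the slides (positive EF messages, highly relevant positive VoIs) is an assumption, as are all names in the code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    msg_id: int
    cls: int
    variable: str
    modality: str
    kind: str        # "VoI" or "EF"
    sign: str        # "pos" or "neg"
    relevance: str   # "normal", "high", or "-" for EF messages


# Degree word per (sign, relevance); the ("pos", "high") entry is assumed.
DEGREE = {
    ("neg", "normal"): "low",
    ("neg", "high"): "very low",
    ("pos", "normal"): "high",
    ("pos", "high"): "very high",
}


def realise(msg: Message) -> str:
    """Render the initial, non-naturalised sentence of a message."""
    if msg.kind == "EF":
        # The positive EF wording is an assumption; only the negative one is shown.
        subject = "None of shops have" if msg.sign == "neg" else "All shops have"
        return (f'{subject} modality "{msg.modality}" in variable '
                f'"{msg.variable}" in Class {msg.cls}.')
    degree = DEGREE[(msg.sign, msg.relevance)]
    return (f'The prop. of shops with mod. "{msg.modality}" in var. '
            f'"{msg.variable}" is {degree} in Class {msg.cls}.')


msg_48 = Message(48, 2, "Competition", "strong", "VoI", "neg", "normal")
msg_68 = Message(68, 1, "Competition", "no", "EF", "neg", "-")
print(realise(msg_48))
print(realise(msg_68))
```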
2. Data Interpretation
Objective: to discard redundant messages (type-A rules).
Example: rule A.2 detecting redundancy between messages #68 and #1 in
variable Competition
ID Class Variable Modality Type Sign Relev. Value
#1 1 Competition no VoI neg. high 6.6
#68 1 Competition no EF neg. - 0.0
#1: The prop. of shops with mod. “no” in var. “Competition” is very low in Class 1.
#68: None of shops have modality “no” in variable “Competition” in Class 1.
→ discarding message #1 and prioritising #68.
→ 28 redundant messages were discarded.
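A rule of this kind can be sketched as a filter over the message list. The matching key used here (same class, variable, modality and sign, with the EF message taking priority over the VoI message) is inferred from the example above and is an assumption, as are the names `Message` and `apply_rule_a2`.

```python
from typing import List, NamedTuple


class Message(NamedTuple):
    msg_id: int
    cls: int
    variable: str
    modality: str
    kind: str   # "VoI" or "EF"
    sign: str   # "pos" or "neg"


def apply_rule_a2(messages: List[Message]) -> List[Message]:
    """Drop VoI messages that restate an EF message on the same cell and sign."""
    ef_keys = {
        (m.cls, m.variable, m.modality, m.sign)
        for m in messages if m.kind == "EF"
    }
    return [
        m for m in messages
        if not (m.kind == "VoI"
                and (m.cls, m.variable, m.modality, m.sign) in ef_keys)
    ]


competition_messages = [
    Message(1, 1, "Competition", "no", "VoI", "neg"),
    Message(30, 1, "Competition", "strong", "VoI", "pos"),
    Message(68, 1, "Competition", "no", "EF", "neg"),
]
kept = apply_rule_a2(competition_messages)
print([m.msg_id for m in kept])
```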
3. Document Planning
Objectives: to merge related messages (type-B rules) and to
“naturalise” specific messages (type-C rules).
Example: rule B.2 detecting relations between messages #30 and #68 in variable Competition
ID    Class  Variable     Modality  Type  Sign  Relev.  Value
#30   1      Competition  strong    VoI   pos.  normal  2.3
#68   1      Competition  no        EF    neg.  -       0.0
#30: The prop. of shops with mod. “strong” in var. “Competition” is high in Class 1.
#68: None of shops have modality “no” in variable “Competition” in Class 1.
→ messages #30 and #68 will be merged in stage 4.
→ 45 out of 52 messages activated type-B rules.
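As a rough illustration of how related messages could be collected before merging, the sketch below groups messages that share class and variable. This grouping key is an assumption made for the example; the actual conditions of the type-B rules are not spelled out on this slide, and `group_related` is a hypothetical helper.

```python
from collections import defaultdict


def group_related(messages):
    """Group message ids by (class, variable) as candidates for merging."""
    groups = defaultdict(list)
    for m in messages:
        groups[(m["cls"], m["variable"])].append(m["id"])
    return dict(groups)


# Competition messages that survive stage 2 in the running example.
remaining = [
    {"id": 30, "cls": 1, "variable": "Competition"},
    {"id": 47, "cls": 2, "variable": "Competition"},
    {"id": 48, "cls": 2, "variable": "Competition"},
    {"id": 68, "cls": 1, "variable": "Competition"},
]
groups = group_related(remaining)
print(groups)
```

Under this assumption, messages #30 and #68 end up in the same group, which matches the pair related in the example above.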
Example: rule C.3 affecting messages of variable Competition
ID    Class  Variable     Modality  Type  Sign  Relev.  Value
#30   1      Competition  strong    VoI   pos.  normal  2.3
#68   1      Competition  no        EF    neg.  -       0.0
#30: The prop. of shops with mod. “strong” in var. “Competition” is high in Class 1.
→ #30-naturalised: The prop. of shops with a strong competition is high in Class 1.
#68: None of shops have Competition “no” in Class 1.
→ #68-naturalised: All shops have competition in Class 1.
4. Microplanning and Realisation
Objectives: final structure and transcription.
→ 29 groups of messages (sentences), each containing between 1 and 3 messages.
Example: construction of a sentence for variable Maintenance
Group #4 consisting of three messages affected by rules B.2, B.4 and C.4.
Standard:
#37: The prop. of shops with modality “good” in variable “maintenance” is high.
#70: None of shops has mod. “deficient” in var. “maintenance”.
#36: The prop. of shops with modality “regular” in variable “maintenance” is low.
Rule C.4:
#37: The proportion of shops with a good maintenance is high.
#70: None of shops has a deficient maintenance.
#36: The proportion of shops with a regular maintenance is low.
Rule B.2:
#70 & #36: None of shops has a deficient maintenance and the proportion of them with a regular maintenance is low.
Rule B.4:
#37 & (#70 & #36): The proportion of shops with a good maintenance is high. None of PoSs has a deficient maintenance and the proportion of them with a regular maintenance is low.
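The construction above can be mimicked with a few string templates. This is a minimal sketch and not the system's grammar: the templates are copied from the Maintenance example, and the function names (`voi_sentence`, `ef_sentence`, `merge`, `realise_group`) are hypothetical. For simplicity the sketch keeps the subject "shops" throughout, whereas the slide switches to "PoSs" in the final sentence.

```python
def voi_sentence(modality, variable, level, subject="shops"):
    """Naturalised VoI message, following the rule C.4 wording."""
    return f"The proportion of {subject} with a {modality} {variable} is {level}."


def ef_sentence(modality, variable, subject="shops"):
    """Naturalised negative EF message, following the rule C.4 wording."""
    return f"None of {subject} has a {modality} {variable}."


def merge(ef, voi, subject="shops"):
    """Join an EF sentence and a VoI sentence with 'and', as in rule B.2."""
    head = ef.rstrip(".")
    tail = voi.replace(f"The proportion of {subject}", "the proportion of them", 1)
    return f"{head} and {tail}"


def realise_group(lead, merged):
    """Place the leading sentence before the merged one, as in rule B.4."""
    return f"{lead} {merged}"


s37 = voi_sentence("good", "maintenance", "high")
s70 = ef_sentence("deficient", "maintenance")
s36 = voi_sentence("regular", "maintenance", "low")
merged = merge(s70, s36)
print(realise_group(s37, merged))
```

Applying naturalisation first and merging afterwards mirrors the order shown on the slide (C.4, then B.2, then B.4), so each step only has to handle sentences that are already in their final wording.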
Results
Qualitative description for Class 1
Class 1
=======
Almost all shops have a medium-sized display and a medium sensitivity to promotions.
The proportion of PoSs with thermal product display is very high.
All PoSs have competition and the proportion of them with a strong competition is high.
The proportion of shops with a good maintenance is high. None of PoSs has a deficient maintenance and the proportion of them with a regular maintenance is low.
None of PoSs has a deficient communication and the proportion of them with a good communication is very high.
None of PoSs has a deficient aesthetics and the proportion of them with a good aesthetics is very high.
The proportions of shops with a number of assistants greater than or equal to many are very high.
The proportions of stores with a size greater than or equal to medium are high.
The proportions of shops located in inner cities and no mountain towns are high.
The proportion of PoSs with a secondary Grifone weight is high while the proportion of them with a minimal Grifone weight is low.
Summary: strong competition, good qualities, medium/big stores, not in mountain towns, secondary but existing Grifone weight.
Conclusions
Complete MCDM system presented.
Main contributions:
1 Fuzzy criteria (Chapter 3).
2 Aggregation vs. sequential approach (Chapters 2 and 5).
3 NLG system (Chapter 4).
4 Application (Chapter 5).
Output of this Thesis
1 Journal papers:
Germán Sánchez-Hernández, Francisco Chiclana, Núria Agell, Juan Carlos Aguado (2013). Ranking and selection of unsupervised learning marketing segmentation. Knowledge-Based Systems, 44:20–33.
2 Conference proceedings:
Germán Sánchez, Mònica Casabayó, Albert Samà and Núria Agell (2008). Forecasting Customer’s Loyalty by Means of an Unsupervised Fuzzy Learning Method. Electronic proceedings of the 28th International Symposium on Forecasting, 43. Nice, 22-25 June 2008.
Germán Sánchez, Núria Agell, Juan Carlos Aguado, Mónica Sánchez and Francesc Prats (2007). Selection Criteria for Fuzzy Unsupervised Learning: Applied to Market Segmentation. In Foundations of Fuzzy Logic and Soft Computing (IFSA). Lecture Notes in Computer Science, 4529:307–310.
Cati Olmo, Germán Sánchez, Núria Agell, Mónica Sánchez and Francesc Prats (2007). Using Orders of Magnitude and Nominal Variables to Construct Fuzzy Partitions. Proceedings of the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 1–6. London, 23-26 July 2007.
3 National and international workshops:
Francisco J. Ruiz, Albert Samà, Germán Sánchez, José Antonio Sanabria and Núria Agell (2011). An interval technical indicator for financial time series forecasting. Proceedings of the 25th International Workshop on Qualitative Reasoning (QR).
Germán Sánchez, Albert Samà, Francisco J. Ruiz and Núria Agell (2010). Moving intervals for nonlinear time series forecasting. Proceedings of the 13th International Conference of the Catalan Association for Artificial Intelligence (CCIA).
José Antonio Sanabria, Germán Sánchez, Núria Agell and Josep Sayeras (2010). An application of SVMs to predict financial exchange rate by using sentiment indicators. Proceedings of the V Simposio de Teoría y Aplicaciones de Minería de Datos (TAMIDA).
Germán Sánchez, Juan Carlos Aguado, Núria Agell, Mónica Sánchez (2009). Automatic Comparison and Selection of Classifications in Unsupervised Learning Processes. XI Jornadas de ARCA Sistemas Cualitativos, Diagnosis, Robótica, Sistemas Domóticos y Computación Ubicua (JARCA). Almuñécar (Granada), 24-26 June 2009.
Germán Sánchez, Juan Carlos Aguado and Núria Agell (2007). Forecasting New Customers’ Behaviour by Means of a Fuzzy Unsupervised Method. Artificial Intelligence Research and Development, Frontiers in Artificial Intelligence and Applications. Proceedings of the 10th CCIA, 163:368–375. Andorra, 25-26 October 2007. ISBN: 978-1-58603-798.
Future Research
Criteria: improvements.
Aggregation: study of other OWA operators and linguistic quantifiers.
Qualitative description: ontology, grammar.
Application: to assess unsupervised techniques, ensemble of techniques.
A contribution to ranking and description of classifications

  • 1. A contribution to ranking and description of classifications PhD examination Germ´an S´anchez-Hern´andez Dept. of Enginyeria de Sistemes, Autom`atica i Inform`atica Industrial (ESAII) Universitat Polit`ecnica de Catalunya – BarcelonaTech (UPC) Av. Diagonal 647, 08034 Barcelona ESADE Business School, Universitat Ramon Llull (URL) Avda. Torreblanca 59, 08172 Sant Cugat del Vall`es german.sanchez@esade.edu Co-Advisors: Juan Carlos Aguado Chao & N´uria Agell Jan´e September 13th, 2013
  • 2. Introduction Literature review Fuzzy criteria for selecting classifications NL-based automatic qualitative description of clusters Application to market segmentation Conclusions Outline 1 Introduction 2 Literature review 3 Fuzzy criteria for selecting classifications 4 NL-based automatic qualitative description of clusters 5 Application to market segmentation 6 Conclusions 2 / 51 Germ´an S´anchez-Hern´andez Ranking and description of classifications
  • 3. Introduction Literature review Fuzzy criteria for selecting classifications NL-based automatic qualitative description of clusters Application to market segmentation Conclusions Motivation and framework Objectives Theoretical background Outline 1 Introduction 2 Literature review 3 Fuzzy criteria for selecting classifications 4 NL-based automatic qualitative description of clusters 5 Application to market segmentation 6 Conclusions 3 / 51 Germ´an S´anchez-Hern´andez Ranking and description of classifications
  • 4. Introduction Literature review Fuzzy criteria for selecting classifications NL-based automatic qualitative description of clusters Application to market segmentation Conclusions Motivation and framework Objectives Theoretical background Motivation and framework Motivation in Marketing for developing a novel and complete fuzzy MCDM1 methodology: 1. Generation of segmentations Unsupervised learning. 2. Rank and selection of segmentations Criteria to assess segmentations. Aggregation of the assessments. 3. Description of the best segmentation Most important features Natural language 1 Multi-Criteria Decision Making 4 / 51 Germ´an S´anchez-Hern´andez Ranking and description of classifications
  • 5. Introduction Literature review Fuzzy criteria for selecting classifications NL-based automatic qualitative description of clusters Application to market segmentation Conclusions Motivation and framework Objectives Theoretical background Objectives 1 Generation of classifications (Theoretical background). 2 Evaluation by fuzzy validation criteria (Chapter 3). 3 Rank by aggregating the assessments (Chapter 2). 4 NLG2 system (Chapter 4). 5 Application (Chapter 5). 2 Natural Language Generation 5 / 51 Germ´an S´anchez-Hern´andez Ranking and description of classifications
  • 6. Overview
  • 7. Outline, section 2 (Literature review): Criteria for selecting classifications; Aggregation functions; Data-to-text systems.
  • 8. Criteria for selecting classifications. Alternatives for selecting classifications: [Kukar, 2003; Osei-Bryson, 2010]: classical sequential approach; [Halkidi et al., 2002]: translates the problem into a search problem; [Broder et al., 2008]: new classification by voting. Types of validation criteria [Jain et al., 1999; Theodoridis and Koutroumbas, 2008]: internal criteria, which analyse the internal structure (compactness, separability, prediction strength...) [Liu et al., 2010]; external criteria, which compare with an external structure (Rand, Jaccard, Fowlkes-Mallows, Minkowski indexes...) [Wu et al., 2009]; relative criteria, based on manual pairwise comparisons (avoided) [Jain et al., 1999].
  • 9. Summary: validation criteria. Columns: Paper; Comments; Internal criteria (Compact.; Separab.; Accuracy; Feat.; Goals); Extern. criteria; Relat. criteria.
    Ramze Rezaee et al. (1998); One index for fuzzy c-Means; Yes: compact.; Yes: separat.; No; No; No; No; No
    Cheng et al. (1999); Subspace clustering; Yes: high density; No; No; No; Yes: cover. & correl. of dim's; No; No
    Kukar (2003); Reliability on diagn.; No; No; Yes: reliability; No; No; No; No
    Halkidi et al. (2001); Review; Yes: several; No; No; No; Yes: several; Yes: several
    Choi et al. (2005); Association rules; No; No; No; No; Yes: R-F-MV; No; No
    Tibshirani & Walther (2005); Valid. by prediction strength; Yes: variance; Yes: bias; Yes: prediction strength; No; No; No; No
    Yatskiv & Gusarova (2005); Review; No; No; No; No; No; Yes: several; Yes: several
    Method presented; Review & application; Yes: IC (coherence); Yes: IA (accuracy); No; Yes: IU & IB (useful. & balanced); Yes: ID (dep.); No
  • 10. Summary: validation criteria (continued). Columns: Paper; Comments; Internal criteria (Compact.; Separab.; Accuracy; Feat.; Goals); Extern. criteria; Relat. criteria.
    Bittmann & Gelbard (2009); Visualisation of hierarch. clustering; Yes: min heterog.; No; No; No; Yes: visualis.; No; No
    Wang et al. (2009); Clinical application; Yes: Davies-Bouldin & rel.-free; No; No; No; No; No
    Wu et al. (2009); External criteria for k-Means; No; No; No; No; No; Yes: several; No
    Xiong et al. (2009); k-Means; Yes: Sum of Sq. Errors; Yes: entropy and CV; Yes: F-measure; No; No; No; No
    Liu et al. (2010); Internal criteria review; Yes: several; Yes: several; No; No; No; No; No
    Osei-Bryson (2010); Review; No; No; Yes: accuracy; Yes: # of imp. vars; Yes: outliers, Max/Min; No; Yes: several
    Method presented; Review & application; Yes: IC (coherence); Yes: IA (accuracy); No; Yes: IU & IB (useful. & balanc.); Yes: ID (dep.); No
  • 11. Aggregation functions. Two steps in MCDM [Fodor and Roubens, 1994]: (1) aggregation of the single evaluations; (2) exploitation by generating a ranking of the alternatives. Many different families of aggregation functions [Chiclana et al., 2004, 2007; Dubois and Prade, 1985; Fodor and Roubens, 1994; Herrera et al., 2003; Klir and Folger, 1988; Torra, 1997; Torra and Narukawa, 2007; Xu and Da, 2003; Yager, 1988; Zhou et al., 2008]. OWA operator [Yager, 1988]: $W(a_1, \dots, a_n) = \sum_{i=1}^{n} w_i a_{\sigma(i)}$, with weights $w_i$ obtained via a linguistic quantifier [Zadeh, 1983]: $w_i = Q\left(\frac{i}{n}\right) - Q\left(\frac{i-1}{n}\right)$, where $Q$ is a RIM quantifier: $Q(r) = r^{\alpha}$.
  • 12. Data-to-text systems. Rows grouped by application area; columns: Paper; System; Input data; Users; Rules ("-" marks an empty cell).
    Weather forecasting:
    Goldberg et al. (1994); FoG; Time series; Forecasters; Yes
    Reiter et al. (2005); Forecasting texts; Time series; Forecasters; No
    Sripada et al. (2003); SumTime-Mousam; Time series; Forecasters; Yes
    Medicine:
    Cawsey et al. (2000); Piglit; Events; Patients; Yes
    Hallett and Scott (2005); Summaries of events; List of events; Medical staff & patients; Yes
    Harris (2008); Narrative Engine: text summaries; Events; Medical staff; No
    Hüske-Kraus (2003b); Review of applications; Raw data; Medical staff; No
    Hüske-Kraus (2003a); Suregen-2: Routine reports; -; Medical staff; No
    Kahn et al. (1991); Topaz; -; -; Yes
    Portet et al. (2009); BT-45: Neonatal summaries; Raw data from sensors; Medical staff & patients; Yes
    Reiter et al. (2003); Stop: personalised reports; Manual input; Patients; Yes
    Generic:
    Method presented (2013); Description of groups; Tabular data; Generic; Yes
  • 13. Summary: data-to-text systems (continued). Rows grouped by application area; columns: Paper; System; Input data; Users; Rules ("-" marks an empty cell).
    Accessibility:
    Ferres et al. (2006); iGraph; Graphical data; Visually-imp.; Yes
    Thomas & Sripada (2008); Atlas.txt; Geo-referenced data; Visually-imp.; No
    Financial:
    Kukich (1983); Ana: textual stock market; Time series; Stock marketers; Yes
    Image:
    Hammond and Davis (2005); Ladder; Sketches; -; Yes
    Herzog and Wazinski (1994); Vitru; Visual scenes; -; No
    Roy (2002); Describer; Visual scenes; -; Yes
    Sports:
    Sripada and Gao (2007); ScubaText; -; Scuba divers; No
    Generic:
    Iordanskaja et al. (1992); Summaries; Statistical data; -; Yes
    McKeown et al. (1994); PLANDoc; List of events; -; No
    Method presented (2013); Description of groups; Tabular data; Generic; Yes
  • 14. Outline, section 3 (Fuzzy criteria for selecting classifications): Motivation; First criterion: useful number of classes; Second criterion: balanced classes; Third criterion: coherent classification; Fourth criterion: dependency on external variables; Fifth criterion: accuracy of the predictive model.
  • 15. Motivation
  • 16. First criterion: useful number of classes. Objective: a sufficient but small enough number of clusters. Definition: given a classification $C$, the index of usefulness is characterised by the following membership function: $I_{U,K_1,K_2}(C) = \begin{cases} f_1(M), & \text{if } 1 \le M < K_1; \\ 1, & \text{if } K_1 \le M \le K_2; \\ f_2(M), & \text{if } K_2 < M \le N, \end{cases}$ (1) where $M \in \mathbb{N}$ is the number of classes of $C$; $K_1, K_2 \in \mathbb{N}$, with $K_1 < K_2$, are two prefixed parameters; $f_1$ is a strictly increasing function and $f_2$ is a strictly decreasing function, verifying $f_1(1) = f_2(N) = 0$ and $f_1(K_1) = f_2(K_2) = 1$.
  • 17. First criterion: examples. Examples of functions for $K_1 = 4$ and $K_2 = 7$.
  • 18. Second criterion: balanced classes. Objective: to avoid (or boost) unbalanced classifications. Definition: given a classification $C$, the index of balanced classes of $C$ is defined as $I_B(C) = \frac{\max_Y(CV_Y) - CV_C}{\max_Y(CV_Y) - \min_Y(CV_Y)}$, (2) where $CV_C$ is the coefficient of variation associated with $C$, and $\min_Y(CV_Y)$ and $\max_Y(CV_Y)$ are given by Propositions 3.1 and 3.3, respectively. If unbalanced classes are required, the index $1 - I_B(C)$ is used instead.
  • 19. Second criterion: example. Example of $I_B$ for two classifications: let $C_1$ and $C_2$ be the following two different classifications of the same data set consisting of $N = 260$ individuals: $Y_1 = \{90, 80, 90\}$; $Y_2 = \{110, 30, 20, 90\}$. $CV_{Y_1} = 0.51$; $CV_{Y_2} = 4.43$. $\min(CV_Y) = 0$; $\max(CV_Y) = 6.18$. $I_B(C_1) = 0.92$; $I_B(C_2) = 0.28$.
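As a quick check of this example, the two reported index values follow from plugging the reported coefficients of variation and bounds into equation (2). The coefficients themselves are taken from the slide as given; how they are computed is not reproduced here.

```latex
I_B(C_1) = \frac{6.18 - 0.51}{6.18 - 0} \approx 0.92, \qquad
I_B(C_2) = \frac{6.18 - 4.43}{6.18 - 0} \approx 0.28.
```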
  • 20. Third criterion: coherent classification. Objective: to ensure that the Global Adequacy Degrees (GAD) are obtained from similar values of Marginal Adequacy Degrees (MAD). Definition: the index of coherence of a classification is given by $I_C(C) = 1 - \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} \left[\max_k(\mu_{ijk}) - \min_k(\mu_{ijk})\right]}{M \cdot N}$, (3) where $N \in \mathbb{N}$ is the number of individuals, $M \in \mathbb{N}$ is the number of classes of classification $C$, and $\mu_{ijk}$ is the MAD of individual $j$ to class $i$ according to descriptor $k$.
  • 21. Third criterion: example. Examples of $I_C$ for two classifications: consider two classifications $C_1$ and $C_2$ consisting of two classes (A and B) and three classes (C, D and E), respectively. The MADs of individuals i1, i2, i3 for descriptors (d1, d2, d3) are:
    Class A: i1 = (0.3, 0.4, 0.6); i2 = (0.2, 0.3, 0.2); i3 = (0.5, 0.5, 0.3)
    Class B: i1 = (0.7, 0.5, 0.5); i2 = (0.4, 0.8, 0.5); i3 = (0.9, 0.6, 0.7)
    Class C: i1 = (0.1, 0.6, 0.4); i2 = (0.7, 0.2, 0.3); i3 = (0.4, 0.8, 0.3)
    Class D: i1 = (0.1, 0.7, 0.3); i2 = (0.5, 0.2, 0.7); i3 = (0.9, 0.3, 0.4)
    Class E: i1 = (0.9, 0.2, 0.7); i2 = (0.6, 0.9, 0.6); i3 = (0.7, 0.6, 0.1)
    The MADs of $C_1$ are more homogeneous → $I_C(C_1) = 0.75 > I_C(C_2) = 0.47$.
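The coherence example can be recomputed with a few lines of Python. This is a minimal sketch, assuming the maximum and minimum in equation (3) are taken over the descriptors of each individual; the function name `coherence_index` and the nested-list layout are illustrative choices, not part of the original work.

```python
def coherence_index(mads):
    """Index of coherence I_C.

    mads[i][j] holds the MADs of individual j to class i, one value per
    descriptor (hypothetical layout chosen for this sketch).
    """
    n_classes = len(mads)
    n_individuals = len(mads[0])
    # Sum, over all classes and individuals, of the spread between the
    # largest and smallest marginal adequacy degree.
    spread = sum(max(row) - min(row) for cls in mads for row in cls)
    return 1 - spread / (n_classes * n_individuals)


# MADs copied from the slide.
c1 = [
    [[0.3, 0.4, 0.6], [0.2, 0.3, 0.2], [0.5, 0.5, 0.3]],  # Class A
    [[0.7, 0.5, 0.5], [0.4, 0.8, 0.5], [0.9, 0.6, 0.7]],  # Class B
]
c2 = [
    [[0.1, 0.6, 0.4], [0.7, 0.2, 0.3], [0.4, 0.8, 0.3]],  # Class C
    [[0.1, 0.7, 0.3], [0.5, 0.2, 0.7], [0.9, 0.3, 0.4]],  # Class D
    [[0.9, 0.2, 0.7], [0.6, 0.9, 0.6], [0.7, 0.6, 0.1]],  # Class E
]

ic_c1 = coherence_index(c1)
ic_c2 = coherence_index(c2)
print(round(ic_c1, 2), round(ic_c2, 2))
```

Under that reading of the formula, the rounded results agree with the values 0.75 and 0.47 stated on the slide.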
  • 22. Fourth criterion: dependency on external variables. Objective: a strong relation with an external variable. Definition: given a classification $C$, its index of dependency on a control variable is defined as $I_D(C) = \frac{\chi^2}{N \cdot \sqrt{M - 1} \cdot \sqrt{S - 1}}$, (4) where $N$ is the number of individuals, $M$ is the number of classes of $C$ and $S$ is the number of unique values of the control variable if it is qualitative, or the number of intervals considered in the discretisation if it is quantitative.
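A small sketch may help to see how such a dependency index behaves. It assumes that the numerator of equation (4) is the Pearson chi-square statistic of the contingency table between classes (rows) and values of the control variable (columns); the function names and the two toy tables are hypothetical and only serve as illustration.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic of a contingency table (list of rows).

    Assumes every row and every column has a non-zero total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def dependency_index(table):
    """I_D = chi2 / (N * sqrt(M - 1) * sqrt(S - 1)); needs M >= 2 and S >= 2."""
    n = sum(sum(row) for row in table)
    m = len(table)      # number of classes
    s = len(table[0])   # number of values of the control variable
    return chi_square(table) / (n * math.sqrt(m - 1) * math.sqrt(s - 1))


# Toy tables (not from the slides): full dependency and full independence.
dependent = dependency_index([[10, 0], [0, 10]])
independent = dependency_index([[5, 5], [5, 5]])
print(dependent, independent)
```

With this normalisation the index ranges from 0 (classes independent of the control variable) to 1 (classes fully determined by it) for square tables.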
  • 23. Fifth criterion: accuracy of the predictive model. Objective: high predictability. Definition: given a classification $C$, its index of accuracy is defined as $I_A(C) = \frac{2 \cdot \mathrm{precision}(C) \cdot \mathrm{recall}(C)}{\mathrm{precision}(C) + \mathrm{recall}(C)}$, (5) where $\mathrm{precision}(C)$ and $\mathrm{recall}(C)$ are the weighted averages of the precision and recall of the classes of $C$, respectively.
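The following sketch computes this index from a confusion matrix. The slide does not say which weights are used in the weighted averages, so the code assumes each class is weighted by its share of true instances; the function name `accuracy_index` and the 2x2 matrix are made up for illustration.

```python
def accuracy_index(confusion):
    """F-measure of weighted-average precision and recall.

    confusion[t][p] counts individuals of true class t predicted as class p.
    Assumption: classes are weighted by their number of true instances.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    precision = 0.0
    recall = 0.0
    for c in range(n_classes):
        true_pos = confusion[c][c]
        actual = sum(confusion[c])
        predicted = sum(confusion[r][c] for r in range(n_classes))
        weight = actual / total
        precision += weight * (true_pos / predicted if predicted else 0.0)
        recall += weight * (true_pos / actual if actual else 0.0)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical two-class confusion matrix.
ia = accuracy_index([[8, 2], [1, 9]])
print(round(ia, 4))
```

In the application reported later in the slides, the predictions come from an SVM evaluated with repeated cross-validation; any classifier that yields a confusion matrix could feed this function.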
  • 24. Outline, section 4 (NL-based automatic qualitative description of clusters): Motivation; Architecture; Signal Analysis; Data Interpretation; Document Planning; Microplanning and Realisation.
  • 25. Motivation. Interpretation through a natural-language description. NLG system based on a four-stage architecture [Reiter, 2007]: (1) Signal analysis: basic patterns. (2) Data interpretation: messages and relations. (3) Document planning: message selection. (4) Microplanning and realisation: text generation.
  • 26. Architecture
  • 27. 1. Signal Analysis. Objectives: to select variables and to identify the important values.
  • 28. 2. Data Interpretation. Objective: to avoid redundant information (type-A rules). (1) Discarding negative messages. (2) Discarding messages obtained from VoIs. (3) Discarding modalities. (4) Discarding variables (same sign). (5) Discarding variables (different sign). Rule table (columns: Class, Variable, Modality, Type, Sign): A.1: Yes, Yes, Yes; A.2: Yes, Yes, Yes, Yes; A.3: Yes, Yes; A.4: Yes, Yes, Yes, Yes; A.5: Yes, Yes, Yes.
  • 29. 3. Document Planning. Objectives: to identify relations and modifications. Type-B rules: (1) Merging modalities of an ordinal variable. (2) Merging modalities. (3) Merging variables with the same modalities. (4) Adding single modalities. (5) Merging single messages. Type-C rules: (1) Use of the semantics of modality “no”. (2) Use of the semantics of modality “yes”. (3) Use of the semantics of variables. (4) Modalities as adjectives. (5) Use of linguistic quantifiers.
  • 30. 4. Microplanning and Realisation. Objectives: final structure and transcription. (1) Microplanning: sorts groups of messages by importance. (2) Realisation: transcription.
  • 31. Outline, section 5 (Application to market segmentation): Case Presentation; Dataset; Obtaining segmentations; Ranking and selecting segmentations; Qualitative description.
  • 32. Case Presentation. Objective: to apply the reviewed and developed methodology to a real marketing problem in a B2B environment. Challenge: to understand fluctuations in the orders placed by the shops (limited resources). Objective: to segment the set of retailers. Actions: (1) Unsupervised learning process. (2) Selection of segmentations. (3) Natural language description.
  • 33. Dataset. Grifone: outdoor sporting equipment firm. Firm: Textil Seu, SA. La Seu d’Urgell (north of Lleida). More than 25 years. 260 points of sale. 16 variables: Antiquity, Assistants, Evaluation, DisplayGrifone, Location, Internet, ThermalExhibitor, Specialists, Aesthetics, Communication, Competition, DisplaySize, GrifoneWeight, Maintenance, PromosSensit, Size.
  • 34. Obtaining segmentations. Unsupervised learning technique: LAMDA (Learning Algorithm for Multivariate Data Analysis) [Aguado, 1998; Aguado et al., 1999; Aguilar and López de Mántaras, 1982]. Tolerance. Hybrid connectives (MinMax, Probabilistic Product, Lukasiewicz, Frank n-norms). Density functions. PromosSensit as external variable. → 566 segmentations (between 1 and 244 classes).
  • 35. Criteria. (1) Usefulness: $K_1 = 3$ and $K_2 = 5$; linear-exponential functions $f_1(M) = \frac{M - 1}{3 - 1}$ and $f_2(M) = \frac{e^{260 - M} - 1}{e^{260 - 5} - 1}$. (2) Balanced: unbalanced classifications are avoided → $I_B(C) = \frac{\max_Y(CV_Y) - CV_C}{\max_Y(CV_Y) - \min_Y(CV_Y)}$.
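The linear-exponential usefulness index chosen for the case study can be sketched as follows. The function name `usefulness_index` and its default arguments are illustrative; the defaults simply encode the values on the slide ($K_1 = 3$, $K_2 = 5$, $N = 260$).

```python
import math


def usefulness_index(m, k1=3, k2=5, n=260):
    """Index of usefulness with a linear f1 and an exponential f2.

    m is the number of classes; k1, k2 and n default to the case-study values.
    """
    if not 1 <= m <= n:
        raise ValueError("the number of classes must lie between 1 and n")
    if m < k1:
        # f1: linear increase from 0 at m = 1 to 1 at m = k1.
        return (m - 1) / (k1 - 1)
    if m <= k2:
        return 1.0
    # f2: exponential decrease from 1 at m = k2 to 0 at m = n.
    # math.expm1(x) computes e**x - 1; it overflows for x above roughly 709.
    return math.expm1(n - m) / math.expm1(n - k2)


for classes in (1, 2, 3, 5, 6, 260):
    print(classes, usefulness_index(classes))
```

With these settings the index drops quickly once the number of classes exceeds $K_2$: six classes already score about $e^{-1}$.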
  • 36. Criteria (continued). (3) Coherence: density functions of the quantitative variables: Antiquity: Waissman; Assistants: Classical; Evaluation: Gaussian. (4) Dependency: chosen variable: PromosSensit. (5) Accuracy: SVM, 10-fold cross-validation (30 times).
  • 37. Aggregation of the indexes. OWA operator. Linguistic quantifier: most of. RIM function: $Q(r) = r^{1/2}$. Computed weights: (0.447, 0.185, 0.142, 0.120, 0.106).
  • 38. Results. Top ten segmentations (criterion assessments IU, IB, IC, ID, IA and aggregated OWA score):
    Rank  Segm. ID  Conn.   Toler.  Classes M  IU  IB     IC     ID     IA     OWA
    1     #259      Minmax  0.439   3          1   0.928  0.251  0.528  0.936  0.8423
    2     #260      Minmax  0.469   3          1   0.929  0.254  0.363  0.922  0.8207
    3     #258      Minmax  0.422   4          1   0.885  0.226  0.425  0.875  0.8103
    4     #243      Minmax  0.290   3          1   0.909  0.256  0.008  0.965  0.7868
    5     #244      Minmax  0.304   3          1   0.920  0.255  0.012  0.948  0.7855
    6     #257      Minmax  0.411   3          1   0.933  0.279  0.032  0.856  0.7786
    7     #256      Minmax  0.400   4          1   0.872  0.246  0.064  0.906  0.7751
    8     #253      Minmax  0.359   4          1   0.883  0.231  0.021  0.915  0.7722
    9     #252      Minmax  0.352   4          1   0.884  0.237  0.025  0.904  0.7712
    10    #255      Minmax  0.393   4          1   0.939  0.263  0.063  0.768  0.7686
    Segmentation #259: Class 1: 35 shops; Class 2: 98 shops; Class 3: 127 shops.
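The OWA column can be approximately reproduced from the five criterion values of a segmentation. The sketch below assumes the standard OWA convention of sorting the values in descending order before weighting, and derives the weights from the RIM quantifier $Q(r) = r^{1/2}$ through $w_i = Q(i/n) - Q((i-1)/n)$; the function names are illustrative.

```python
def rim_weights(n, alpha):
    """OWA weights w_i = Q(i/n) - Q((i-1)/n) for the RIM quantifier Q(r) = r**alpha."""
    return [(i / n) ** alpha - ((i - 1) / n) ** alpha for i in range(1, n + 1)]


def owa(values, weights):
    """Ordered weighted average: weights apply to the values sorted descending."""
    ordered = sorted(values, reverse=True)
    return sum(w * a for w, a in zip(weights, ordered))


weights = rim_weights(5, 0.5)
# Criterion assessments (IU, IB, IC, ID, IA) of segmentation #259.
score_259 = owa([1, 0.928, 0.251, 0.528, 0.936], weights)
print([round(w, 3) for w in weights], round(score_259, 4))
```

Because the largest weight goes to the best-scoring criterion, a segmentation is rewarded mainly for its strongest assessments, while the weakest ones still contribute.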
  • 39. 1. Signal Analysis. Objectives: to select variables and to identify important values. Variables Antiquity, Assistants and Evaluation discretised (CAIM [Kurgan and Cios, 2004]). Intervals (P1; P2; P3): Antiquity (years): less than two; three; more than three. Assistants: few; many; a lot of. Evaluation: bad; good; excellent. Variables Antiquity, Specialists and Internet discarded.
  • 40. 1. Signal Analysis. Detection of relevant VoIs and EFs. Example: tables for variable Competition (rows: classes 1 to 3; columns: No, Weak, Strong).
    Contingency table: 1: 0, 13, 21; 2: 26, 38, 34; 3: 24, 42, 60.
    Expected frequencies: 1: 6.6, 12.3, 15.2; 2: 19.0, 35.3, 43.7; 3: 24.4, 45.4, 56.2.
    Values of importance: 1: -6.66, 0.05, 2.25; 2: 2.59, 0.20, -2.15; 3: -0.01, -0.26, 0.26.
    Conditional frequencies: 1: 0.00, 0.38, 0.62; 2: 0.27, 0.39, 0.35; 3: 0.19, 0.33, 0.48.
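The derived tables for Competition can be recomputed from the contingency table alone. In the sketch below, expected frequencies are the usual row total times column total over the grand total, and conditional frequencies are row-wise proportions. The values of importance are treated as signed chi-square contributions, $(O - E)^2 / E$ with the sign of $O - E$; that reading is an inference from the numbers on the slide, not a definition given there.

```python
import math

# Contingency table for Competition: rows are classes 1-3,
# columns are the modalities No, Weak, Strong.
observed = [[0, 13, 21], [26, 38, 34], [24, 42, 60]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency under independence of class and modality.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Signed chi-square contribution of each cell (assumed meaning of "VoI").
signed = [
    [math.copysign((o - e) ** 2 / e, o - e) for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# Proportion of each modality within each class.
conditional = [[o / r for o in row] for row, r in zip(observed, row_totals)]

for name, table in (("expected", expected), ("signed", signed),
                    ("conditional", conditional)):
    print(name, [[round(x, 2) for x in row] for row in table])
```

Large absolute contributions flag class and modality pairs that deviate most from independence, which is what the later messages with high relevance report.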
  • 41. 1. Signal Analysis. Landmarks. Example: initial messages for variable Competition (columns: ID; Class; Variable; Modality; Type; Sign; Relev.; Value).
    #1; 1; Competition; no; VoI; neg.; high; 6.6
    #30; 1; Competition; strong; VoI; pos.; normal; 2.3
    #47; 2; Competition; no; VoI; pos.; normal; 2.6
    #48; 2; Competition; strong; VoI; neg.; normal; 2.1
    #68; 1; Competition; no; EF; neg.; -; 0.0
    #48: The prop. of shops with mod. “strong” in var. “Competition” is low in Class 2.
    #68: None of shops have modality “no” in variable “Competition” in Class 1.
    → 67 relevant VoIs (38 highly rel.) and 13 EFs (3 pos., 10 neg.).
  • 42. 2. Data Interpretation. Objective: to discard redundant messages (type-A rules). Example: rule A.2 detecting redundancy between messages #68 and #1 in variable Competition (columns: ID; Class; Variable; Modality; Type; Sign; Relev.; Value).
    #1; 1; Competition; no; VoI; neg.; high; 6.6
    #68; 1; Competition; no; EF; neg.; -; 0.0
    #1: The prop. of shops with mod. “no” in var. “Competition” is very low in Class 1.
    #68: None of shops have modality “no” in variable “Competition” in Class 1.
    → discarding message #1 and prioritising #68.
    → 28 redundant messages were discarded.
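A redundancy rule of this kind can be sketched in a few lines. The sketch assumes, based only on the example, that rule A.2 treats a VoI message as redundant when an EF message exists for the same class, variable, modality and sign, and that the EF message is the one kept. The `Message` class and `apply_rule_a2` function are hypothetical names, not the system's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    ident: int
    cls: int
    variable: str
    modality: str
    kind: str  # "VoI" or "EF"
    sign: str  # "pos" or "neg"


def apply_rule_a2(messages):
    """Drop VoI messages duplicated by an EF message (assumed reading of A.2)."""
    ef_keys = {
        (m.cls, m.variable, m.modality, m.sign)
        for m in messages
        if m.kind == "EF"
    }
    return [
        m for m in messages
        if m.kind == "EF"
        or (m.cls, m.variable, m.modality, m.sign) not in ef_keys
    ]


messages = [
    Message(1, 1, "Competition", "no", "VoI", "neg"),
    Message(30, 1, "Competition", "strong", "VoI", "pos"),
    Message(68, 1, "Competition", "no", "EF", "neg"),
]
kept = apply_rule_a2(messages)
print([m.ident for m in kept])
```

On the three Class 1 messages from the slides, message #1 is dropped in favour of #68, while #30 is untouched because it concerns a different modality.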
  • 43. 3. Document Planning. Objectives: to merge related messages (type-B rules) and to “naturalise” specific messages (type-C rules). Example: rule B.2 detecting relations between messages #30 and #68 in variable Competition (columns: ID; Class; Variable; Modality; Type; Sign; Relev.; Value).
    #30; 1; Competition; strong; VoI; pos.; normal; 2.3
    #68; 1; Competition; no; EF; neg.; -; 0.0
    #30: The prop. of shops with mod. “strong” in var. “Competition” is high in Class 1.
    #68: None of shops have modality “no” in variable “Competition” in Class 1.
    → the merge of messages #30 and #68 will be done in stage 4.
    → 45 out of 52 messages activated type-B rules.
  • 44. 3. Document Planning. Objectives: to merge related messages (type-B rules) and to “naturalise” specific messages (type-C rules). Example: rule C.3 affecting messages of variable Competition (columns: ID; Class; Variable; Modality; Type; Sign; Relev.; Value).
    #30; 1; Competition; strong; VoI; pos.; normal; 2.3
    #68; 1; Competition; no; EF; neg.; -; 0.0
    #30: The prop. of shops with mod. “strong” in var. “Competition” is high in Class 1. → #30-naturalised: The prop. of shops with a strong competition is high in Class 1.
    #68: None of shops have Competition “no” in Class 1. → #68-naturalised: All shops have competition in Class 1.
  • 45. 4. Microplanning and Realisation. Objectives: final structure and transcription. → 29 groups of messages (sentences), between 1 and 3 messages each. Example: construction of a sentence for variable Maintenance. Group #4 consists of three messages affected by rules B.2, B.4 and C.4.
    Standard: #37: The prop. of shops with modality “good” in variable “maintenance” is high. #70: None of shops has mod. “deficient” in var. “maintenance”. #36: The prop. of shops with modality “regular” in variable “maintenance” is low.
    Rule C.4: #37: The proportion of shops with a good maintenance is high. #70: None of shops has a deficient maintenance. #36: The proportion of shops with a regular maintenance is low.
    Rule B.2: #70 & #36: None of shops have a deficient maintenance and the proportion of them with a regular maintenance is low.
    Rule B.4: #37 & (#70 & #36): The proportion of shops with a good maintenance is high. None of PoSs has a deficient maintenance and the proportion of them with a regular maintenance is low.
  • 46. Results
      Qualitative description for Class 1

      Class 1
      =======
      Almost all shops have a medium-sized display and a medium sensitivity to promotions.
      The proportion of PoSs with thermal product display is very high.
      All PoSs have competition and the proportion of them with a strong competition is high.
      The proportion of shops with a good maintenance is high. None of PoSs has a deficient maintenance and the proportion of them with a regular maintenance is low.
      None of PoSs has a deficient communication and the proportion of them with a good communication is very high.
      None of PoSs has a deficient aesthetics and the proportion of them with a good aesthetics is very high.
      The proportions of shops with a number of assistants greater than or equal to many are very high.
      The proportions of stores with a size greater than or equal to medium are high.
      The proportions of shops located in inner cities and no mountain towns are high.
      The proportion of PoSs with a secondary Grifone weight is high while the proportion of them with a minimal Grifone weight is low.

      Summary: strong competition, good qualities, medium/big stores, not in mountain towns, secondary but existing Grifone weight.
  • 48. Conclusions
      Complete MCDM system presented.
      Main contributions:
      1 Fuzzy criteria (Chapter 3).
      2 Aggregation vs. sequential approach (Chapters 2 and 5).
      3 NLG system (Chapter 4).
      4 Application (Chapter 5).
  • 49. Output of this Thesis
      1 Journal papers:
        Germán Sánchez-Hernández, Francisco Chiclana, Núria Agell, Juan Carlos Aguado (2013). Ranking and selection of unsupervised learning marketing segmentation. Knowledge-Based Systems, 44:20–33.
      2 Conference proceedings:
        Germán Sánchez, Mònica Casabayó, Albert Samà and Núria Agell (2008). Forecasting Customer’s Loyalty by Means of an Unsupervised Fuzzy Learning Method. Electronic proceedings of the 28th International Symposium on Forecasting, 43. Nice, 22-25 June 2008.
        Germán Sánchez, Núria Agell, Juan Carlos Aguado, Mónica Sánchez and Francesc Prats (2007). Selection Criteria for Fuzzy Unsupervised Learning: Applied to Market Segmentation. In Foundations of Fuzzy Logic and Soft Computing (IFSA). Lecture Notes in Computer Science, 4529:307–310.
        Cati Olmo, Germán Sánchez, Núria Agell, Mónica Sánchez and Francesc Prats (2007). Using Orders of Magnitude and Nominal Variables to Construct Fuzzy Partitions. Proceedings of the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 1–6. London, 23-26 July 2007.
  • 50. Output of this Thesis
      3 National and international workshops:
        Francisco J. Ruiz, Albert Samà, Germán Sánchez, José Antonio Sanabria and Núria Agell (2011). An interval technical indicator for financial time series forecasting. Proceedings of the 25th International Workshop on Qualitative Reasoning (QR).
        Germán Sánchez, Albert Samà, Francisco J. Ruiz and Núria Agell (2010). Moving intervals for nonlinear time series forecasting. Proceedings of the 13th International Conference of the Catalan Association for Artificial Intelligence (CCIA).
        José Antonio Sanabria, Germán Sánchez, Núria Agell and Josep Sayeras (2010). An application of SVMs to predict financial exchange rate by using sentiment indicators. Proceedings of the V Simposio de Teoría y Aplicaciones de Minería de Datos (TAMIDA).
        Germán Sánchez, Juan Carlos Aguado, Núria Agell, Mónica Sánchez (2009). Automatic Comparison and Selection of Classifications in Unsupervised Learning Processes. XI Jornadas de ARCA Sistemas Cualitativos, Diagnosis, Robótica, Sistemas Domóticos y Computación Ubicua (JARCA). Almuñécar (Granada), 24-26 June 2009.
        Germán Sánchez, Juan Carlos Aguado and Núria Agell (2007). Forecasting New Customers’ Behaviour by Means of a Fuzzy Unsupervised Method. Artificial Intelligence Research and Development, Frontiers in Artificial Intelligence and Applications. Proceedings of the 10th CCIA, 163:368–375. Andorra, 25-26 October 2007. ISBN: 978-1-58603-798.
  • 51. Future Research
      Criteria: improvements.
      Aggregation: study of other OWA operators and linguistic quantifiers.
      Qualitative description: ontology, grammar.
      Application: to assess unsupervised techniques, ensemble of techniques.
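For readers unfamiliar with the aggregation item, the following is a generic sketch of an OWA (ordered weighted averaging) operator whose weights are derived from a linguistic quantifier. It is not taken from the thesis: the quantifier family Q(r) = r^alpha and the function names are assumptions chosen only to show how changing the quantifier changes the aggregated score of a segmentation.

```python
def quantifier_weights(n: int, alpha: float) -> list[float]:
    """OWA weights from the quantifier Q(r) = r**alpha: w_i = Q(i/n) - Q((i-1)/n)."""
    q = lambda r: r ** alpha
    return [q(i / n) - q((i - 1) / n) for i in range(1, n + 1)]


def owa(values: list[float], weights: list[float]) -> float:
    """Ordered weighted average: weights apply to values sorted in descending order."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical criterion scores of one segmentation, each in [0, 1].
scores = [0.2, 0.9, 0.5]

plain_mean = owa(scores, quantifier_weights(3, 1.0))  # alpha = 1 gives the arithmetic mean
demanding = owa(scores, quantifier_weights(3, 2.0))   # alpha > 1 stresses the lowest scores
print(plain_mean, demanding)
```

With alpha = 1 all weights are equal and the operator reduces to the mean; larger values of alpha shift weight towards the worst criterion scores, which corresponds to a stricter "most criteria" style of quantifier.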