26. PATTERN RECOGNITION: DISCRIMINANT FUNCTIONS
$F_d(x) = W \cdot X = \begin{bmatrix} w_1 & w_2 & w_3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ 1 \end{bmatrix} = w_1 x_1 + w_2 x_2 + w_3$
If $F_d(x) > 0$ → class 1
If $F_d(x) < 0$ → class 2
If $F_d(x) = 0$ → undetermined class
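As a minimal illustration of this decision rule, here is a MATLAB sketch; the weight vector and the pattern are invented values for the example:

```matlab
% Minimal sketch of the linear discriminant decision rule.
% W and x are illustrative values, not taken from the slides.
W = [2 -1 0.5];            % [w1 w2 w3]
x = [1.0; 2.0];            % pattern (x1, x2)
Fd = W * [x; 1];           % Fd(x) = w1*x1 + w2*x2 + w3
if Fd > 0
    cls = 1;               % class 1
elseif Fd < 0
    cls = 2;               % class 2
else
    cls = NaN;             % undetermined class
end
```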
With $n$ classes, in the worst case $\binom{n}{2}$ discriminant functions are needed to divide the space into $n$ distinct regions.
In the optimal case, $n - 1$ discriminant functions suffice.
27. EUCLIDEAN RECOGNIZER BASED ON THE EUCLIDEAN DISTANCE
The recognizer based on the Euclidean distance rests on the following assumptions:
The classes are deterministic in nature (not statistical).
All the information necessary and sufficient for its design is available a priori.
Each class (α$_i$) is represented by a single vector, the prototype of the class: $Z_i = \mu_i$.
[Figure: two classes α1 and α2 in the (X1, X2) plane, each represented by its prototype, Z1 and Z2.]
30. IN-CLASS EXERCISE
Derive mathematically the linear discriminant function starting from $d_E^2(p, Z_i) = \|p - Z_i\|^2$.
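A sketch of the expected derivation, using only the standard expansion of the squared norm:

$d_E^2(p, Z_i) = \|p - Z_i\|^2 = p^\top p - 2\,p^\top Z_i + Z_i^\top Z_i$

Since $p^\top p$ is the same for every class, minimizing $d_E^2(p, Z_i)$ over $i$ is equivalent to maximizing $F_i(p) = p^\top Z_i - \tfrac{1}{2} Z_i^\top Z_i$, which is linear in $p$.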
31. Euclidean recognizer method
1) Determine the prototype (centroid) of each class in the working universe.
2) Given a vector P to be recognized, compute the Euclidean distances from P to each prototype Z1, Z2, Z3, …, Zn.
3) Assign P to the class i for which the distance is minimum (see the MATLAB sketch below).
To improve generalization:
Training = 70% of the total patterns
Test = 30% of the total patterns
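A minimal MATLAB sketch of this recognizer, assuming a pattern matrix X (one row per pattern), a label vector y, and a row vector P to classify; all variable names are illustrative:

```matlab
% Euclidean recognizer sketch: prototypes are the class centroids.
% X: m-by-n patterns, y: m-by-1 labels, P: 1-by-n pattern to recognize.
classes = unique(y);
Z = zeros(numel(classes), size(X, 2));
for i = 1:numel(classes)
    Z(i, :) = mean(X(y == classes(i), :), 1);  % prototype Zi = class centroid
end
d = sum((Z - P).^2, 2);       % squared distances (implicit expansion, R2016b+)
[~, iMin] = min(d);
predicted = classes(iMin);    % assign to the class of the nearest prototype
```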
32. DISTANCES
In MATLAB: >> doc pdist
Metrics
Given an m-by-n data matrix X, which is treated as m (1-by-n) row vectors x1, x2, ..., xm, the various distances between the vectors xs and xt are defined as follows:
Euclidean distance: $d_{st}^2 = (x_s - x_t)(x_s - x_t)'$
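A quick usage example of pdist together with squareform; the data values are invented:

```matlab
% Pairwise Euclidean distances between three 2-D points.
X = [0 0; 3 4; 6 8];
D = pdist(X, 'euclidean');   % condensed vector of pairwise distances
Dmat = squareform(D)         % symmetric m-by-m distance matrix
```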
39. • Go to the MATLAB documentation for: perform, confusion
• plotregression, perfcurve, plotroc, plotconfusion
40. ROC CURVE
• In signal detection theory, a ROC curve (Receiver Operating Characteristic) is a graphical representation of sensitivity versus (1 − specificity) for a binary classifier system as its discrimination threshold is varied.
• ROC curve analysis, or simply ROC analysis, provides tools to select possibly optimal models and to discard suboptimal ones independently of (and prior to specifying) the cost context and the distribution of the two classes being decided between.
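A hedged MATLAB sketch of plotting a ROC curve with perfcurve; the labels and scores below are synthetic stand-ins for a real classifier's output:

```matlab
% ROC curve from hypothetical binary classification scores.
labels = [ones(50, 1); zeros(50, 1)];       % ground truth (synthetic)
scores = [randn(50, 1) + 1; randn(50, 1)];  % classifier scores (synthetic)
[fpr, tpr, ~, auc] = perfcurve(labels, scores, 1);
plot(fpr, tpr)
xlabel('False positive rate'), ylabel('True positive rate')
title(sprintf('ROC curve (AUC = %.3f)', auc))
```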
67. DIMENSIONALITY REDUCTION
• The SEMMA methodology was defined by its developer, SAS Institute, as the "process for selecting, exploiting and modeling large amounts of data to discover unknown business patterns". The name of this methodology comes from its initials in English: Sample, Explore, Modify, Model and Assess.
Phases of the SEMMA methodology
96. • Principal Component Analysis performs dimensionality reduction by preserving the maximum variance of the system, but it does not focus on improving the classification rate.
• Another use of Principal Component Analysis is the projection of multidimensional data.
• Read the MATLAB PCA laboratory.
DIMENSIONALITY REDUCTION
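A short MATLAB sketch of pca for dimensionality reduction; the Fisher iris data set and the 95% variance threshold are illustrative choices:

```matlab
% PCA keeping enough components to explain ~95% of the variance.
load fisheriris                                % built-in example data
[coeff, score, ~, ~, explained] = pca(meas);   % principal components
k = find(cumsum(explained) >= 95, 1);          % smallest k reaching 95%
Xreduced = score(:, 1:k);                      % data projected onto k PCs
```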
125. Lambda is a regularization parameter that increases sparsity and minimizes the redundancy of the features as lambda grows.
fscnca
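A hedged sketch of fscnca; the ionosphere data set and the lambda value are illustrative (in practice lambda is tuned, e.g. by cross-validation):

```matlab
% NCA feature selection: near-zero weights flag redundant features.
load ionosphere                                % X: 351-by-34, Y: labels
mdl = fscnca(X, Y, 'Lambda', 0.5 / numel(Y));  % illustrative lambda value
bar(mdl.FeatureWeights)
xlabel('Feature index'), ylabel('Feature weight')
```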
126. Help videos
• fscnca
• https://www.mathworks.com/help/stats/neighborhood-component-analysis.html#bvf6c2d-1
• https://www.mathworks.com/help/stats/fscnca.html?requestedDomain=www.mathworks.com#bvcz0le-2
• Video 3 (overfitting and underfitting)
• https://www.youtube.com/watch?v=AYI1J3EmuaU&feature=youtu.be
• Video 4: machine learning linear models
• https://www.youtube.com/watch?v=2wEROvCN9-0&feature=youtu.be
• Machine learning: perceptron
• https://youtu.be/qxxFOotbxUE
• Logistic regression
• https://youtu.be/f79fgzz2KOk
• Decision trees
• https://youtu.be/Z9gRD102Ijg
• Ensemble methods: Boosting
• https://youtu.be/BfM22W5pV64
• Neural Networks
• https://youtu.be/_ovutFsNBcQ
128. Moving Window and Variance Analysis
In the preliminary research, the Moving Window technique was implemented in conjunction with the calculation of a standard inter-group/intra-group variance ratio (λ). The standard Euclidean metric was used to compute the variances.
Dimensionality Reduction and Classification
[Figure: schematic illustration of the DIM calculation process.]
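Since the slide's formula for λ did not survive extraction, the following is only an illustrative MATLAB sketch of a moving-window variance-ratio scan under assumed definitions; X1 and X2 are the two class matrices (rows = patterns), w is the window width, and the exact ratio in the original work may differ:

```matlab
% Moving-window scan of an inter/intra-group variance ratio (illustrative).
w = 5;                                   % window width in features (assumed)
nFeat = size(X1, 2);
lambda = zeros(1, nFeat - w + 1);
for s = 1:(nFeat - w + 1)
    A = X1(:, s:s+w-1);  B = X2(:, s:s+w-1);
    mu = mean([A; B], 1);                % global centroid of the window
    inter = norm(mean(A,1) - mu)^2 + norm(mean(B,1) - mu)^2;
    intra = mean(sum((A - mean(A,1)).^2, 2)) + ...
            mean(sum((B - mean(B,1)).^2, 2));
    lambda(s) = inter / intra;           % high lambda = discriminative window
end
```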
134. Moving Window with Variance Analysis
Moving Window laboratory
TeamViewer
User ID: 500-501-818
Password: Unab1952
135. fscnca Feature selection using neighborhood component analysis for classification
fsrnca Feature selection using neighborhood component analysis for regression
sequentialfs Sequential feature selection
relieff Importance of attributes (predictors) using ReliefF algorithm
rica Feature extraction by using reconstruction ICA
sparsefilt Feature extraction by using sparse filtering
transform Transform predictors into extracted features
tsne t-Distributed Stochastic Neighbor Embedding
barttest Bartlett's test
canoncorr Canonical correlation
pca Principal component analysis of raw data
pcacov Principal component analysis on covariance matrix
pcares Residuals from principal component analysis
ppca Probabilistic principal component analysis
factoran Factor analysis
rotatefactors Rotate factor loadings
nnmf Nonnegative matrix factorization
cmdscale Classical multidimensional scaling
mahal Mahalanobis distance
mdscale Nonclassical multidimensional scaling
pdist Pairwise distance between pairs of objects
squareform Format distance matrix
procrustes Procrustes analysis
FeatureSelectionNCAClassification Feature selection for classification using neighborhood component analysis (NCA)
FeatureSelectionNCARegression Feature selection for regression using neighborhood component analysis (NCA)
ReconstructionICA Feature extraction by reconstruction ICA
136. SONAR data set
208 patterns, 60 variables, 2 classes:
Rock class: 97 patterns, 60 variables
Metal class: 111 patterns, 60 variables
148. UNBALANCED DATA SET
• “focused resampling” consisted of resampling only those minority examples that
occurred on the boundary between the minority and majority classes.
149. SMOTE: Synthetic Minority Over-sampling Technique
• SMOTE: "We propose an over-sampling approach in which the minority class is over-sampled by creating 'synthetic' examples rather than by over-sampling with replacement. This approach is inspired by a technique that proved successful in handwritten character recognition (Ha & Bunke, 1997). The minority class is over-sampled by taking each minority class sample and introducing synthetic examples along the line segments joining any/all of the k minority class nearest neighbors. Depending upon the amount of over-sampling required, neighbors from the k nearest neighbors are randomly chosen."
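MATLAB has no built-in SMOTE, so the following is only a minimal sketch of the idea described above; Xmin is the minority-class matrix and k and N are assumed parameters:

```matlab
% Minimal SMOTE sketch: synthetic points on segments to k nearest neighbors.
function Xsyn = smoteSketch(Xmin, k, N)   % N synthetic samples per original
    m = size(Xmin, 1);
    idx = knnsearch(Xmin, Xmin, 'K', k + 1);  % first neighbor is the point itself
    idx = idx(:, 2:end);
    Xsyn = zeros(m * N, size(Xmin, 2));
    r = 1;
    for i = 1:m
        for j = 1:N
            nb  = Xmin(idx(i, randi(k)), :);  % random one of the k neighbors
            gap = rand;                       % random position on the segment
            Xsyn(r, :) = Xmin(i, :) + gap * (nb - Xmin(i, :));
            r = r + 1;
        end
    end
end
```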
151. ADASYN
• ADASYN is an improved version of SMOTE. It does essentially the same as SMOTE, with a minor improvement: after creating the synthetic samples, it adds small random values to the points, making them more realistic. In other words, instead of all the samples being linearly correlated with their parent, they have a little more variance; they are slightly scattered.
153. Resampling Methods
There is a variety of statistical inference techniques called resampling methods. These methods have been used to test the predictive capacity of linear and quadratic discriminant functions.
They are basically simulation techniques that reuse the observed data to constitute a universe from which repeated samples are drawn. Because of the high computational power they all require, they have been called "intensive computing" techniques [Noreen, E., 1989].
The classic Jackknife methodology of Quenouille/Tukey [Quenouille, B., 1949] [Tukey, J., 1958] is, together with the bootstrap method, a tool based on resampling the elements of a sample to obtain approximate properties of estimators. These tools are very useful when no information is available about the sampling distribution of a statistic, or when that distribution depends on unknown parameters.
Jackknife
Also called leave-one-out validation or the U method, this method classifies each case of the analysis using functions derived from all cases except the case itself, and repeats this for every sample of the studied population. The predictive capacity of the discriminant function is the average over all the iterations performed.
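A quick sketch of MATLAB's jackknife function applied to a simple statistic; the sample and the statistic are illustrative:

```matlab
% Jackknife replicates of the mean: each replicate leaves one point out.
x = randn(50, 1);                     % hypothetical sample
jstat = jackknife(@mean, x);          % 50 leave-one-out estimates
bias = (numel(x) - 1) * (mean(jstat) - mean(x));  % jackknife bias estimate
fprintf('Jackknife bias estimate: %.4f\n', bias)
```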
154. • VALIDATION METHODS (resampling):
• H method (Holdout): divides the set of cases into two groups: 1) Training: made up of two thirds of the cases, used to induce a classification model. 2) Testing: formed by the remaining third, used to estimate the true error rate.
• Random subsampling method: a variant of the H method, based on applying the H method multiple times (varying the selection of the training and test groups) and computing the error as the average of the error rates obtained.
• Cross-validation method: based on partitioning the sample into K subsets of approximately the same size, where K − 1 subsets constitute the training group and the remaining one the test group (see the sketch after this list).
• Bootstrapping method: from a set of N cases, a random sample with replacement of the same size as the training group is chosen, leaving the non-selected cases as the test group. (Sampling with replacement consists of extracting elements from a population so that, after each extraction, the extracted element is put back and can be selected again.)
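A hedged MATLAB sketch of K-fold cross-validation with cvpartition; the data set and the discriminant classifier are illustrative choices:

```matlab
% Stratified 10-fold cross-validation of a linear discriminant classifier.
load fisheriris
cv  = cvpartition(species, 'KFold', 10);
err = zeros(cv.NumTestSets, 1);
for i = 1:cv.NumTestSets
    tr  = training(cv, i);  te = test(cv, i);
    mdl = fitcdiscr(meas(tr, :), species(tr));
    err(i) = mean(~strcmp(predict(mdl, meas(te, :)), species(te)));
end
fprintf('Cross-validated error rate: %.3f\n', mean(err))
```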
155. • Bootstrap
• The bootstrap methodology owes its name and its original formulation to [Efron, B., 1979]. It is the most developed line among resampling methods. The bootstrap method basically consists of performing the learning process with the learning subgroup, and validating with the validation subgroup, N times, obtaining a final success rate equal to the average of the hits over the N iterations performed.
• It has been shown that this method tends asymptotically to a given final validation value [Hall, P., 1988]. The figure shows 10 validation processes, each with 140 learning and validation iterations. It is observed that in all cases validation tends to the same value in only 140 iterations. In this project, validation processes of 5000 iterations have been carried out.
Read the laboratory (a bootstrap sketch follows).
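A short sketch of MATLAB's bootstrap functions applied to a simple statistic; the sample and the statistic are illustrative:

```matlab
% Bootstrap replicates and a confidence interval for the mean.
x     = randn(100, 1);             % hypothetical sample
means = bootstrp(1000, @mean, x);  % 1000 bootstrap estimates of the mean
ci    = bootci(1000, @mean, x);    % 95% bootstrap confidence interval
fprintf('Mean of replicates: %.3f, 95%% CI: [%.3f, %.3f]\n', ...
        mean(means), ci(1), ci(2))
```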
156. Resampling Techniques
Resample data sets using bootstrap, jackknife, and cross-validation.
Functions
bootci Bootstrap confidence interval
bootstrp Bootstrap sampling
combnk Enumeration of combinations
crossval Loss estimate using cross-validation
datasample Randomly sample from data, with or without replacement
jackknife Jackknife sampling
randsample Random sample
185. • Early Stopping:
Early stopping is an overfitting-prevention method in which the data set is divided into three subsets for training, validation, and testing. The parameters of the MLP (i.e., weights and biases) are calculated with the training set. The error is monitored during the training process using the validation set. Normally, the validation error decreases as training proceeds; if the validation error starts rising, this is an indirect indication that the MLP is beginning to overfit the data. If the validation error continues rising over a specific number of training iterations, the training process is stopped and the weights and biases at the minimum of the validation error are retained. In this process, it is also useful to plot the test-set error, to check whether it differs significantly from the validation-set error. If this happens to be the case, it may indicate an inadequate division of the data set.
OVERFITTING
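A hedged MATLAB sketch of early stopping with the Deep Learning Toolbox; the data set, network size, and split ratios are illustrative choices (the ratios shown are in fact the toolbox defaults):

```matlab
% Early stopping via a train/validation/test division of the data.
[x, t] = simplefit_dataset;            % built-in example data
net = fitnet(10);                      % small MLP
net.divideParam.trainRatio = 0.70;     % training subset
net.divideParam.valRatio   = 0.15;     % validation subset (drives stopping)
net.divideParam.testRatio  = 0.15;     % test subset
[net, tr] = train(net, x, t);          % stops when validation error rises
plotperform(tr)                        % MSE vs. epoch for the three subsets
```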
186. OVERFITTING
[Figure: training performance plot, mean squared error (log scale) over 14 epochs, with Train, Validation, Test, Best, and Goal curves. Best validation performance is 0.017607 at epoch 10.]
187. Regularization:
The process of model regularization involves modifying the data-based error objective function by introducing a penalty term associated with the model parameters, in the form of the mean of the sum of squares of the network weights and biases:
$F = \gamma \, \mathrm{MSE} + (1 - \gamma) \, \mathrm{MSW}, \qquad \mathrm{MSW} = \frac{1}{n} \sum_{j=1}^{n} w_j^2$
OVERFITTING
188. This penalty term will cause the network to have smaller weights and biases, forcing the network response to be smoother and less likely to overfit. Still, this does not answer how we should tune this penalty term in order to optimize the generalization performance.
In the training process, we would want to determine the optimal model parameters in an adaptive way. In the work of David MacKay [74], this was achieved by inserting the MLP model within a Bayesian framework, where the weights and biases of the network are assumed to be random variables with specified distributions, and the regularization parameters are related to the unknown variances associated with these distributions. We can then estimate these parameters using statistical techniques, and one of the most powerful ones is Bayesian Regularization.
189. Bayesian Regularization: If the MLP models are trained with a set of inputs and targets of the form (x1, d1), (x2, d2), (x3, d3), …, (xn, dn), the objective of the training process is minimizing the squared error function:
$E_D = \sum_{i=1}^{n} \left( d_i - y_i \right)^2$
where $y_i$ is the network output for input $x_i$.
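A minimal MATLAB sketch of Bayesian-regularized training using trainbr; the data set and network size are illustrative:

```matlab
% MLP trained with Bayesian regularization (trainbr).
[x, t] = simplefit_dataset;        % built-in example data
net  = fitnet(10, 'trainbr');      % 10 hidden neurons, Bayesian regularization
net  = train(net, x, t);
y    = net(x);
perf = perform(net, t, y)          % mean squared error on the data
```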
200. AI applications with MATLAB
• Logistic regression
• https://youtu.be/f79fgzz2KOk
• Essential machine learning tools with MATLAB
• https://www.mathworks.com/videos/essential-tools-for-machine-learning-1488913197554.html?elqsid=1489596058376&potential_use=Education
• Artificial intelligence with video
• https://www.mathworks.com/videos/computer-vision-made-easy-1485791126907.html?elqsid=1492701559914&potential_use=Education