http://www.iaeme.com/IJCIET/index.asp 139 editor@iaeme.com
International Journal of Civil Engineering and Technology (IJCIET)
Volume 9, Issue 11, November 2018, pp. 139–149, Article ID: IJCIET_09_11_014
Available online at http://www.iaeme.com/ijciet/issues.asp?JType=IJCIET&VType=9&IType=11
ISSN Print: 0976-6308 and ISSN Online: 0976-6316
© IAEME Publication Scopus Indexed
PARTICLE SWARM OPTIMIZATION FOR
MULTIDIMENSIONAL CLUSTERING OF
NATURAL LANGUAGE DATA
G.A. Yuryev, E.K. Verkhovskaya and N.E. Yuryeva
Federal State Budget Educational Institution of Higher Education "Moscow State University
of Psychology and Education", Moscow, Russia
ABSTRACT
This paper considers a non-linear dimensionality reduction method that takes into account
the discriminating power of the solution found for given values of the categorical
variable associated with each observation. A stochastic optimization method known as
particle swarm optimization is proposed to find the characteristics that ensure the
best separation of observations in terms of a given quality functional. The quality of a
solution is evaluated from the purity of the clusters obtained with the
k-means method or with self-organizing Kohonen feature maps.
Keywords: Stochastic Optimization, Text Analysis, Swarm Clustering.
Cite this Article: G.A. Yuryev, E.K. Verkhovskaya and N.E. Yuryeva, Particle
Swarm Optimization for Multidimensional Clustering of Natural Language Data,
International Journal of Civil Engineering and Technology, 9(11), 2018, pp. 139–149.
http://www.iaeme.com/IJCIET/issues.asp?JType=IJCIET&VType=9&IType=11
1. INTRODUCTION
This article considers a method for reducing the dimensionality of multidimensional data that
takes into account the discriminating power of the solution found for given values of the
categorical variable associated with each observation. To search for features that best
separate the observations in terms of a given quality functional, a numerical procedure
based on the stochastic optimization method known as the "particle swarm" method is
proposed. The quality of a solution is assessed from the purity of the clusters obtained in
the space found using k-means clustering. Applying k-means to any subset of the components
of the original feature vector yields the coordinates of a given number of principal points
(centroids), which make it possible to determine the cluster to which any observation of the
original sample belongs. The structure and composition of the resulting clusters are
determined by the specific component subset of the original multidimensional characteristic
vector (and, to a certain extent, by the initial coordinates of the centroids, usually chosen
randomly). The result of "learning" (clustering)
will be the coordinates of centroids found in the given characteristic space, which ensure the
best division of the set of observations.
If a categorical variable is specified whose value is known for all observations of the
training sample, a quality functional associated with the purity of the clusters found with
respect to this categorical variable can be specified for any clustering result. The search
for a component subset that is optimal for discrimination can then be considered a
combinatorial task: finding the one of the $2^N$ parameter combinations (where N is the
dimension of the original feature space) that provides the best value of the given
quality functional.
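To give a sense of the size of this search space, the following minimal sketch enumerates every component subset for a small N. The helper name `feature_subsets` is hypothetical and used only for illustration; exhaustive enumeration like this is practical only for very small N, which is why a stochastic search is considered instead.

```python
from itertools import product

def feature_subsets(n_features):
    """Yield every subset of feature indices as a tuple (2**n_features in total)."""
    for mask in product((0, 1), repeat=n_features):
        # A 1 in position i of the mask means component i is kept.
        yield tuple(i for i, bit in enumerate(mask) if bit)

subsets = list(feature_subsets(3))
print(len(subsets))   # 8 subsets for N = 3
print(2 ** 300)       # number of candidate subsets for N = 300
```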
The task of reducing the original dimensionality of the feature space while keeping
quasi-optimal discrimination with respect to a given categorical variable was thus
formulated as a combinatorial optimization task. A stochastic method using a
multicomponent quality functional is proposed to solve it.
2. FORMULATION OF THE TASKS
A sample of observations $S = \{s_1, \dots, s_l\}$ is given, where each observation
$s_i = (c_i, x_i)$ is characterized by a class $c_i \in V$, one of the possible values of the
categorical variable, and a multidimensional parametric vector
$x_i = \{x_{i1}, \dots, x_{ik}\}$ of real components (the coordinates of the observation in the
k-dimensional space). It is necessary to determine a set of component indices
$m = \{m_1, \dots, m_r\}$, $m_t \in \{1, \dots, k\}$, $r \le k$, with all $m_t$ unique, such that
the reduced sample $\grave{S} = \{(c_i, \{x_{i m_1}, \dots, x_{i m_r}\})\}$ provides
quasi-optimal values of the quality functional $Q(\grave{S})$.
3. THE CONCEPT OF THE QUALITY FUNCTIONAL OF THE
FEATURE SPACE FOUND
The quality functional $Q(\grave{S})$ is defined so that larger values correspond to
clustering results with greater cluster homogeneity. The k-means clustering method is used
with centroids initialized with random values or by any other suitable method. Homogeneity
is defined as a generalized measure of the deviation of the number of observations of the
"winning" class $c \in V$ in each cluster from the total number of observations assigned to
that cluster (hereinafter, the partial homogeneity $H_P$) and the generalized deviation of
the number of unique observation classes in each cluster from 1 (hereinafter, the formation
homogeneity $H_F$). If the numbers of observations of each class in V are comparable, it is
advisable to supplement the quality functional with a characteristic reflecting the
generalized deviation of the volumes of the resultant clusters from the expected volume
(for example, from the arithmetic mean volume l/n); the term weighting, W, is used
hereinafter to refer to this parameter.
4. GRAPHICAL REPRESENTATION OF STRUCTURAL
CHARACTERISTICS OF THE CLUSTERING RESULTS
To illustrate the results of applying the proposed algorithm, the graphical representation of
cluster structure purity proposed by Narayana Swamy [1] is used. Since this form of
representation (Fig. 1) is not standardized, a short explanation of its interpretation is
given.
Figure 1 Example of a cluster purity diagram
It is assumed that class tags of clustered observations are known in advance and their
values are listed in the legend on the right-hand side of Fig. 1. Each of the classes is
associated with a color code, also reflected in the legend. A set of concentric circles is placed
in the main part of the diagram; their number corresponds to the number of clusters
obtained during the clustering procedure (9 in the example in Fig. 1). The ratio of the areas
of the inner circles to each other corresponds to the ratio of the volumes of the
corresponding clusters. The size of each sector in an inner circle corresponds to the number
of observations of the class with that color code in the corresponding cluster. For
example, in the cluster represented by the upper left circle, most of the observations are
assigned to class c4, which has a red color code.
5. PRACTICAL ASSESSMENTS OF THE QUALITY FUNCTIONAL
Formal estimates can be given for each of the components of $Q(\grave{S})$ listed previously.
Before starting the calculations, the clustering result for $\grave{S}$ should be obtained
using the k-means clustering method. The set of j resulting clusters is denoted
$C = \{C_1, \dots, C_j\}$, where $C_i = \{c_1, \dots, c_z\}$ is the set of z class labels
corresponding to the observations assigned to the given cluster. Let the function
$\max(C_i)$ return the absolute number of class labels that have the maximum share in the
i-th cluster. Then partial homogeneity can be estimated as
$$H_P = \frac{1}{j} \sum_{i=1}^{j} \frac{\max(C_i)}{l_i} \qquad (1)$$
where $l_i$ is the number of observations relating to cluster $C_i$ as a result of clustering.
The result of maximizing partial homogeneity on the sample data for a fixed number of
clusters is shown in Fig. 2.
Figure 2 Result of maximizing partial homogeneity
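The partial homogeneity estimate can be illustrated with a short sketch. It takes each cluster as a list of class labels and averages the share of the most frequent ("winning") label over the clusters. Averaging over the number of clusters, skipping empty clusters and the function name `partial_homogeneity` are assumptions of this sketch rather than details confirmed by the text.

```python
from collections import Counter

def partial_homogeneity(clusters):
    """Mean share of the winning class label over the non-empty clusters.

    `clusters` is a list of lists of class labels, one inner list per cluster.
    ASSUMPTION: the per-cluster shares are averaged with equal weight.
    """
    shares = [Counter(labels).most_common(1)[0][1] / len(labels)
              for labels in clusters if labels]
    return sum(shares) / len(shares)

example_clusters = [["Minor", "Minor", "Serious"], ["None", "None"]]
print(partial_homogeneity(example_clusters))
```

A value of 1 is reached only when every cluster contains labels of a single class.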
Let $u(C_i)$ be the function that returns the number of unique class labels
in the i-th cluster. Then formation homogeneity can be estimated as
$$H_F = \frac{1}{j} \sum_{i=1}^{j} \frac{1}{u(C_i)} \qquad (2)$$
where j is the number of clusters.
The result of maximizing formation homogeneity on the same data array is shown in
Fig. 3.
Figure 3 Result of maximizing formation homogeneity
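Formation homogeneity can be sketched in the same way. The version below averages the reciprocal of the number of unique class labels per cluster, so that a cluster with one class contributes 1 and a cluster with many classes contributes less. The reciprocal form, the equal-weight average and the name `formation_homogeneity` are assumptions made for illustration.

```python
def formation_homogeneity(clusters):
    """Mean of 1 / (number of unique class labels) over the non-empty clusters.

    ASSUMPTION: the reciprocal is used as the deviation of the unique-class
    count from 1; empty clusters are skipped.
    """
    terms = [1 / len(set(labels)) for labels in clusters if labels]
    return sum(terms) / len(terms)

example = [["Minor", "Minor", "Serious"], ["None", "None"]]
print(formation_homogeneity(example))  # 0.75
```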
Despite the similarity in the results of maximizing both criteria, it is easy to see that in the
first case (Fig. 2), preference is given to increasing the share of the winning class in each of
the clusters, while in the second case (Figure 3), preference is given to the smaller number of
classes represented within the same cluster.
The weighting estimate W, based on the assumption of equal sizes of the resultant clusters,
is given by expression (3), a sum over the clusters of a square-root term in $l_i$ and l,
where $l_i$ is the number of observations relating to cluster i as a result of clustering and
l is the total sample size.
The result of maximizing weighting with the same initial data is shown in Fig. 4.
Figure 4 Result of maximizing weighting
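The idea behind weighting, a score that is highest when all clusters have the same size, can be illustrated as follows. The formula in this sketch is a stand-in chosen for illustration (a normalized absolute deviation of the cluster shares from the uniform share) and is not claimed to be expression (3); the name `weighting` is likewise hypothetical.

```python
def weighting(cluster_sizes):
    """Stand-in balance score in [0, 1]; 1 means all clusters have equal size.

    ASSUMPTION: this is not expression (3) from the text, only one measure
    with the same stated property (deviation of cluster sizes from the mean).
    """
    j = len(cluster_sizes)
    total = sum(cluster_sizes)
    if j < 2 or total == 0:
        return 1.0
    # Sum of absolute deviations of each cluster share from the uniform share 1/j.
    deviation = sum(abs(size / total - 1 / j) for size in cluster_sizes)
    # The largest possible deviation (everything in one cluster) is 2 * (1 - 1/j).
    return 1 - deviation / (2 * (1 - 1 / j))

print(weighting([100, 100, 100]))
print(weighting([290, 5, 5]))
```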
The cumulative quality functional $Q(\grave{S})$ (4) combines partial homogeneity, formation
homogeneity and weighting, each scaled by its own gain factor. The gain factors
corresponding to the components of the quality functional make it possible to specify the
desired characteristics of the clustering results obtained in the optimization process. The
result of maximizing the total quality functional for equal gain factors is shown in Fig.
5.
Figure 5 Result of maximizing the three-component quality function
The value of the cumulative quality functional tends towards one; with the described
evaluation procedures, strict equality to 1 is achieved when all clusters are of equal size
and each cluster contains observations of only one class.
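One way to combine the three components into a single value that reaches 1 in the ideal case is a weighted sum with gain factors that add up to 1. The additive form, the default equal gains and the name `quality` are assumptions of this sketch; the text only states that each component has its own gain factor.

```python
def quality(partial, formation, weight, gains=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the three components, each expected in [0, 1].

    ASSUMPTION: the components are combined additively and the gain
    factors sum to 1, so the result is 1 only when all components are 1.
    """
    gain_p, gain_f, gain_w = gains
    return gain_p * partial + gain_f * formation + gain_w * weight

print(quality(1.0, 1.0, 1.0))                          # ideal clustering
print(quality(0.5, 1.0, 0.0, gains=(0.5, 0.5, 0.0)))   # weighting switched off
```

Setting a gain to zero removes the corresponding component from the search objective, which is how the emphasis of the optimization can be shifted.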
6. STOCHASTIC SOLUTION OF THE COMBINATORIAL
OPTIMIZATION TASK
The solution of optimization tasks that do not have an explicit analytical interpretation often
involves searching for quasi-optimal parameters using numerical estimates of the
gradient of the quality functional or stochastic methods of directed search [2, 3]. This
section describes a stochastic optimization method based on the "particle swarm" method
as applied to the formulated dimensionality reduction task.
The original idea of the "particle swarm" method was proposed and developed in studies
[4, 5, 6]. The original algorithm was applied to model the social behavior of birds, bees and
other animals characterized by spatial movement within large groups (swarms). Later, it was
noted that this model can be used effectively to explore feature spaces, in particular, to
search for quasi-optimal solutions of multidimensional optimization tasks.
Swarm optimization algorithms can be classed as evolutionary; the general
scheme for enumerating solutions is described by the following sequence of steps:
1. A population of "individuals" is generated, each of which contains some random
solution of the target task. The search for a solution is represented by an iterative
process (steps 2 and 3). In each iteration (epoch), the decisions (positions) of all
individuals are slightly modified according to the rules ensuring the convergence
of the iterative process to quasi-optimal solutions.
2. The direction of the change in the position of each individual is calculated,
depending on its current position, the best solution of the given individual
throughout its "existence" (local extremum) and the best known solution (obtained
by any individual) for the entire population (global extremum).
3. New positions of the individuals (their coordinates) are computed in accordance
with the directions obtained in step 2.
4. The stopping criteria are checked; if they are not met, go to
step 2, otherwise the search is terminated.
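The four steps above can be sketched for the original real-valued setting as follows. This is a generic particle swarm loop minimizing a simple test function, not the binary variant used later in the text; the parameter values (`w`, `c1`, `c2`, swarm size, search range) and the fixed iteration budget used as the stopping criterion are assumptions chosen for illustration.

```python
import random

def pso_minimize(f, dim, n_particles=20, n_iter=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial positions, zero velocities, initial best solutions.
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [f(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iter):  # Step 4: stop after a fixed number of epochs.
        for i in range(n_particles):
            for e in range(dim):
                # Step 2: direction from inertia, local best and global best.
                vel[i][e] = (w * vel[i][e]
                             + c1 * rng.random() * (best_pos[i][e] - pos[i][e])
                             + c2 * rng.random() * (g_pos[e] - pos[i][e]))
                # Step 3: move the individual.
                pos[i][e] += vel[i][e]
            val = f(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

def sphere(x):
    return sum(v * v for v in x)

best, value = pso_minimize(sphere, dim=2)
print(best, value)
```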
The balance between local and global trends in the behavior of individuals is determined
by coefficients interpreted as the accelerations of the movement of individuals towards the
local and global extrema. The original formulation of the task assumed a real-valued solution
space, which prevents the method from being applied directly to discrete programming tasks,
in particular, combinatorial optimization tasks. A number of authors have proposed
adaptations of the method to discrete tasks [7], in which the "accelerations" receive a
probabilistic interpretation. Let us consider in more detail a modified version of the
algorithm described in [7], which can be used to find component subsets $\grave{S}$ that are
quasi-optimal in terms of $Q(\grave{S})$.
The search for optimal combinations of parameters from $\{x_1, \dots, x_k\}$ can be formulated
as a multidimensional knapsack task in which each item can be selected at most once. Its
solution $m = \{m_1, \dots, m_r\}$ can be represented as a vector $P = \{p_1, \dots, p_k\}$, where
$p_i \in \{0, 1\}$: the components of P with numbers from m are equal to 1 and all others to 0;
that is, a 1 in the i-th position of P indicates that $x_i$ is included in the component
subset $\grave{S}$ selected for clustering.
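As a small illustration of this binary encoding, the sketch below applies a 0/1 mask to one observation vector and keeps only the selected components. The helper name `select_components` and the sample values are hypothetical.

```python
def select_components(vector, mask):
    """Keep the components of `vector` whose position holds a 1 in `mask`."""
    if len(vector) != len(mask):
        raise ValueError("vector and mask must have the same length")
    return [x for x, bit in zip(vector, mask) if bit == 1]

observation = [0.12, -0.40, 0.93, 0.05, -0.77]
mask = [1, 0, 1, 0, 0]
print(select_components(observation, mask))  # [0.12, 0.93]
```

Clustering is then run on the reduced vectors only, so each candidate mask defines its own feature space.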
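The population consists of d individuals. Each individual i stores its
current solution $P_i$, its best solution $P_i^{best}$ and two vectors $A_i^0$, $A_i^1$ that
determine the inversion frequency of each of the k bits in each of the possible directions:
$A_{ie}^0$ is the probability of replacing 1 with 0 in position e for the i-th individual,
$A_{ie}^1$ is the probability of replacing 0 with 1 in position e for the i-th individual.
The values $A_i^0$, $A_i^1$ change at each iteration, as do the positions of the individuals.
For the position $P_i$ of each individual at each iteration, the probability of inversion
of each bit, denoted $A_{ie}$, is determined from the current value of that bit
according to the following rule
$$A_{ie} = \begin{cases} A_{ie}^0, & P_{ie} = 1 \\ A_{ie}^1, & P_{ie} = 0 \end{cases} \qquad (5)$$
The derived accelerations $\ddot{A}_{ie}^0$, $\ddot{A}_{ie}^1$ are affected by their current
values, the best local solution $P_i^{best}$, the best global solution $P^{gbest}$ and the
coefficient of inertia $w$:
$$\ddot{A}_{ie}^0 = w A_{ie}^0 + L^0(P_{ie}^{best}) + G^0(P_e^{gbest}) \qquad (6)$$
$$\ddot{A}_{ie}^1 = w A_{ie}^1 + L^1(P_{ie}^{best}) + G^1(P_e^{gbest}) \qquad (7)$$
To calculate $L^0(\cdot)$, $L^1(\cdot)$ and $G^0(\cdot)$, $G^1(\cdot)$, two scaling factors $c_1$
and $c_2$ are used, the values of which lie within [0, 1]; they set the balance between
local and global trends in the derived position. In addition, at each iteration,
another pair of scaling values $r_1$ and $r_2$ is randomly generated
within [0, 1]. The final values are formed according to the following rules:
$$L^0(P_{ie}^{best}) = \begin{cases} -c_1 r_1, & P_{ie}^{best} = 1 \\ c_1 r_1, & P_{ie}^{best} = 0 \end{cases} \qquad (8)$$
$$L^1(P_{ie}^{best}) = \begin{cases} c_1 r_1, & P_{ie}^{best} = 1 \\ -c_1 r_1, & P_{ie}^{best} = 0 \end{cases} \qquad (9)$$
$$G^0(P_e^{gbest}) = \begin{cases} -c_2 r_2, & P_e^{gbest} = 1 \\ c_2 r_2, & P_e^{gbest} = 0 \end{cases} \qquad (10)$$
$$G^1(P_e^{gbest}) = \begin{cases} c_2 r_2, & P_e^{gbest} = 1 \\ -c_2 r_2, & P_e^{gbest} = 0 \end{cases} \qquad (11)$$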
In fact, the probability of inversion of each vector component decreases, provided that its
value coincides with the corresponding value of the known optimal solution, and increases
otherwise.
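The behavior described in this paragraph can be sketched as an update of the two acceleration vectors of one individual. The sketch follows the general scheme of the binary particle swarm variant in [7]: a best-known bit equal to 1 raises the tendency to flip 0 to 1 and lowers the tendency to flip 1 to 0, and a best-known bit equal to 0 does the opposite. The function name, argument names and the choice to pass the random factors `r1`, `r2` explicitly are assumptions made to keep the example deterministic.

```python
def update_accelerations(a0, a1, local_best, global_best, w, c1, c2, r1, r2):
    """Return updated (a0, a1) for one individual.

    a0[e]: tendency to flip bit e from 1 to 0; a1[e]: tendency to flip 0 to 1.
    ASSUMPTION: the same r1, r2 are shared by all bits within one call.
    """
    new_a0, new_a1 = [], []
    for e in range(len(a0)):
        # A best-known bit of 1 pushes towards 1; a best-known bit of 0 pushes towards 0.
        local = c1 * r1 if local_best[e] == 1 else -c1 * r1
        glob = c2 * r2 if global_best[e] == 1 else -c2 * r2
        new_a1.append(w * a1[e] + local + glob)
        new_a0.append(w * a0[e] - local - glob)
    return new_a0, new_a1

a0, a1 = update_accelerations([0.0, 0.0], [0.0, 0.0],
                              local_best=[1, 0], global_best=[1, 1],
                              w=0.5, c1=1.0, c2=1.0, r1=0.5, r2=0.25)
print(a0, a1)  # [-0.75, 0.25] [0.75, -0.25]
```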
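To calculate the new position of each individual i, the components e of the corresponding
solution $P_i$ are inverted with the inversion probabilities $A_{ie}$; for this, the value
$A_{ie}$ is compared with random values $RND_{ie}$ generated for each comparison. For the
vector $A_i$ a normalizing condition (12) is first applied, which brings its components into
the interval [0, 1].
Then the rule to determine the derived positions is as follows:
$$\ddot{P}_{ie} = \begin{cases} \overline{P_{ie}}, & RND_{ie} < A_{ie} \\ P_{ie}, & \text{otherwise} \end{cases} \qquad (13)$$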
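The position update can be sketched as follows. Each bit is inverted when a fresh random number falls below its normalized inversion tendency. The logistic (sigmoid) function is used here as the normalizing step, as is common in binary particle swarm variants such as [7]; treating it as the normalizing condition of this text is an assumption, as are the function and argument names.

```python
import math
import random

def sigmoid(x):
    """Map a real-valued tendency to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def update_position(position, a0, a1, rng):
    """Invert each bit with a probability derived from its current value.

    ASSUMPTION: sigmoid normalization of the accelerations.
    """
    new_position = []
    for e, bit in enumerate(position):
        # Bits equal to 1 use the "1 -> 0" tendency, bits equal to 0 the "0 -> 1" one.
        flip_probability = sigmoid(a0[e] if bit == 1 else a1[e])
        new_position.append(1 - bit if rng.random() < flip_probability else bit)
    return new_position

rng = random.Random(42)
moved = update_position([1, 0, 1, 0],
                        a0=[50.0, 0.0, -50.0, 0.0],
                        a1=[0.0, 50.0, 0.0, -50.0],
                        rng=rng)
print(moved)
```

With very large positive tendencies a bit is practically always inverted, and with very large negative ones it is practically always kept, which the example values are chosen to show.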
The algorithm above corresponds to that described in [7].
Obviously, such a procedure will significantly reduce the probability of new solutions
appearing in the population after a certain number of steps, i.e. the algorithm will become
"stuck" at the local extremum found. To overcome the problem of local extrema, it is
proposed to introduce a procedure that randomizes the positions, based on a criterion of
"swarm density". Density here refers to a generalized assessment of the deviation of the
position $P_i$ of each individual from the best known solution $P^{gbest}$. With $h(a, b)$ the
function that returns the Hamming distance between the binary vectors a and b, the swarm
density is estimated as
$$D = \frac{1}{d} \sum_{i=1}^{d} \left(1 - \frac{h(P_i, P^{gbest})}{k}\right) \qquad (14)$$
where k is the length of the binary solution vector and d is the number of individuals in the
population.
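A minimal sketch of the swarm density estimate is given below. It averages, over all individuals, one minus the Hamming distance to the best known solution divided by the vector length, which yields 1 when every individual coincides with the best known solution. The function names are hypothetical, and the exact normalization is an assumption consistent with the stated range of values.

```python
def hamming(a, b):
    """Number of positions in which the binary vectors a and b differ."""
    return sum(x != y for x, y in zip(a, b))

def swarm_density(positions, global_best):
    """Average closeness of the individuals to the best known solution, in [0, 1]."""
    k = len(global_best)
    return sum(1 - hamming(p, global_best) / k for p in positions) / len(positions)

best_known = [1, 0, 1, 0]
swarm = [[1, 0, 1, 0], [1, 1, 1, 0]]
print(swarm_density(swarm, best_known))  # 0.875
```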
The values of the swarm density lie in the interval [0, 1], with a value
of one indicating that the current positions of all individuals coincide with the best known
solution found by the algorithm. The swarm density is estimated at the end of each
iteration; if a predetermined density threshold is exceeded, it is proposed to randomize the
positions of a certain percentage of individuals and the corresponding acceleration vectors,
while preserving information about their best local solutions. This approach
automatically leads the numerical procedure out of local extrema without losing the
general search direction established in the optimization process. Lower threshold values for
the swarm density lead to a less intensive search for a solution in the area of the current
extremum, and vice versa.
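The randomization step can be sketched as follows. When the density exceeds the threshold, a share of the individuals receives new random positions and new random acceleration vectors, while the stored best solutions are left untouched. The function name, the in-place update, the range used for the new accelerations and the rule of re-initializing at least one individual are assumptions of this sketch.

```python
import random

def randomize_if_dense(positions, a0, a1, density, threshold, fraction, rng):
    """Re-initialize a share of individuals in place when density > threshold.

    Returns the indices of the re-initialized individuals. Best-known
    solutions are kept elsewhere and are deliberately not modified here.
    ASSUMPTION: new accelerations are drawn uniformly from [-1, 1].
    """
    if density <= threshold:
        return []
    count = max(1, int(fraction * len(positions)))
    chosen = rng.sample(range(len(positions)), count)
    for i in chosen:
        k = len(positions[i])
        positions[i] = [rng.randint(0, 1) for _ in range(k)]
        a0[i] = [rng.uniform(-1, 1) for _ in range(k)]
        a1[i] = [rng.uniform(-1, 1) for _ in range(k)]
    return chosen

rng = random.Random(1)
positions = [[1, 1, 1] for _ in range(4)]
acc0 = [[0.0] * 3 for _ in range(4)]
acc1 = [[0.0] * 3 for _ in range(4)]
print(randomize_if_dense(positions, acc0, acc1, density=0.95,
                         threshold=0.9, fraction=0.5, rng=rng))
```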
7. EVALUATION OF EFFICIENCY UPON CLUSTERING OF
NATURAL LANGUAGE DATA
The proposed concept was tested on natural language data on air incidents taken from the
open database of the "Aviation Safety Network" [8]. The initial data contained descriptions
of the incidents in English and the categorical variables associated with these incidents
(damage level, vessel type, flight phase, etc.). Incident descriptions were parameterized using
word2vec technology [9]. For each word included in a description, the corresponding
vector was taken from the freely distributed dictionary trained on Google
News text data. Each observation was then associated with a 300-dimensional numerical
vector obtained as the component-wise arithmetic mean of the
word vectors making up its description. These vectors were considered as integral estimates
of the description semantics that do not have an explicit interpretation in the context of the
specified categorical variables. To test the proposed algorithm, a computational experiment
was conducted with the aim of reducing the dimension of the multidimensional vector
descriptions while maximizing their discriminating power with respect to the level of damage
resulting from the incident.
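The parameterization of a description as the component-wise mean of its word vectors can be sketched with a toy dictionary. The three-dimensional vectors below are invented placeholders standing in for the 300-dimensional word2vec vectors; lower-casing, whitespace tokenization and skipping words missing from the dictionary are assumptions of this sketch, as is the helper name `describe`.

```python
def describe(text, word_vectors):
    """Component-wise mean of the vectors of the known words in `text`.

    ASSUMPTION: words absent from the dictionary are skipped.
    """
    vectors = [word_vectors[w] for w in text.lower().split() if w in word_vectors]
    if not vectors:
        raise ValueError("no known words in the description")
    return [sum(column) / len(vectors) for column in zip(*vectors)]

# Toy stand-in for a pretrained dictionary (values are placeholders).
toy_vectors = {"engine": [1.0, 0.0, 2.0], "failure": [3.0, 2.0, 0.0]}
print(describe("Engine failure on climb", toy_vectors))  # [2.0, 1.0, 1.0]
```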
The following levels of damage values were considered:
Serious – the aircraft was significantly damaged as a result of the incident;
Minor – the aircraft suffered minor damage as a result of the incident;
None – the aircraft was not damaged as a result of the incident;
Missing – as a result of the incident, the aircraft was completely lost, its fate is unknown.
The sample consisted of 600 observations, of which 300 were used as a training
sample and 300 as a control sample. Division was performed for 4, 8, and 12 clusters, using
k-means clustering and self-organizing Kohonen feature maps as clustering algorithms when
calculating the quality functional.
The results of this study are presented graphically in Table 1.
Table 1 Visualization of the purity of cluster structures obtained by reducing the dimensionality of
text descriptions for 3 damage levels (3 types) and 4 damage levels (4 types), using k-means and
self-organizing Kohonen feature maps (SOM).
k-means (3 types)   SOM (3 types)   k-means (4 types)   SOM (4 types)
In the clustering results for the categorical variable (damage
level) with 4 levels, it is noticeable that cases with minor damage (Minor, blue sector) and
with serious damage (Serious, red sector) are regularly combined into one cluster. In fact,
these descriptions are quite similar: using the non-linear dimensionality reduction method
t-SNE to visualize the location of the multidimensional (300-component) observation points in
3-dimensional space shows that the points with the corresponding levels of damage are
weakly distinguishable (Fig. 6).
Figure 6 Distribution of Minor and Serious observations in the 3-dimensional space
When aggregating Minor and Serious categories into one Damaged class (Table 1,
columns 1 and 2), the separation result improves significantly. In terms of class definition
error in the control sample, the error rate is ~12% when partitioning into 4
clusters and ~8% when partitioning into 6 and 8 clusters. The results depend only weakly
on the chosen clustering method, although this conclusion may be specific to this particular
task.
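One common way to obtain such a class definition error is to label each cluster with the majority class of its training observations and then count the control observations whose class differs from the label of their cluster. The text does not state how the error was computed, so this procedure, the function names and the toy data are assumptions made for illustration.

```python
from collections import Counter

def cluster_labels(train_clusters, train_classes):
    """Map each cluster id to the most frequent class among its training observations."""
    by_cluster = {}
    for cluster, cls in zip(train_clusters, train_classes):
        by_cluster.setdefault(cluster, Counter())[cls] += 1
    return {cluster: counts.most_common(1)[0][0]
            for cluster, counts in by_cluster.items()}

def error_rate(control_clusters, control_classes, labels):
    """Share of control observations whose class differs from their cluster's label."""
    wrong = sum(labels.get(cluster) != cls
                for cluster, cls in zip(control_clusters, control_classes))
    return wrong / len(control_classes)

labels = cluster_labels([0, 0, 0, 1, 1],
                        ["Damaged", "Damaged", "None", "None", "None"])
print(labels)
print(error_rate([0, 1, 1, 0], ["Damaged", "None", "Damaged", "Damaged"], labels))  # 0.25
```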
The dimensionality of the resulting feature space in each of the cases presented in Table 1
ranged from one third to one tenth of the original number of components.
8. CONCLUSIONS
A new method was developed and tested for dimensionality reduction of data, which provides
solutions that are quasi-optimal from the point of view of discrimination of the given classes.
The method can be used in combination with text parameterization technology to process
natural language records in arbitrary application areas.
1. The result of the proposed algorithm is not only a combination of the initial
features, but also the coordinates of the cluster centers that combine observations
in the space found (or the trained Kohonen network if it is chosen as a clustering
method).
2. The proposed method does not require preliminary assumptions regarding the type
of initial distribution of the observed features.
3. The described reduction procedure was prototyped, thus confirming its practical
applicability.
4. The efficiency of the proposed technology was evaluated using examples of tasks
with a real-valued source space. The method can be extended to any initial spaces in which
points can be clustered using the k-means clustering method.
5. A multicomponent quality functional was formulated that makes it possible to control the
process of dimensionality reduction in the feature space and to shape various
characteristics of the resultant space.
6. For combinatorial optimization using the "particle swarm" method, a criterion for
detecting that the algorithm is "stuck" in the region of a local extremum was proposed. A
procedure was described that leads the algorithm out of this region and is triggered on the
basis of the criterion test results.
ACKNOWLEDGEMENTS
This work was financially supported by the Ministry of Education and Science of the
Russian Federation within the framework of Subsidy Agreement No. 14.576.21.0092 dated
September 26, 2017 (unique identifier of the agreement RFMEFI57617X0092) for the
implementation of applied scientific research on the topic "Development of a neural network
forecasting system for aviation incidents and safety risk management based on historical data
including many parameters and text descriptions of events".
REFERENCES
[1] Narayana, S. Cluster Purity Visualizer.
https://bl.ocks.org/nswamy14/e28ec2c438e9e8bd302f
[2] Kuravsky, L. S., Marmalyuk, P. A., Yuryev, G. A., and Dumin, P. N. A Numerical
Technique for the Identification of Discrete-State Continuous-Time Markov Models.
Applied Mathematical Sciences, 9(8), 2015, pp. 379-391.
https://dx.doi.org/10.12988/ams.2015.410882
[3] Kuravsky, L. S., Marmalyuk, P. A., Yuryev, G. A., Belyaeva, O. B., and Prokopieva, O.
Yu. Mathematical Foundations of Flight Crew Diagnostics Based on Videooculography
Data. Applied Mathematical Sciences. 10(30), 2016, pp. 1449–1466.
https://dx.doi.org/10.12988/ams.2016.6122.
[4] Kennedy, J., and Eberhart, R. Swarm Intelligence. Morgan Kaufmann Publishers, Inc.,
San Francisco, CA, 2001.
[5] Kennedy, J., and Eberhart, R. Particle Swarm Optimization. IEEE International
Conference on Neural Networks (Perth, Australia), IEEE Service Center, Piscataway, NJ,
IV, 1995.
[6] Eberhart, R., and Kennedy, J. A New Optimizer Using Particle Swarm Theory. Proc.
Sixth International Symposium on Micro Machine and Human Science (Nagoya, Japan),
IEEE Service Center, Piscataway, NJ, 1995.
[7] Khanesar, M. A., Tavakoli, H., Teshnehlab, M., and Shoorehdeli, M. A. Novel Binary
Particle Swarm Optimization. In: Particle Swarm Optimization, Aleksandar Lazinica (Ed.),
InTech, 2009.
https://www.intechopen.com/books/particle_swarm_optimization/novel_binary_particle_swarm_optimization
[8] Aviation safety network https://aviation-safety.net/database/
[9] Mikolov, T., Yih, W., and Zweig, G. Linguistic Regularities in Continuous Space Word
Representations. Proceedings of NAACL HLT, 2013.

More Related Content

What's hot

Modified hotelling’s 푻 ퟐ control charts using modified mahalanobis distance
Modified hotelling’s 푻 ퟐ  control charts using modified mahalanobis distance Modified hotelling’s 푻 ퟐ  control charts using modified mahalanobis distance
Modified hotelling’s 푻 ퟐ control charts using modified mahalanobis distance IJECEIAES
 
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...IJRES Journal
 
A Mixed Discrete-Continuous Attribute List Representation for Large Scale Cla...
A Mixed Discrete-Continuous Attribute List Representation for Large Scale Cla...A Mixed Discrete-Continuous Attribute List Representation for Large Scale Cla...
A Mixed Discrete-Continuous Attribute List Representation for Large Scale Cla...jaumebp
 
Paper id 21201488
Paper id 21201488Paper id 21201488
Paper id 21201488IJRAT
 
BPSO&1-NN algorithm-based variable selection for power system stability ident...
BPSO&1-NN algorithm-based variable selection for power system stability ident...BPSO&1-NN algorithm-based variable selection for power system stability ident...
BPSO&1-NN algorithm-based variable selection for power system stability ident...IJAEMSJORNAL
 
A parsimonious SVM model selection criterion for classification of real-world ...
A parsimonious SVM model selection criterion for classification of real-world ...A parsimonious SVM model selection criterion for classification of real-world ...
A parsimonious SVM model selection criterion for classification of real-world ...o_almasi
 
The Application Of Bayes Ying-Yang Harmony Based Gmms In On-Line Signature Ve...
The Application Of Bayes Ying-Yang Harmony Based Gmms In On-Line Signature Ve...The Application Of Bayes Ying-Yang Harmony Based Gmms In On-Line Signature Ve...
The Application Of Bayes Ying-Yang Harmony Based Gmms In On-Line Signature Ve...ijaia
 
Novel algorithms for detection of unknown chemical molecules with specific bi...
Novel algorithms for detection of unknown chemical molecules with specific bi...Novel algorithms for detection of unknown chemical molecules with specific bi...
Novel algorithms for detection of unknown chemical molecules with specific bi...Aboul Ella Hassanien
 
IRJET- A Survey of Text Document Clustering by using Clustering Techniques
IRJET- A Survey of Text Document Clustering by using Clustering TechniquesIRJET- A Survey of Text Document Clustering by using Clustering Techniques
IRJET- A Survey of Text Document Clustering by using Clustering TechniquesIRJET Journal
 
A COMPREHENSIVE ANALYSIS OF QUANTUM CLUSTERING : FINDING ALL THE POTENTIAL MI...
A COMPREHENSIVE ANALYSIS OF QUANTUM CLUSTERING : FINDING ALL THE POTENTIAL MI...A COMPREHENSIVE ANALYSIS OF QUANTUM CLUSTERING : FINDING ALL THE POTENTIAL MI...
A COMPREHENSIVE ANALYSIS OF QUANTUM CLUSTERING : FINDING ALL THE POTENTIAL MI...IJDKP
 
A review of automatic differentiationand its efficient implementation
A review of automatic differentiationand its efficient implementationA review of automatic differentiationand its efficient implementation
A review of automatic differentiationand its efficient implementationssuserfa7e73
 
Advanced SOM & K Mean Method for Load Curve Clustering
Advanced SOM & K Mean Method for Load Curve Clustering Advanced SOM & K Mean Method for Load Curve Clustering
Advanced SOM & K Mean Method for Load Curve Clustering IJECEIAES
 
Certified global minima
Certified global minimaCertified global minima
Certified global minimassuserfa7e73
 
An Automatic Medical Image Segmentation using Teaching Learning Based Optimiz...
An Automatic Medical Image Segmentation using Teaching Learning Based Optimiz...An Automatic Medical Image Segmentation using Teaching Learning Based Optimiz...
An Automatic Medical Image Segmentation using Teaching Learning Based Optimiz...idescitation
 
New feature selection based on kernel
New feature selection based on kernelNew feature selection based on kernel
New feature selection based on kerneljournalBEEI
 

What's hot (20)

Modified hotelling’s 푻 ퟐ control charts using modified mahalanobis distance
Modified hotelling’s 푻 ퟐ  control charts using modified mahalanobis distance Modified hotelling’s 푻 ퟐ  control charts using modified mahalanobis distance
Modified hotelling’s 푻 ퟐ control charts using modified mahalanobis distance
 
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
A Mathematical Programming Approach for Selection of Variables in Cluster Ana...
 
A Mixed Discrete-Continuous Attribute List Representation for Large Scale Cla...
A Mixed Discrete-Continuous Attribute List Representation for Large Scale Cla...A Mixed Discrete-Continuous Attribute List Representation for Large Scale Cla...
A Mixed Discrete-Continuous Attribute List Representation for Large Scale Cla...
 
Paper id 21201488
Paper id 21201488Paper id 21201488
Paper id 21201488
 
BPSO&1-NN algorithm-based variable selection for power system stability ident...
BPSO&1-NN algorithm-based variable selection for power system stability ident...BPSO&1-NN algorithm-based variable selection for power system stability ident...
BPSO&1-NN algorithm-based variable selection for power system stability ident...
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog Converter
 

PARTICLE SWARM OPTIMIZATION FOR MULTIDIMENSIONAL CLUSTERING OF NATURAL LANGUAGE DATA

G.A. Yuryev, E.K. Verkhovskaya and N.E. Yuryeva
Federal State Budget Educational Institution of Higher Education "Moscow State University of Psychology and Education", Moscow, Russia

International Journal of Civil Engineering and Technology (IJCIET)
Volume 9, Issue 11, November 2018, pp. 139–149, Article ID: IJCIET_09_11_014
Available online at http://www.iaeme.com/ijciet/issues.asp?JType=IJCIET&VType=9&IType=11
ISSN Print: 0976-6308 and ISSN Online: 0976-6316
© IAEME Publication Scopus Indexed

ABSTRACT
This article considers a non-linear dimensionality reduction method that takes into account the discriminating power of the solution found for given values of the categorical variable associated with each observation. The stochastic optimization method known as "particle swarm optimization" is proposed to find the characteristics that ensure the best separation of observations in terms of a given quality functional. The quality of a solution is evaluated by the purity of the clusters obtained with the k-means method or with self-organizing Kohonen feature maps.

Keywords: Stochastic Optimization, Text Analysis, Swarm Clustering.

Cite this Article: G.A. Yuryev, E.K. Verkhovskaya and N.E. Yuryeva, Particle Swarm Optimization for Multidimensional Clustering of Natural Language Data, International Journal of Civil Engineering and Technology, 9(11), 2018, pp. 139–149. http://www.iaeme.com/IJCIET/issues.asp?JType=IJCIET&VType=9&IType=11

1. INTRODUCTION
This article considers a method for the dimensionality reduction of multidimensional data, taking into account the discriminatory power of the solution found for given values of the categorical variable associated with each observation.
To search for the features that best distinguish the observations in terms of a given quality functional, a numerical procedure is proposed based on the stochastic optimization method known as the "particle swarm method". The quality of a solution is assessed by the purity of the clusters obtained in the resulting space with k-means clustering.

Applying k-means clustering to any subset of components of the original vector yields the coordinates of a given number of principal points (centroids), which determine the cluster to which any observation of the original sample belongs. The structure and composition of the resulting clusters depend on the specific subset of components of the original multidimensional characteristic vector (and, to a certain extent, on the initial coordinates of the centroids, which are usually chosen randomly). The result of "learning" (clustering)
will be the coordinates of centroids found in the given characteristic space, which ensure the best division of the set of observations.

If a categorical variable is specified whose value is known for all observations of the training sample, a quality functional associated with the purity of the classes found in the context of this categorical variable can be specified for any clustering result. The search for a subset of components that is optimal for discrimination can then be considered a combinatorial task: find the one of 2^N parameter combinations (where N is the dimension of the original feature space) that provides the best value of the given quality functional.

The task of reducing the original dimensionality of the feature space with quasi-optimal discrimination according to a given category was therefore treated as a combinatorial optimization task. A stochastic method using a multicomponent quality functional was proposed to solve it.

2. FORMULATION OF THE TASKS
A sample of observations is given. Each observation is characterized by a class, which is one of the possible values of the categorical variable, and by a multidimensional parametric vector of k real components (the coordinates of the observation in the k-dimensional space). It is necessary to determine a set of unique components of the parametric vector such that the sample restricted to these components provides quasi-optimal values of the quality functional.

3. THE CONCEPT OF THE QUALITY FUNCTIONAL OF THE FEATURE SPACE FOUND
The quality functional is defined so that larger values correspond to clustering results with greater cluster homogeneity. The k-means clustering method is used, with centroids initialized by random values or by any other suitable method.
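The subset selection formulated in Section 2 can be illustrated with a binary mask over the components of a parametric vector. This is only a minimal sketch: the function name `select_components` and the example values are hypothetical and do not come from the paper.

```python
def select_components(vector, mask):
    """Keep only the components whose mask entry is 1 (or True)."""
    if len(vector) != len(mask):
        raise ValueError("vector and mask must have the same length")
    return [value for value, keep in zip(vector, mask) if keep]


# A 4-component observation reduced to its 1st and 4th components.
observation = [0.1, 0.2, 0.3, 0.4]
mask = [1, 0, 0, 1]
reduced = select_components(observation, mask)
```

With N components there are 2^N such masks, which is why an exhaustive search quickly becomes impractical as N grows.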
Homogeneity is defined through two generalized measures: the deviation of the number of observations of the "winning" class in each cluster from the total number of observations assigned to the given class (hereinafter, partial homogeneity), and the deviation of the number of unique observation classes in each cluster from 1 (hereinafter, formation homogeneity). If the numbers of observations of each class in the sample are comparable, it is advisable to supplement the quality functional with a characteristic reflecting the generalized deviation of the volume of the resulting clusters from the expected volume (for example, from the arithmetic mean volume l/n). The term weighting, W, is used hereinafter for this parameter.

4. GRAPHICAL REPRESENTATION OF STRUCTURAL CHARACTERISTICS OF THE CLUSTERING RESULTS
To illustrate the results of applying the proposed algorithm, a graphic representation of cluster structure purity proposed by Narayana Swamy [1] will be used [5]. Since this form of representation (Fig. 1) is not standardized, a short explanation of its interpretation is given.
Figure 1. Example of a cluster purity diagram

It is assumed that the class tags of the clustered observations are known in advance; their values are listed in the legend on the right-hand side of Fig. 1. Each class is associated with a color code, also shown in the legend. The main part of the diagram contains a set of circles whose number corresponds to the number of clusters obtained during the clustering procedure (9 in the example of Fig. 1). The ratio of the areas of the inner circles corresponds to the ratio of the volumes of the corresponding clusters. The size of a sector in an inner circle corresponds to the number of observations of the class with that color code in the corresponding cluster. For example, in the cluster represented by the upper left circle, most of the observations are assigned to class c4, which has a red color code.

5. PRACTICAL ASSESSMENTS OF THE QUALITY FUNCTIONAL
Formal estimates can be given for each of the previously listed components of the quality functional. Before starting the calculations, the clustering result should be obtained using the k-means clustering method. The set of j resulting clusters is represented below by the sets of class labels of the observations assigned to each cluster, with z labels in a given cluster. Let a function be defined that returns the absolute number of class labels that have the maximum share in the i-th cluster. Partial homogeneity is then estimated by expression (1), a sum over the clusters of the ratio of this value to l_i, where l_i is the number of observations assigned to the i-th cluster as a result of clustering. The result of maximizing partial homogeneity on the sample data for a fixed number of clusters is shown in Fig. 2.
Figure 2. Result of maximizing partial homogeneity

Let a second function return the number of unique class labels in the i-th cluster. Formation homogeneity is then estimated by expression (2), a sum over the clusters of a ratio involving this number. The result of maximizing formation homogeneity on the same data array is shown in Fig. 3.

Figure 3. Result of maximizing formation homogeneity

Despite the similarity of the results of maximizing the two criteria, it is easy to see that in the first case (Fig. 2) preference is given to increasing the share of the winning class in each cluster, while in the second case (Fig. 3) preference is given to a smaller number of classes represented within the same cluster.

The weighting estimate, based on the assumption of equal sizes of the resulting clusters, is given by expression (3), a sum over the clusters of square-root terms, where l_i is the number of observations assigned to the i-th cluster as a result of clustering and l is the total sample size. The result of maximizing weighting with the same initial data is shown in Fig. 4.
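The three components described above can be sketched in Python. The exact expressions (1)–(3) are not legible in this copy of the paper, so the normalizations below are assumptions chosen only so that each component lies in [0, 1] and equals 1 for pure, equally sized clusters. The function names and the representation of each cluster as a list of class labels are also hypothetical.

```python
from collections import Counter
from math import sqrt


def partial_homogeneity(clusters):
    """Assumed form: mean share of the most frequent class per non-empty cluster."""
    shares = [Counter(c).most_common(1)[0][1] / len(c) for c in clusters if c]
    return sum(shares) / len(shares)


def formation_homogeneity(clusters):
    """Assumed form: mean of 1 / (number of unique classes) per non-empty cluster."""
    terms = [1 / len(set(c)) for c in clusters if c]
    return sum(terms) / len(terms)


def weighting(clusters):
    """Assumed form: 1 minus the normalized deviation of cluster sizes from l / j."""
    j = len(clusters)
    l = sum(len(c) for c in clusters)
    if j < 2 or l == 0:
        return 1.0
    deviation = sum(sqrt((len(c) - l / j) ** 2) for c in clusters)
    # 2 * l * (1 - 1 / j) is the largest possible deviation (all points in one cluster).
    return 1 - deviation / (2 * l * (1 - 1 / j))


# Each inner list holds the class labels of the observations in one cluster.
clusters = [["a", "a", "b"], ["b", "b", "b"]]
scores = (partial_homogeneity(clusters),
          formation_homogeneity(clusters),
          weighting(clusters))
```

In this toy example the first cluster is mixed and the second is pure, so both homogeneity scores fall below 1, while the equal cluster sizes give a weighting of exactly 1.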
Figure 4. Result of maximizing weighting

The cumulative quality functional is written as expression (4), which combines the three components using gain factors, one per component. The gain factors make it possible to specify the desired characteristics of the clustering results obtained in the optimization process. The result of maximizing the total quality functional for equal gain factors is shown in Fig. 5.

Figure 5. Result of maximizing the three-component quality functional

The value of the quality functional tends towards one. With the described evaluation procedures, strict equality to 1 is achieved when all clusters are of equal size and each cluster contains observations of only one class.

6. STOCHASTIC SOLUTION OF THE COMBINATORIAL OPTIMIZATION TASK
The solution of optimization tasks that do not have an explicit analytical interpretation often involves searching for quasi-optimal parameters, based either on numerical estimates of the gradient of the quality functional or on stochastic methods of directed search [2, 3]. This section describes a stochastic optimization method based on the "particle swarm" method, applied to the formulated dimensionality reduction task.

The original idea of the "particle swarm" method was proposed and developed in [4, 5, 6]. The original algorithm was applied to model the social behavior of birds, bees and other animals characterized by spatial movement within large groups (swarms). Later, it was noted that this model can be used effectively to explore feature spaces, in particular to search for quasi-optimal solutions of multidimensional optimization tasks. Swarm optimization algorithms can be classed as evolutionary, and their general scheme for enumerating solutions is described by the following sequence of steps:
1. A population of "individuals" is generated, each of which contains some random solution of the target task. The search for a solution is an iterative process (steps 2 and 3). In each iteration (epoch), the solutions (positions) of all individuals are slightly modified according to rules that ensure the convergence of the iterative process to quasi-optimal solutions.
2. The direction of the change in the position of each individual is calculated, depending on its current position, the best solution of the given individual throughout its "existence" (local extremum) and the best known solution obtained by any individual of the entire population (global extremum).
3. New positions of the individuals (their coordinates) are computed in accordance with the directions obtained in step 2.
4. The stopping criteria are checked; if they are not met, go to step 2, otherwise the search stops.

The balance between local and global trends in the behavior of individuals is determined by coefficients interpreted as the acceleration of the movement of individuals towards the local and global extrema. The original formulation of the task assumed a real-valued solution space, making it impossible to use the method in linear programming tasks, in particular combinatorial optimization tasks. A number of authors proposed adaptations of the method to such tasks [7], in which the "accelerations" received a probabilistic interpretation.

Let us consider in more detail the modified version of the algorithm described in [7], which can be used to find component subsets that are quasi-optimal in terms of the quality functional. The search for optimal combinations of parameters can be formulated as the task of "packing a multidimensional knapsack" in which each item is selected at most once.
Its solution can be represented as a binary vector P of length k, in which the components with the numbers of the selected features are equal to 1 and all others are equal to 0. That is, a 1 in the i-th position of P indicates that the i-th component enters the subset selected for clustering.

The population consists of d individuals. Each individual stores its current solution, its best solution, and two vectors that determine the inversion frequency of each of the k bits in each of the two possible directions: one vector holds the probability of replacing 1 with 0 in a given position for the i-th individual, and the other holds the probability of replacing 0 with 1. These values change at each iteration, as do the positions of the individuals.

For each individual at each iteration, the probability of inversion of each bit is determined from the current value of that bit according to rule (5), which selects between the two vectors.

The updated values of the two vectors, given by expressions (6) and (7), depend on their current values, the best local solution, the best global solution and the coefficient of inertia.
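A single update of one individual can be sketched as follows. Since the expressions are not legible in this copy, the sketch follows the binary swarm scheme commonly attributed to [7], as far as it can be recalled here: two tendency vectors per individual, terms that reward agreement with the best local and best global solutions, and a sigmoid used to turn a tendency into an inversion probability. The function name, the parameter names `w`, `c1`, `c2` and their default values are assumptions, not values from the paper.

```python
import math
import random


def update_individual(x, v_to1, v_to0, p_best, g_best,
                      w=0.7, c1=0.5, c2=0.5, rng=random):
    """Return (new_x, new_v_to1, new_v_to0) for one individual.

    x       -- current binary solution (list of 0/1)
    v_to1   -- per-bit tendency to change 0 into 1
    v_to0   -- per-bit tendency to change 1 into 0
    p_best  -- best solution found by this individual
    g_best  -- best solution found by the whole population
    """
    r1, r2 = rng.random(), rng.random()  # random scaling values for this iteration
    new_x, new_v_to1, new_v_to0 = [], [], []
    for e, bit in enumerate(x):
        # Positive when the best solution has 1 in this position, negative otherwise.
        local = c1 * r1 if p_best[e] == 1 else -c1 * r1
        glob = c2 * r2 if g_best[e] == 1 else -c2 * r2
        to1 = w * v_to1[e] + local + glob
        to0 = w * v_to0[e] - local - glob
        # Only the tendency that can change the current bit is relevant.
        tendency = to1 if bit == 0 else to0
        probability = 1.0 / (1.0 + math.exp(-tendency))  # assumed normalization
        flipped = rng.random() < probability
        new_x.append(1 - bit if flipped else bit)
        new_v_to1.append(to1)
        new_v_to0.append(to0)
    return new_x, new_v_to1, new_v_to0


rng = random.Random(0)
x, v1, v0 = update_individual([0, 1, 0, 1], [0.0] * 4, [0.0] * 4,
                              p_best=[1, 1, 0, 0], g_best=[1, 1, 0, 1], rng=rng)
```

A bit that already agrees with both best solutions receives a negative tendency and is therefore inverted with probability below one half, while a bit that disagrees with both is inverted with probability above one half.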
To calculate the local and global terms in (6) and (7), two scaling factors are used, whose values lie within [0, 1]; they set the balance between local and global trends in the resulting position. In addition, at each iteration another pair of scaling values is randomly generated within [0, 1]. The final values are formed according to rules (8)–(11), which depend on the corresponding bits of the best local and best global solutions.

In effect, the probability of inversion of each vector component decreases if its value coincides with the corresponding value of the known optimal solution, and increases otherwise. To calculate the new position of each individual i, the components e of the corresponding solution are inverted with these probabilities: each probability is compared with a random value RND_ie generated for that comparison. The normalizing condition (12) is applied to the probability vector, and the rule determining the new positions is given by (13), under which a bit is either inverted or left unchanged.

The algorithm above corresponds to the one described previously. Obviously, after a certain number of steps such a procedure significantly reduces the probability of new solutions appearing in the population, i.e. the algorithm becomes "stuck" at the local extremum it has found. To overcome the problem of local extrema, it is proposed to introduce a procedure that randomizes positions based on a "swarm density" criterion. Density is understood as a generalized assessment of the deviation of each individual's position from the best known solution. Using a function that returns the Hamming distance between binary vectors a and b, the swarm density is estimated by expression (14), a sum over the individuals, where k is the length of the binary solution vector and d is the number of individuals in the population. The range of values of the density corresponds to the interval [0, 1], with a value of one indicating that the current positions of all individuals coincide with the best known
solution found by the algorithm. The density is estimated at the end of each iteration. If the predetermined density threshold is exceeded, it is proposed to randomize the positions of a certain percentage of individuals and the corresponding acceleration vectors, while preserving information about their best local solutions. This approach automatically leads the numerical procedure out of local extrema without losing the general search direction established in the optimization process. Lower threshold values lead to a less intensive search for a solution in the area of the current extremum, and vice versa.

7. EVALUATION OF EFFICIENCY UPON CLUSTERING OF NATURAL LANGUAGE DATA
The proposed concept was tested on natural language data on air incidents taken from the open database of the "Aviation Safety Network" [8]. The initial data contained descriptions of the incidents in English and the categorical variables associated with these incidents (damage level, vessel type, flight phase, etc.).

Incident descriptions were parameterized using word2vec technology [9]. For each word in a description, the corresponding vector was taken from the freely distributed dictionary trained on text data from the Google News aggregator. Each observation was then associated with a 300-dimensional numerical vector obtained as the component-wise arithmetic mean of the word vectors making up the description. These vectors were considered integral estimates of the description semantics that have no explicit interpretation in the context of the specified categorical variables.
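The averaging step can be illustrated with a toy dictionary. This is a sketch under stated assumptions: the function name `describe`, the lowercase whitespace tokenization and the decision to skip words missing from the dictionary are all hypothetical, and the 3-component vectors stand in for the 300-component word2vec vectors.

```python
def describe(text, word_vectors):
    """Component-wise mean of the vectors of the known words in `text`.

    Returns None when no word of the text is present in the dictionary.
    """
    tokens = text.lower().split()
    vectors = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vectors:
        return None
    dimension = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dimension)]


# Toy stand-in for a pretrained dictionary (real vectors have 300 components).
toy_vectors = {
    "engine": [1.0, 0.0, 2.0],
    "fire": [0.0, 1.0, 4.0],
}
vector = describe("Engine fire reported", toy_vectors)
```

Here "reported" is absent from the toy dictionary and is ignored, so the result is the mean of the two known word vectors.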
To test the proposed algorithm, a computational experiment was conducted. Its purpose was to reduce the dimension of the multidimensional vector descriptions while maximizing their discriminating power with respect to the level of damage resulting from the incident. The following damage levels were considered:
Serious: the aircraft was significantly damaged as a result of the incident;
Minor: the aircraft suffered minor damage as a result of the incident;
None: the aircraft was not damaged as a result of the incident;
Missing: as a result of the incident, the aircraft was completely lost and its fate is unknown.

The sample consisted of 600 observations, of which 300 were used as a training sample and 300 as a control sample. Division was performed for 4, 8, and 12 clusters, using k-means clustering and self-organizing Kohonen feature maps as clustering algorithms when calculating the quality functional. The results of this study are presented graphically in Table 1.
Table 1. Visualization of the purity of cluster structures obtained by reducing the dimensionality of text descriptions, using 3 damage levels (3 types) and 4 damage levels (4 types), with k-means and self-organizing Kohonen feature maps (SOM).
Columns: k-means (3 types), SOM (3 types), k-means (4 types), SOM (4 types).

In the clustering results obtained with 4 levels of the categorical variable (damage level), cases with minor damage (Minor, blue sector) and with serious damage (Serious, red sector) are regularly combined into one cluster. In fact, these descriptions are quite complete. Using t-SNE, a non-linear dimensionality reduction method, to visualize the location of the multidimensional (300-component) observation points in 3-dimensional space shows that the points with the corresponding levels of damage are weakly distinguishable (Fig. 6).

Figure 6. Distribution of Minor and Serious observations in the 3-dimensional space

When the Minor and Serious categories are aggregated into one Damaged class (Table 1, columns 1 and 2), the separation result improves significantly. In terms of class definition
error in the control sample, the result is approximately 12% when partitioning into 4 clusters and approximately 8% when partitioning into 6 and 8 clusters. The results depend only weakly on the chosen clustering method; this conclusion may be specific to this particular task. The dimensionality of the resulting feature space in each of the cases presented in Table 1 ranged from one third to one tenth of the original number of components.

8. CONCLUSIONS
A new method for dimensionality reduction of data was developed and tested. It provides solutions that are quasi-optimal from the point of view of discrimination of the given classes. The method can be used in combination with text parameterization technology to process natural language records in arbitrary application areas.
1. The result of the proposed algorithm is not only a combination of the initial features, but also the coordinates of the cluster centers that combine observations in the space found (or the trained Kohonen network, if it is chosen as the clustering method).
2. The proposed method does not require preliminary assumptions regarding the type of initial distribution of the observed features.
3. The described reduction procedure was prototyped, confirming its practical applicability.
4. The efficiency of the proposed technology was evaluated on example tasks with a real-valued source space. The method can be extended to any initial space in which points can be clustered using the k-means clustering method.
5. A multicomponent quality functional was formulated that makes it possible to control the process of dimensionality reduction in the feature space and to shape various characteristics of the resulting space.
6. For combinatorial optimization using the "particle swarm" method, a criterion was proposed for detecting that the algorithm is "stuck" in the region of a local extremum. A procedure was described that leads the algorithm out of this region and is triggered by the result of the criterion test.

ACKNOWLEDGEMENTS
This work was financially supported by the Ministry of Education and Science of the Russian Federation within the framework of the Subsidy Agreement of September 26, 2017 No. 14.576.21.0092 (unique identifier of the agreement RFMEFI57617X0092) for the implementation of applied scientific research on the topic "Development of a neural network forecasting system for aviation incidents and safety risk management based on historical data including many parameters and text descriptions of events".

REFERENCES
[1] Narayana, S. Cluster Purity Visualizer. https://bl.ocks.org/nswamy14/e28ec2c438e9e8bd302f
[2] Kuravsky, L. S., Marmalyuk, P. A., Yuryev, G. A., and Dumin, P. N. A Numerical Technique for the Identification of Discrete-State Continuous-Time Markov Models. Applied Mathematical Sciences, 9(8), 2015, pp. 379–391. https://dx.doi.org/10.12988/ams.2015.410882
[3] Kuravsky, L. S., Marmalyuk, P. A., Yuryev, G. A., Belyaeva, O. B., and Prokopieva, O. Yu. Mathematical Foundations of Flight Crew Diagnostics Based on Videooculography Data. Applied Mathematical Sciences, 10(30), 2016, pp. 1449–1466. https://dx.doi.org/10.12988/ams.2016.6122
[4] Kennedy, J., and Eberhart, R. Swarm Intelligence. Morgan Kaufmann Publishers, Inc., San Francisco, CA, 2001.
[5] Kennedy, J., and Eberhart, R. Particle Swarm Optimization. IEEE International Conference on Neural Networks (Perth, Australia), IEEE Service Center, Piscataway, NJ, IV, 1995.
[6] Eberhart, R., and Kennedy, J. A New Optimizer Using Particles Swarm Theory. Proc. Sixth International Symposium on MicroMachine and Human Science (Nagoya, Japan), IEEE Service Center, Piscataway, NJ, 1995.
[7] Khanesar, M. A., Tavakoli, H., Teshnehlab, M., and Shoorehdeli, M. A. Novel Binary Particle Swarm Optimization. In: Particle Swarm Optimization, Aleksandar Lazinica (Ed.), InTech, 2009. https://www.intechopen.com/books/particle_swarm_optimization/novel_binary_particle_swarm_optimization
[8] Aviation Safety Network. https://aviation-safety.net/database/
[9] Mikolov, T., Yih, W., and Zweig, G. Linguistic Regularities in Continuous Space Word Representations. Proceedings of NAACL HLT, 2013.