SlideShare a Scribd company logo
1 of 5
Download to read offline
International Journal on Bioinformatics & Biosciences (IJBB) Vol.3, No.1, March 2013


            Information-Theory Analysis of Cell
          Characteristics in Breast Cancer Patients

                                             David Blokh
                                  C. D. Technologies Ltd., Israel
                                     david_blokh@012.net.il

Abstract
A problem of selecting a subset of parameters containing a maximum amount of information on all
parameters of a given set is considered. The proposed method of selection is based on the information-
theory analysis and rank statistics. The uncertainty coefficient (normalized mutual information) is used as
a measure of information about one parameter contained in another parameter. The most informative
characteristics are selected from the set of cytological characteristics of breast cancer patients.

Keywords
Normalized mutual information, Newman-Keuls test, Cytological characteristics, Breast cancer

1. Introduction
The selection of a subgroup of parameters containing a maximum amount of information on all
parameters of a given group of parameters is an important problem of medical informatics. This
problem has applications in analyzing oncology patients data [1], mathematical modeling of
tumor growth [2] and developing intellectual medical systems [3, 4].
In the present article, a subgroup of parameters describing cytological characteristics of breast
cancer patients and containing the maximum amount of information on all parameters of this
group is selected.
This selection is required for constructing models of cancer diagnosis and prognosis.
The problem under consideration is rather difficult. First, this group includes both discrete and
continuous parameters; second, the distributions of continuous parameters are non-Gaussian and,
third, the interrelations between parameters are nonlinear.
Any method of selecting information-intensive parameters is based on the use of a measure of
parameters correlation. In the majority of methods, a correlation coefficient is used as such a
measure. However, the application of the correlation coefficient suggests that parameters
distributions are Gaussian, and the correlations of parameters are linear. Therefore, in the present
article, the uncertainty coefficient (normalized mutual information) is used as a measure of
parameters correlation. This coefficient evaluates nonlinear correlations between parameters with
arbitrary distributions.
Thus, this article presents a method of selecting the most informative parameters, which has no
restrictions on the distribution of the parameters and on the correlations between the parameters.
Application of the uncertainty coefficient has made it possible to obtain interesting results in
medicine [5] and, in particular, in oncology [6, 7, 8]. The approach proposed in [6] is
presented in monograph [9].




DOI : 10.5121/ijbb.2013.3101                                                                             1
International Journal on Bioinformatics & Biosciences (IJBB) Vol.3, No.1, March 2013

2. Materials and Methods
2.1. Materials
To illustrate the method, we used data from the Wisconsin Breast Cancer Database [10]. These
data cover 663 patients, each patient being presented by 10 cytological parameters, 9 of which
are continuous (clump thickness, uniformity of cell size, uniformity of cell shape, marginal
adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli, and mitoses)
and one is discrete (Class).
2.2. Data Analysis
2.2.1. Statement of the problem
Assume that the initial data on n objects are presented in the form of a           n × m array   [akj ] ,
where each row k is an object described by m parameters. It is needed to find a parameter or a
subgroup of parameters containing the greatest amount of information about all m parameters.
2.2.2. Description of the algorithm
The algorithm of selecting a subgroup of the most informative parameters from the entire group
of parameters includes four procedures. A short description of each procedure is as follows. A
more complete description of the application of information-theory analysis to the selection
problem is presented in [11].
1. Discretization.
This procedure transforms parameters having continuous values into parameters having discrete
values.
If an acceptable method of a continuous parameter discretization is unavailable, use a formal
approach to the discretization [12].
2. Construction of the uncertainty coefficient matrix.
For i-th and j-th parameters 1 ≤ i,     j ≤ m,   calculate the uncertainty coefficient     cij   [5] and

construct m × m uncertainty coefficient matrix [c ] .
                                                 ij
3. Construction of the rank matrix.
For each column of the matrix      [cij ] , we rank its elements and assign rank 1 to the smallest
element of the column. We obtain the matrix      m × m of ranks [rij ] , where each column of the
matrix contains ranks from 1 to m.
We estimate the amount of information about all m parameters contained in the i-th parameter
by the sum of all the entries of the i-th row of the matrix   [rij ] .
4. Application of the multiple comparison method.

Apply the multiple comparison method to the sums of             [rij ]   matrix rows [13]. This gives a
clustering of parameters that contains the desired subgroup of parameters.
Remark. Programs for the algorithm of uncertainty coefficient calculation and the Friedman test
are implemented in the package SPSS [14].

                                                                                                       2
International Journal on Bioinformatics & Biosciences (IJBB) Vol.3, No.1, March 2013

3. Results
We consecutively perform the four procedures of the selection algorithm.
1. Categories of parameters contained in the Wisconsin Breast Cancer Database are given below.
        Parameter name                                           Category
        1. Clump Thickness:                                     1,2,3,4,5,6,7,8,9,10.
        2. Uniformity of Cell Size:                             1,2,3,4,5,6,7,8,9,10.
        3. Uniformity of Cell Shape:                            1,2,3,4,5,6,7,8,9,10.
        4. Marginal Adhesion:                                   1,2,3,4,5,6,7,8,9,10.
        5. Single epithelial cell size                          1,2,3,4,5,6,7,8,9,10.
        6. Bare Nuclei:                                         1,2,3,4,5,6,7,8,9,10.
        7. Bland Chromatin:                                     1,2,3,4,5,6,7,8,9,10.
        8. Normal Nucleoli:                                     1,2,3,4,5,6,7,8,9,10.
        9. Mitose:                                              1,2,3,4,5,6,7,8,9,10.
       10. Class:                                                   2-benign, 4-malignant
2. Calculate the uncertainty coefficientscij 1 ≤ i, j ≤ 10 for the parameters of the Wisconsin
Breast Cancer Database and construct the [cij ] matrix of uncertainty coefficients (Table 1).

                          Table 1. Uncertainty coefficient matrix                  [cij ] .
  No       Parameter name             1          2       3      4          5      6      7           8      9      10
  1    Clump thickness                1          .222    .194   .144       .170   .205   .136        .180   .171   .439
  2    Uniformity of cell size        .181       1       .406   .263       .305   .319   .243        .301   .251   .676
  3    Uniformity of cell shape       .165       .419    1      .238       .250   .282   .211        .266   .209   .556
  4    Marginal adhesion              .105       .232    .205   1          .211   .239   .198        .196   .175   .382
  5    Single epithelial cell size    .135       .295    .235   .231       1      .240   .191        .256   .255   .503
  6    Bare nuclei                    .137       .259    .223   .219       .202   1      .197        .222   .154   .491
  7    Bland chromatin                .123       .267    .226   .247       .218   .268   1           .245   .155   .499
  8    Normal nucleoli                .121       .245    .211   .181       .216   .223   .181        1      .180   .386
  9    Mitoses                        .071       .125    .103   .100       .133   .096   .071        .111   1      .226
  10   Class                          .180       .339    .269   .215       .259   .301   .225        .236   .223   1

3. Rank the   [cij ] matrix (Table 1) columns and obtain a rank matrix [rij ] (Table 2).
                                       Table 2. Rank matrix            [rij ] .
  No       Parameter name             1      2      3     4     5      6     7     8     9      10     Sum of ranks
  1     Clump thickness               10      2      2     2     2      2     2     2     3      4          31
  2     Uniformity of cell size         9    10      9     9     9      9     9     9     8      9          90
  3     Uniformity of cell shape        7     9     10     7     7      7     7     8     6      8          76
  4     Marginal adhesion               2     3      3    10     4      4     6     3     4      2          41
  5     Single epithelial cell size     5     7      7     6    10      5     4     7     9      7          67
  6     Bare nuclei                     6     5      5     5     3     10     5     4     1      5          49
  7     Bland chromatin                 4     6      6     8     6      6    10     6     2      6          60
  8     Normal nucleoli                 3     4      4     3     5      3     3    10     5      3          43
  9     Mitoses                         1     1      1     1     1      1     1     1    10      1          19
  10    Class                           8     8      8     4     8      8     8     5     7     10          74
Consider Table 2 as the Friedman statistical model [15] and examine the row effect of this table.


                                                                                                                          3
International Journal on Bioinformatics & Biosciences (IJBB) Vol.3, No.1, March 2013

Hypotheses
H0: There is no row effect (“null hypothesis”).
H1: The null hypothesis is invalid.

Critical range. The sample is “large”; therefore, the critical range is the upper 1%-range of   9
                                                                                                 2

distribution.

Calculation of the    2 -criterion [15] gives  2 = 48.48.
The critical range is   9
                         2
                             > 21.67. Since 48.48 > 21.67, the null hypothesis with respect to Table
2 is rejected. Thus, according to the Friedman test, the row effect exists. Hence, there is a
difference between the rows under consideration.

4. For multiple comparisons, we use the Newman-Keuls test [13]. We obtain R − R      > 8.15 ,
                                                                           j    j +1

where Rj and Rj+1 are the j-th and (j+1)-th elements of “Sum of ranks” column of Table 2.
Using the multiple comparisons method, we construct the parameter clustering shown in Table 3.

                                    Table 3. Parameter Clustering.

                No    Cluster              Parameter name                Sum of ranks
                1    Cluster 1      Uniformity of cell size                 90
                2    Cluster 2      Uniformity of cell shape                76
                3                   Class "benign" or "malignant"           74
                4                   Single epithelial cell size             67
                5                   Bland chromatin                         60
                6    Cluster 3      Bare nuclei                             49
                7                   Normal nucleoli                         43
                8                   Marginal adhesion                       41
                9    Cluster 4      Clump thickness                         31
                10   Cluster 5      Mitoses                                 19
The obtained clustering has the following properties:
For two neighboring clusters of Table 3, the smallest element of one cluster and the greatest
element of another cluster located nearby are significantly different (  T =0.01);

Elements belonging to the same cluster do not differ from each other (  T =0.01);

The differences between Cluster 1 (Uniformity of cell size) and all the characteristics are
statistically significant (  T =0.01).

The characteristic "Uniformity of cell size" contains the greatest amount of information about all
the characteristics under study. It corresponds to the thesis that the uniformity of cell size in
unicellular organisms and within tissues in multicellular organisms could be necessary to
maintain the homeostasis of transcription and proliferation [16]. Thus, an adequate result has
been obtained using the proposed algorithm.
We believe that the processing other biomedical data using the proposed algorithm will also give
adequate results.


                                                                                                  4
International Journal on Bioinformatics & Biosciences (IJBB) Vol.3, No.1, March 2013


4. Conclusions
In the present paper, a method of selecting parameters based on the assessment of correlations
between single parameters is considered. However, problems arising in medical practice often
require the selection of parameters taking into account correlations between groups of
parameters. The development of methods of selecting informative parameters taking into account
not only correlations between single parameters, but also correlations between groups of
parameters, should be a subject of further research.
References
   1.  G. S. Atwal, R. Rabadan, G. Lozano et al., (2008) “An Information-Theoretic Analysis of
       Genetics, Gender and Age in Cancer Patients”, PLoS ONE, vol. 3, no. 4, 7 p.
   2. R. Molina-Pena, M. M. Alvarez, (2012) “A Simple Mathematical Model Based on the Cancer
       Stem Cell Hypothesis Suggests Kinetic Commonalities in Solid Tumor Growth”, PLoS ONE,
       vol. 7, no. 2, 11 p.
   3. S. M. Anzar, P. S. Sathidevi, (2012) “An Efficient PSO Optimized Integration Weight Estimation
       Using D-Prime Statistics for a Multibiometric System”, International Journal on Bioinformatics
       & Biosciences, vol. 2, no. 3, pp. 31-42.
   4. P. Preckova, J. Zvarova and K. Zvara, (2012) “Measuring diversity in medical reports based on
       categorized attributes and international classification systems”, BMC Medical Informatics and
       Decision Making, vol. 12, 11 p.
   5. J. Zvarova, M. Studeny, (1997) “Information theoretical approach to constitution and reduction of
       medical data”, International Journal of Medical Informatics, vol. 45, no. 1-2, pp. 65-74.
   6. D. Blokh, I. Stambler, E. Afrimzon, et al., (2007) “The information-theory analysis of Michaelis-
       Menten constants for detection of breast cancer”, Cancer Detection and Prevention, vol. 31, no.
       6, pp. 489-498.
   7. D. Blokh, N. Zurgil, I. Stambler, et al., (2008) “An information-theoretical model for breast
       cancer detection”, Methods of Information in Medicine, vol. 47, no. 4, pp. 322-327.
   8. D. Blokh, I. Stambler, E. Afrimzon, et al., (2009) “Comparative analysis of cell parameter groups
       for breast cancer detection”, Computer Methods and Programs in Biomedicine, vol. 94, no. 3, pp.
       239-249.
   9. P. J. Gutierrez Diez, I.H. Russo, J. Russo, (2012) The Evolution of the Use of Mathematics in
       Cancer Research, Springer, New York.
   10. W.H. Wolberg, O.L. Mangasarian, (1990) “Multisurface method of pattern separation for medical
       diagnosis applied to breast cytology”, Proceedings of the National Academy of Sciences, Vol. 87,
       No. 23, pp. 9193-9196.
   11. D. Blokh, (2012) “Clustering financial time series via information-theory analysis and rank
       statistics”, Journal of Pattern Recognition Research, vol. 7, no. 1, pp. 106-115.
   12. G.V. Glass, J.C. Stanley, (1970) Statistical Methods in Education and Psychology, Prentice-Hall,
       New Jersey.
   13. S.A. Glantz, (1994) Primer of Biostatistics, 4th ed., McGraw-Hill, New York.
   14. A. Buhl, P. Zofel, (2001) SPSS Version 10, Addison-Wesley, Munchen.
   15. W.J. Conover, (1999) Practical Nonparametric Statistics. Wiley-Interscience, New York.
   16. C.-Y. Wu, P.A. Rolfe, D.K. Gifford, G.R. Fink, (2010) “Control of Transcription by Cell Size”,
       PLoS Biology, Vol. 8, No. 11, 16 p.




                                                                                                     5

More Related Content

Viewers also liked

Topic Maps: Theory & Practice
Topic Maps: Theory & PracticeTopic Maps: Theory & Practice
Topic Maps: Theory & Practicebbater
 
Type Theory and Practical Application
Type Theory and Practical ApplicationType Theory and Practical Application
Type Theory and Practical ApplicationJack Fox
 
Topic Top 5 presentation- Uncertainty Reduction Theory
Topic Top 5 presentation- Uncertainty Reduction TheoryTopic Top 5 presentation- Uncertainty Reduction Theory
Topic Top 5 presentation- Uncertainty Reduction Theoryannemarieyoung3
 
Data Mining - Classification Of Breast Cancer Dataset using Decision Tree Ind...
Data Mining - Classification Of Breast Cancer Dataset using Decision Tree Ind...Data Mining - Classification Of Breast Cancer Dataset using Decision Tree Ind...
Data Mining - Classification Of Breast Cancer Dataset using Decision Tree Ind...Sunil Nair
 
Herzberg’s two factor theory
Herzberg’s two factor theoryHerzberg’s two factor theory
Herzberg’s two factor theoryRashid Mwinyi
 
Theories and Organizations
Theories and OrganizationsTheories and Organizations
Theories and Organizationsguestcc1ebaf
 
X and y theory presentation
X and y theory presentationX and y theory presentation
X and y theory presentationmikurem8166
 
Chapter 2 The Evolution Of Management Theory
Chapter 2   The Evolution Of Management TheoryChapter 2   The Evolution Of Management Theory
Chapter 2 The Evolution Of Management Theorymanagement 2
 
Theories of Motivation
Theories of MotivationTheories of Motivation
Theories of Motivationmasumi kadali
 
Organization theories
Organization theoriesOrganization theories
Organization theoriesSSBinny
 

Viewers also liked (11)

Topic Maps: Theory & Practice
Topic Maps: Theory & PracticeTopic Maps: Theory & Practice
Topic Maps: Theory & Practice
 
Type Theory and Practical Application
Type Theory and Practical ApplicationType Theory and Practical Application
Type Theory and Practical Application
 
Topic Top 5 presentation- Uncertainty Reduction Theory
Topic Top 5 presentation- Uncertainty Reduction TheoryTopic Top 5 presentation- Uncertainty Reduction Theory
Topic Top 5 presentation- Uncertainty Reduction Theory
 
Data Mining - Classification Of Breast Cancer Dataset using Decision Tree Ind...
Data Mining - Classification Of Breast Cancer Dataset using Decision Tree Ind...Data Mining - Classification Of Breast Cancer Dataset using Decision Tree Ind...
Data Mining - Classification Of Breast Cancer Dataset using Decision Tree Ind...
 
Herzberg’s two factor theory
Herzberg’s two factor theoryHerzberg’s two factor theory
Herzberg’s two factor theory
 
Theories and Organizations
Theories and OrganizationsTheories and Organizations
Theories and Organizations
 
Systems theory
Systems theorySystems theory
Systems theory
 
X and y theory presentation
X and y theory presentationX and y theory presentation
X and y theory presentation
 
Chapter 2 The Evolution Of Management Theory
Chapter 2   The Evolution Of Management TheoryChapter 2   The Evolution Of Management Theory
Chapter 2 The Evolution Of Management Theory
 
Theories of Motivation
Theories of MotivationTheories of Motivation
Theories of Motivation
 
Organization theories
Organization theoriesOrganization theories
Organization theories
 

Similar to Information-Theory Analysis of Cell Characteristics in Breast Cancer Patients

IRJET- Diagnosis of Breast Cancer using Decision Tree Models and SVM
IRJET-  	  Diagnosis of Breast Cancer using Decision Tree Models and SVMIRJET-  	  Diagnosis of Breast Cancer using Decision Tree Models and SVM
IRJET- Diagnosis of Breast Cancer using Decision Tree Models and SVMIRJET Journal
 
ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...
 ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO... ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...
ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...cscpconf
 
Using Artificial Neural Networks to Detect Multiple Cancers from a Blood Test
Using Artificial Neural Networks to Detect Multiple Cancers from a Blood TestUsing Artificial Neural Networks to Detect Multiple Cancers from a Blood Test
Using Artificial Neural Networks to Detect Multiple Cancers from a Blood TestStevenQu1
 
Design of an Intelligent System for Improving Classification of Cancer Diseases
Design of an Intelligent System for Improving Classification of Cancer DiseasesDesign of an Intelligent System for Improving Classification of Cancer Diseases
Design of an Intelligent System for Improving Classification of Cancer DiseasesMohamed Loey
 
Classification of squamous cell cervical cytology
Classification of squamous cell cervical cytologyClassification of squamous cell cervical cytology
Classification of squamous cell cervical cytologykarthigailakshmi
 
APLICACIONES DE ESPACIOS Y SUBESPACIOS VECTORIALES EN BIOTECNOLOGÍA
APLICACIONES DE ESPACIOS Y SUBESPACIOS VECTORIALES EN BIOTECNOLOGÍAAPLICACIONES DE ESPACIOS Y SUBESPACIOS VECTORIALES EN BIOTECNOLOGÍA
APLICACIONES DE ESPACIOS Y SUBESPACIOS VECTORIALES EN BIOTECNOLOGÍACORAIMAEDITHTORRESGU
 
Detection of Cancer in Pap smear Cytological Images Using Bag of Texture Feat...
Detection of Cancer in Pap smear Cytological Images Using Bag of Texture Feat...Detection of Cancer in Pap smear Cytological Images Using Bag of Texture Feat...
Detection of Cancer in Pap smear Cytological Images Using Bag of Texture Feat...IOSR Journals
 
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...IRJET Journal
 
Cotton Grading - approaches for the textile value chain - POLAND 2016
Cotton Grading - approaches for the textile value chain - POLAND 2016Cotton Grading - approaches for the textile value chain - POLAND 2016
Cotton Grading - approaches for the textile value chain - POLAND 2016Debashish Banerjee
 
Detection of Breast Cancer using BPN Classifier in Mammograms
Detection of Breast Cancer using BPN Classifier in MammogramsDetection of Breast Cancer using BPN Classifier in Mammograms
Detection of Breast Cancer using BPN Classifier in MammogramsIRJET Journal
 
Microarray Data Classification Using Support Vector Machine
Microarray Data Classification Using Support Vector MachineMicroarray Data Classification Using Support Vector Machine
Microarray Data Classification Using Support Vector MachineCSCJournals
 
Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE method...
Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE method...Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE method...
Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE method...IOSRjournaljce
 
25 17 dec16 13743 28032-1-sm(edit)
25 17 dec16 13743 28032-1-sm(edit)25 17 dec16 13743 28032-1-sm(edit)
25 17 dec16 13743 28032-1-sm(edit)IAESIJEECS
 
poster_final version
poster_final versionposter_final version
poster_final versionYang Wu
 
Cancer cell segmentation and detection using nc ratio
Cancer cell segmentation and detection using nc ratioCancer cell segmentation and detection using nc ratio
Cancer cell segmentation and detection using nc ratioeSAT Publishing House
 
IRJET- A Feature Selection Framework for DNA Methylation Analysis in Predicti...
IRJET- A Feature Selection Framework for DNA Methylation Analysis in Predicti...IRJET- A Feature Selection Framework for DNA Methylation Analysis in Predicti...
IRJET- A Feature Selection Framework for DNA Methylation Analysis in Predicti...IRJET Journal
 
My own Machine Learning project - Breast Cancer Prediction
My own Machine Learning project - Breast Cancer PredictionMy own Machine Learning project - Breast Cancer Prediction
My own Machine Learning project - Breast Cancer PredictionGabriele Mineo
 

Similar to Information-Theory Analysis of Cell Characteristics in Breast Cancer Patients (20)

IRJET- Diagnosis of Breast Cancer using Decision Tree Models and SVM
IRJET-  	  Diagnosis of Breast Cancer using Decision Tree Models and SVMIRJET-  	  Diagnosis of Breast Cancer using Decision Tree Models and SVM
IRJET- Diagnosis of Breast Cancer using Decision Tree Models and SVM
 
ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...
 ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO... ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...
ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...
 
sec dc.pptx
sec dc.pptxsec dc.pptx
sec dc.pptx
 
Using Artificial Neural Networks to Detect Multiple Cancers from a Blood Test
Using Artificial Neural Networks to Detect Multiple Cancers from a Blood TestUsing Artificial Neural Networks to Detect Multiple Cancers from a Blood Test
Using Artificial Neural Networks to Detect Multiple Cancers from a Blood Test
 
Design of an Intelligent System for Improving Classification of Cancer Diseases
Design of an Intelligent System for Improving Classification of Cancer DiseasesDesign of an Intelligent System for Improving Classification of Cancer Diseases
Design of an Intelligent System for Improving Classification of Cancer Diseases
 
Classification of squamous cell cervical cytology
Classification of squamous cell cervical cytologyClassification of squamous cell cervical cytology
Classification of squamous cell cervical cytology
 
APLICACIONES DE ESPACIOS Y SUBESPACIOS VECTORIALES EN BIOTECNOLOGÍA
APLICACIONES DE ESPACIOS Y SUBESPACIOS VECTORIALES EN BIOTECNOLOGÍAAPLICACIONES DE ESPACIOS Y SUBESPACIOS VECTORIALES EN BIOTECNOLOGÍA
APLICACIONES DE ESPACIOS Y SUBESPACIOS VECTORIALES EN BIOTECNOLOGÍA
 
Detection of Cancer in Pap smear Cytological Images Using Bag of Texture Feat...
Detection of Cancer in Pap smear Cytological Images Using Bag of Texture Feat...Detection of Cancer in Pap smear Cytological Images Using Bag of Texture Feat...
Detection of Cancer in Pap smear Cytological Images Using Bag of Texture Feat...
 
A01110107
A01110107A01110107
A01110107
 
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...
 
Cotton Grading - approaches for the textile value chain - POLAND 2016
Cotton Grading - approaches for the textile value chain - POLAND 2016Cotton Grading - approaches for the textile value chain - POLAND 2016
Cotton Grading - approaches for the textile value chain - POLAND 2016
 
Detection of Breast Cancer using BPN Classifier in Mammograms
Detection of Breast Cancer using BPN Classifier in MammogramsDetection of Breast Cancer using BPN Classifier in Mammograms
Detection of Breast Cancer using BPN Classifier in Mammograms
 
Microarray Data Classification Using Support Vector Machine
Microarray Data Classification Using Support Vector MachineMicroarray Data Classification Using Support Vector Machine
Microarray Data Classification Using Support Vector Machine
 
Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE method...
Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE method...Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE method...
Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE method...
 
25 17 dec16 13743 28032-1-sm(edit)
25 17 dec16 13743 28032-1-sm(edit)25 17 dec16 13743 28032-1-sm(edit)
25 17 dec16 13743 28032-1-sm(edit)
 
Az4102375381
Az4102375381Az4102375381
Az4102375381
 
poster_final version
poster_final versionposter_final version
poster_final version
 
Cancer cell segmentation and detection using nc ratio
Cancer cell segmentation and detection using nc ratioCancer cell segmentation and detection using nc ratio
Cancer cell segmentation and detection using nc ratio
 
IRJET- A Feature Selection Framework for DNA Methylation Analysis in Predicti...
IRJET- A Feature Selection Framework for DNA Methylation Analysis in Predicti...IRJET- A Feature Selection Framework for DNA Methylation Analysis in Predicti...
IRJET- A Feature Selection Framework for DNA Methylation Analysis in Predicti...
 
My own Machine Learning project - Breast Cancer Prediction
My own Machine Learning project - Breast Cancer PredictionMy own Machine Learning project - Breast Cancer Prediction
My own Machine Learning project - Breast Cancer Prediction
 

Recently uploaded

Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 

Recently uploaded (20)

Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

Information-Theory Analysis of Cell Characteristics in Breast Cancer Patients

  • 1. International Journal on Bioinformatics & Biosciences (IJBB) Vol.3, No.1, March 2013 Information-Theory Analysis of Cell Characteristics in Breast Cancer Patients David Blokh C. D. Technologies Ltd., Israel david_blokh@012.net.il Abstract A problem of selecting a subset of parameters containing a maximum amount of information on all parameters of a given set is considered. The proposed method of selection is based on the information- theory analysis and rank statistics. The uncertainty coefficient (normalized mutual information) is used as a measure of information about one parameter contained in another parameter. The most informative characteristics are selected from the set of cytological characteristics of breast cancer patients. Keywords Normalized mutual information, Newman-Keuls test, Cytological characteristics, Breast cancer 1. Introduction The selection of a subgroup of parameters containing a maximum amount of information on all parameters of a given group of parameters is an important problem of medical informatics. This problem has applications in analyzing oncology patients data [1], mathematical modeling of tumor growth [2] and developing intellectual medical systems [3, 4]. In the present article, a subgroup of parameters describing cytological characteristics of breast cancer patients and containing the maximum amount of information on all parameters of this group is selected. This selection is required for constructing models of cancer diagnosis and prognosis. The problem under consideration is rather difficult. First, this group includes both discrete and continuous parameters; second, the distributions of continuous parameters are non-Gaussian and, third, the interrelations between parameters are nonlinear. Any method of selecting information-intensive parameters is based on the use of a measure of parameters correlation. In the majority of methods, a correlation coefficient is used as such a measure. However, the application of the correlation coefficient suggests that parameters distributions are Gaussian, and the correlations of parameters are linear. Therefore, in the present article, the uncertainty coefficient (normalized mutual information) is used as a measure of parameters correlation. This coefficient evaluates nonlinear correlations between parameters with arbitrary distributions. Thus, this article presents a method of selecting the most informative parameters, which has no restrictions on the distribution of the parameters and on the correlations between the parameters. Application of the uncertainty coefficient has made it possible to obtain interesting results in medicine [5] and, in particular, in oncology [6, 7, 8]. The approach proposed in [6] is presented in monograph [9]. DOI : 10.5121/ijbb.2013.3101 1
  • 2. International Journal on Bioinformatics & Biosciences (IJBB) Vol.3, No.1, March 2013 2. Materials and Methods 2.1. Materials To illustrate the method, we used data from the Wisconsin Breast Cancer Database [10]. These data cover 663 patients, each patient being presented by 10 cytological parameters, 9 of which are continuous (clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli, and mitoses) and one is discrete (Class). 2.2. Data Analysis 2.2.1. Statement of the problem Assume that the initial data on n objects are presented in the form of a n × m array [akj ] , where each row k is an object described by m parameters. It is needed to find a parameter or a subgroup of parameters containing the greatest amount of information about all m parameters. 2.2.2. Description of the algorithm The algorithm of selecting a subgroup of the most informative parameters from the entire group of parameters includes four procedures. A short description of each procedure is as follows. A more complete description of the application of information-theory analysis to the selection problem is presented in [11]. 1. Discretization. This procedure transforms parameters having continuous values into parameters having discrete values. If an acceptable method of a continuous parameter discretization is unavailable, use a formal approach to the discretization [12]. 2. Construction of the uncertainty coefficient matrix. For i-th and j-th parameters 1 ≤ i, j ≤ m, calculate the uncertainty coefficient cij [5] and construct m × m uncertainty coefficient matrix [c ] . ij 3. Construction of the rank matrix. For each column of the matrix [cij ] , we rank its elements and assign rank 1 to the smallest element of the column. We obtain the matrix m × m of ranks [rij ] , where each column of the matrix contains ranks from 1 to m. We estimate the amount of information about all m parameters contained in the i-th parameter by the sum of all the entries of the i-th row of the matrix [rij ] . 4. Application of the multiple comparison method. Apply the multiple comparison method to the sums of [rij ] matrix rows [13]. This gives a clustering of parameters that contains the desired subgroup of parameters. Remark. Programs for the algorithm of uncertainty coefficient calculation and the Friedman test are implemented in the package SPSS [14]. 2
  • 3. International Journal on Bioinformatics & Biosciences (IJBB) Vol.3, No.1, March 2013 3. Results We consecutively perform the four procedures of the selection algorithm. 1. Categories of parameters contained in the Wisconsin Breast Cancer Database are given below. Parameter name Category 1. Clump Thickness: 1,2,3,4,5,6,7,8,9,10. 2. Uniformity of Cell Size: 1,2,3,4,5,6,7,8,9,10. 3. Uniformity of Cell Shape: 1,2,3,4,5,6,7,8,9,10. 4. Marginal Adhesion: 1,2,3,4,5,6,7,8,9,10. 5. Single epithelial cell size 1,2,3,4,5,6,7,8,9,10. 6. Bare Nuclei: 1,2,3,4,5,6,7,8,9,10. 7. Bland Chromatin: 1,2,3,4,5,6,7,8,9,10. 8. Normal Nucleoli: 1,2,3,4,5,6,7,8,9,10. 9. Mitose: 1,2,3,4,5,6,7,8,9,10. 10. Class: 2-benign, 4-malignant 2. Calculate the uncertainty coefficientscij 1 ≤ i, j ≤ 10 for the parameters of the Wisconsin Breast Cancer Database and construct the [cij ] matrix of uncertainty coefficients (Table 1). Table 1. Uncertainty coefficient matrix [cij ] . No Parameter name 1 2 3 4 5 6 7 8 9 10 1 Clump thickness 1 .222 .194 .144 .170 .205 .136 .180 .171 .439 2 Uniformity of cell size .181 1 .406 .263 .305 .319 .243 .301 .251 .676 3 Uniformity of cell shape .165 .419 1 .238 .250 .282 .211 .266 .209 .556 4 Marginal adhesion .105 .232 .205 1 .211 .239 .198 .196 .175 .382 5 Single epithelial cell size .135 .295 .235 .231 1 .240 .191 .256 .255 .503 6 Bare nuclei .137 .259 .223 .219 .202 1 .197 .222 .154 .491 7 Bland chromatin .123 .267 .226 .247 .218 .268 1 .245 .155 .499 8 Normal nucleoli .121 .245 .211 .181 .216 .223 .181 1 .180 .386 9 Mitoses .071 .125 .103 .100 .133 .096 .071 .111 1 .226 10 Class .180 .339 .269 .215 .259 .301 .225 .236 .223 1 3. Rank the [cij ] matrix (Table 1) columns and obtain a rank matrix [rij ] (Table 2). Table 2. Rank matrix [rij ] . No Parameter name 1 2 3 4 5 6 7 8 9 10 Sum of ranks 1 Clump thickness 10 2 2 2 2 2 2 2 3 4 31 2 Uniformity of cell size 9 10 9 9 9 9 9 9 8 9 90 3 Uniformity of cell shape 7 9 10 7 7 7 7 8 6 8 76 4 Marginal adhesion 2 3 3 10 4 4 6 3 4 2 41 5 Single epithelial cell size 5 7 7 6 10 5 4 7 9 7 67 6 Bare nuclei 6 5 5 5 3 10 5 4 1 5 49 7 Bland chromatin 4 6 6 8 6 6 10 6 2 6 60 8 Normal nucleoli 3 4 4 3 5 3 3 10 5 3 43 9 Mitoses 1 1 1 1 1 1 1 1 10 1 19 10 Class 8 8 8 4 8 8 8 5 7 10 74 Consider Table 2 as the Friedman statistical model [15] and examine the row effect of this table. 3
  • 4. International Journal on Bioinformatics & Biosciences (IJBB) Vol.3, No.1, March 2013 Hypotheses H0: There is no row effect (“null hypothesis”). H1: The null hypothesis is invalid. Critical range. The sample is “large”; therefore, the critical range is the upper 1%-range of 9 2 distribution. Calculation of the  2 -criterion [15] gives  2 = 48.48. The critical range is 9 2 > 21.67. Since 48.48 > 21.67, the null hypothesis with respect to Table 2 is rejected. Thus, according to the Friedman test, the row effect exists. Hence, there is a difference between the rows under consideration. 4. For multiple comparisons, we use the Newman-Keuls test [13]. We obtain R − R > 8.15 , j j +1 where Rj and Rj+1 are the j-th and (j+1)-th elements of “Sum of ranks” column of Table 2. Using the multiple comparisons method, we construct the parameter clustering shown in Table 3. Table 3. Parameter Clustering. No Cluster Parameter name Sum of ranks 1 Cluster 1 Uniformity of cell size 90 2 Cluster 2 Uniformity of cell shape 76 3 Class "benign" or "malignant" 74 4 Single epithelial cell size 67 5 Bland chromatin 60 6 Cluster 3 Bare nuclei 49 7 Normal nucleoli 43 8 Marginal adhesion 41 9 Cluster 4 Clump thickness 31 10 Cluster 5 Mitoses 19 The obtained clustering has the following properties: For two neighboring clusters of Table 3, the smallest element of one cluster and the greatest element of another cluster located nearby are significantly different (  T =0.01); Elements belonging to the same cluster do not differ from each other (  T =0.01); The differences between Cluster 1 (Uniformity of cell size) and all the characteristics are statistically significant (  T =0.01). The characteristic "Uniformity of cell size" contains the greatest amount of information about all the characteristics under study. It corresponds to the thesis that the uniformity of cell size in unicellular organisms and within tissues in multicellular organisms could be necessary to maintain the homeostasis of transcription and proliferation [16]. Thus, an adequate result has been obtained using the proposed algorithm. We believe that the processing other biomedical data using the proposed algorithm will also give adequate results. 4
  • 5. International Journal on Bioinformatics & Biosciences (IJBB) Vol.3, No.1, March 2013 4. Conclusions In the present paper, a method of selecting parameters based on the assessment of correlations between single parameters is considered. However, problems arising in medical practice often require the selection of parameters taking into account correlations between groups of parameters. The development of methods of selecting informative parameters taking into account not only correlations between single parameters, but also correlations between groups of parameters, should be a subject of further research. References 1. G. S. Atwal, R. Rabadan, G. Lozano et al., (2008) “An Information-Theoretic Analysis of Genetics, Gender and Age in Cancer Patients”, PLoS ONE, vol. 3, no. 4, 7 p. 2. R. Molina-Pena, M. M. Alvarez, (2012) “A Simple Mathematical Model Based on the Cancer Stem Cell Hypothesis Suggests Kinetic Commonalities in Solid Tumor Growth”, PLoS ONE, vol. 7, no. 2, 11 p. 3. S. M. Anzar, P. S. Sathidevi, (2012) “An Efficient PSO Optimized Integration Weight Estimation Using D-Prime Statistics for a Multibiometric System”, International Journal on Bioinformatics & Biosciences, vol. 2, no. 3, pp. 31-42. 4. P. Preckova, J. Zvarova and K. Zvara, (2012) “Measuring diversity in medical reports based on categorized attributes and international classification systems”, BMC Medical Informatics and Decision Making, vol. 12, 11 p. 5. J. Zvarova, M. Studeny, (1997) “Information theoretical approach to constitution and reduction of medical data”, International Journal of Medical Informatics, vol. 45, no. 1-2, pp. 65-74. 6. D. Blokh, I. Stambler, E. Afrimzon, et al., (2007) “The information-theory analysis of Michaelis- Menten constants for detection of breast cancer”, Cancer Detection and Prevention, vol. 31, no. 6, pp. 489-498. 7. D. Blokh, N. Zurgil, I. Stambler, et al., (2008) “An information-theoretical model for breast cancer detection”, Methods of Information in Medicine, vol. 47, no. 4, pp. 322-327. 8. D. Blokh, I. Stambler, E. Afrimzon, et al., (2009) “Comparative analysis of cell parameter groups for breast cancer detection”, Computer Methods and Programs in Biomedicine, vol. 94, no. 3, pp. 239-249. 9. P. J. Gutierrez Diez, I.H. Russo, J. Russo, (2012) The Evolution of the Use of Mathematics in Cancer Research, Springer, New York. 10. W.H. Wolberg, O.L. Mangasarian, (1990) “Multisurface method of pattern separation for medical diagnosis applied to breast cytology”, Proceedings of the National Academy of Sciences, Vol. 87, No. 23, pp. 9193-9196. 11. D. Blokh, (2012) “Clustering financial time series via information-theory analysis and rank statistics”, Journal of Pattern Recognition Research, vol. 7, no. 1, pp. 106-115. 12. G.V. Glass, J.C. Stanley, (1970) Statistical Methods in Education and Psychology, Prentice-Hall, New Jersey. 13. S.A. Glantz, (1994) Primer of Biostatistics, 4th ed., McGraw-Hill, New York. 14. A. Buhl, P. Zofel, (2001) SPSS Version 10, Addison-Wesley, Munchen. 15. W.J. Conover, (1999) Practical Nonparametric Statistics. Wiley-Interscience, New York. 16. C.-Y. Wu, P.A. Rolfe, D.K. Gifford, G.R. Fink, (2010) “Control of Transcription by Cell Size”, PLoS Biology, Vol. 8, No. 11, 16 p. 5