Sales, C. M. D., & Wakker, P. P. (2010). Combining metric and qualitative
approach in a measure of similarity for ill-structured data sets. Paper
presented at the XVII Meeting of the Portuguese Association of Data
Classification and Analysis JOCLAD, Lisbon.




Combining metric and qualitative approach in a measure of similarity
for ill-structured data sets
Célia M. D. Sales & Peter P. Wakker
JOCLAD 2010, Lisboa
Metric-Frequency Similarity Measure (MF)

   Sales, C. M. D., & Wakker, P. P. (2009). The metric-frequency measure of
    similarity for ill-structured data sets, with an application to family
    therapy. British Journal of Mathematical and Statistical Psychology, 62,
    663-682.

   Software for calculation available:
    http://people.few.eur.nl/wakker/miscella/mf.similarity.calculate/mfsimexplanation.htm

    2                     Sales & Wakker, JOCLAD 2010
The Team
       Célia M. D. Sales
           Dept. of Psychology and Sociology, Universidade Autónoma de
            Lisboa (CIP/UAL)
           Center for Social Research and Intervention (CIS-ICSTE/IUL)
       Peter P. Wakker
           Dept. of Economics, H13-27, Erasmus University, Rotterdam, The
            Netherlands

           Acknowledgements:
           Angela Fragoeiro
           Francisco Ortega Beviá
           Sónia Noronha

Outline

       Why develop the MF
       MF rationale
       MF formula explained
       Illustration in a real case
       Next steps




The challenge
       The development of the MF arose from a problem in
        psychotherapy research

           Individualized change measures
               Self-report measures
               The patient draws up a list of problems and rates how much each
                problem bothers him or her
               Each questionnaire is unique as it varies in the number of items and in
                their content


           Example: Zarastro



Mother’s complaints

1.       There were death threats to the mother.
2.       I had to leave the house many times.
3.       Family was very upset because of the constant threats to me.
4.       He was 16 years without relating to the outside, broke up with
         lifetime friends.
5.       In the family we had continuous fights and arguments.
6.       We called the police 5 times, in 2 years.
7.       We went to a series of private psychiatrists.
8.       He did very poorly in school.
9.       We had short quiet times, going back to the threats, creating distress.
10.      He was very strange and aggressive.

Zarastro’s Complaints

1.       I’m very shy.
2.       I don’t know how to keep a conversation going.
3.       It’s hard to maintain my few friendships.
4.       It bothers me to have eye-contact with people on the street.
5.       I feel that people are watching me.
6.       I’m not able to show my dislike or distress.
7.       I’m worried about not having a job or schooling/studies/education.
8.       I have a cold relationship with my younger brother.
9.       It’s hard to talk and show my complaints at home.
10.      I’m obsessed with the past.

His brother José
1.  My brother Z. has a distorted perception of the relationships at home.
2.  My brother Z. has difficulties relating to others.
3.  My brother Z. has been losing his friends.
4.  Lack of emotional communication.
5.  My brother Amadeus is often impolite towards Zarastro
6.  My brother Amadeus is very cold and reserved, doesn’t interact.
7.  My brother Amadeus has lost interest in Zarastro’s problem.
8.  My sister has a severe depression.
9.  It affects my job because it decreases my attention.
10. It affects my relationships with others.
11. I feel down when I have to take care of Zarastro.



His brother Amadeus
1.       I feel anxious sitting at the table between Zarastro and my mother.
2.       My mother makes her children depend on her.
3.       My mother wants her cubs surrounding her.
4.       Zarastro is watching too much T.V.
5.       We (extended family) share the same fate.




The challenge


    To what extent do family members have a similar perception of the
     problems in the family?




Ill-structured data set
        All family members have a different questionnaire
         corresponding to their personal view of the existing
         problems;
        There is no control over the content of the items that
         each family member can raise;
        There is no limit to the number of items that are
         conceivable;
        Each person is free to add new items or delete previous
         items in subsequent questionnaire administrations;
        Each item is rated on a Likert scale.

MF: The rationale
    Similarity must consider:
        numerical differences
        presence or absence of features (number of items raised)


    MF combines both metric and frequency components and
     is targeted towards situations in which the number of
     aspects is unpredictable.




MF: The rationale
    Amos Tversky (1977)
        Assessment of similarity between stimuli described as a
         comparison of features, rather than by the computation of
         metric distance between points that represent objects
        Tversky's formula applies to similarity measurements where
         items are either present or absent, but have no degrees of
         intensity
    The MF extends this approach to the case where intensities have
     also been measured
    The primary purpose of the MF is to be widely applicable,
     handling situations that are ill-structured and complex


   Imagine two members of a family (mother and father)
       Each indicated a number of "problems" (items), and scored how
        serious they feel these problems are (1-7 scale)
       Items not raised receive a score 0

                 FATHER            MOTHER
             Items    Scores    Items   Scores
               A        7         A       5
               B        6         B       6
               C        1         C       2
               D        1         D       1
               E        3         H       1
               F        2
               G        2

       j = number of ("joint") items raised by both (items A, B, C, D: j = 4)
       f = number of items raised by the father and not by the mother
           (items E, F, G: f = 3)
       m = number of items raised by the mother and not by the father
           (item H: m = 1)
       The total number of items is j + f + m (= 4 + 3 + 1 = 8).
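The counts j, f and m in the father/mother example can be obtained with simple set operations. The following is a minimal sketch, not the authors' software; the dictionary layout and the function name `item_counts` are assumptions made for illustration.

```python
# Scores on the 1-7 scale from the example; items not raised are simply absent.
father = {"A": 7, "B": 6, "C": 1, "D": 1, "E": 3, "F": 2, "G": 2}
mother = {"A": 5, "B": 6, "C": 2, "D": 1, "H": 1}


def item_counts(x, y):
    """Return (j, f, m): items raised by both, only by x, only by y."""
    joint = x.keys() & y.keys()    # raised by both
    only_x = x.keys() - y.keys()   # raised by x and not by y
    only_y = y.keys() - x.keys()   # raised by y and not by x
    return len(joint), len(only_x), len(only_y)


j, f, m = item_counts(father, mother)
print(j, f, m, j + f + m)  # 4 3 1 8
```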
MF: The formula explained




   Score similarity   |   Frequency similarity   |   MF overall similarity
Stage 1: Score Similarity
     Items   Normalized score   Normalized score   |diff|   Similarity
             of the father      of the mother               1 - |diff|
       A           1                 5/7            2/7        5/7
       B          6/7                6/7             0         7/7
       C          1/7                2/7            1/7        6/7
       D          1/7                1/7             0         7/7
       E          3/7                 0             3/7        4/7
       F          2/7                 0             2/7        5/7
       G          2/7                 0             2/7        5/7
       H           0                 1/7            1/7        6/7
       Sum                                                     45/7

$$\text{score similarity} = \frac{\sum (1 - |\text{diff}|)}{j + f + m} = \frac{45/7}{8} = \frac{45}{56} \approx 0.80$$
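The Stage 1 table can be reproduced with exact fractions. This is a minimal sketch under the assumptions of the example (scores divided by 7, items not raised scored 0); the function name `score_similarity` and the constant `MAX_SCORE` are illustrative, not taken from the authors' software.

```python
from fractions import Fraction

MAX_SCORE = 7  # top of the 1-7 scale used in the example

father = {"A": 7, "B": 6, "C": 1, "D": 1, "E": 3, "F": 2, "G": 2}
mother = {"A": 5, "B": 6, "C": 2, "D": 1, "H": 1}


def score_similarity(x, y, max_score=MAX_SCORE):
    """Average of 1 - |diff| over all j + f + m items raised by either person."""
    items = x.keys() | y.keys()  # assumes at least one item was raised
    total = Fraction(0)
    for item in items:
        # An item that was not raised receives the score 0.
        diff = abs(Fraction(x.get(item, 0), max_score)
                   - Fraction(y.get(item, 0), max_score))
        total += 1 - diff
    return total / len(items)


s = score_similarity(father, mother)
print(s, round(float(s), 4))  # 45/56 0.8036
```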
   Score similarity:

$$\frac{\sum (1 - |\text{diff}|)}{j + f + m}$$

   A 0-1 scaled similarity measure based only on the average differences
   between the scores of the father and the mother.
Stage 2: Frequency similarity

     Frequency similarity combines two components:

       Similarity based on the number of items raised jointly by the
       father and the mother

       (Dis)similarity based on the difference of the number of items
       raised by the father and the mother
Step 2.1 – Similarity based on the number
of items raised jointly
    Reflected by a number j/N
    N is a normalization factor that ensures that j/N, f/N, and
     m/N never exceed 1.
        N should be the same for all participants whose mutual
         similarity weights are calculated
        Thus, it should exceed the maximum number of items raised
         by any single participant in the group considered. For instance,
         it can be the maximum number of conceivable items.
        In our example, N = 20 has been chosen, so that j/N = 4/20 =
         0.2.




Step 2.1 – Similarity on the number of items
raised jointly
    Instead of the number j/N, we will use a transformation
               $\sqrt{j/N}$.
    The transformation is curved downwards (concave):
     similarity increases less for high values of j (and j/N) than
     for low values.
    Thus, an increase from j=0 to j=1 has more impact on
     mother and father similarity than an increase from j=17
     to j=18, which is plausible.
    In our example, the transformation yields
               $\sqrt{0.2} \approx 0.45$
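The effect of the concave transformation can be checked numerically. The sketch below assumes the transformation is the square root, which is consistent with the example value of 0.45 for j/N = 0.2; the function name `joint_term` is illustrative.

```python
import math

N = 20  # normalization factor chosen in the example


def joint_term(j, n=N):
    """Concave transformation of j/N (assumed here to be the square root)."""
    return math.sqrt(j / n)


print(round(joint_term(4), 2))  # 0.45

# Concavity: one extra joint item matters more when j is small.
gain_low = joint_term(1) - joint_term(0)     # from j=0 to j=1
gain_high = joint_term(18) - joint_term(17)  # from j=17 to j=18
print(gain_low > gain_high)  # True
```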
Step 2.2 – (Dis)similarity on the difference of
    the number of items
         FATHER           MOTHER
     Items    Scores   Items   Scores
       A        7        A       5
       B        6        B       6
       C        1        C       2
       D        1        D       1
       E        3        H       1
       F        2
       G        2

     f = number of items raised by the father and not by the mother
         (items E, F, G: f = 3)
     m = number of items raised by the mother and not by the father
         (item H: m = 1)

     $|f - m|$

     $|\sqrt{f/N} - \sqrt{m/N}|$

     $1 - |\sqrt{f/N} - \sqrt{m/N}|$
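A minimal sketch of the Step 2.2 term, assuming that the same square-root transformation as in Step 2.1 is applied to f/N and m/N before the difference is taken. The function name `count_difference_term` is illustrative.

```python
import math

N = 20  # normalization factor chosen in the example


def count_difference_term(f, m, n=N):
    """1 - |sqrt(f/N) - sqrt(m/N)|: equals 1 when f and m are the same."""
    return 1 - abs(math.sqrt(f / n) - math.sqrt(m / n))


# Father raised 3 items on his own, mother 1.
print(round(count_difference_term(3, 1), 2))  # 0.84
```

The term depends only on how many items each person raised alone, not on which items they were, so swapping f and m gives the same value.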
   Score similarity:

$$\frac{\sum (1 - |\text{diff}|)}{j + f + m}$$

   Frequency similarity:

$$\frac{\sqrt{j/N} + 1 - \left|\sqrt{f/N} - \sqrt{m/N}\right|}{2}$$
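The frequency similarity can be sketched as the midpoint of the two Stage 2 components. The square-root transformation is assumed, as suggested by the worked value in Step 2.1; the function name `frequency_similarity` is illustrative.

```python
import math


def frequency_similarity(j, f, m, n):
    """Midpoint of the joint-items term and the count-difference term."""
    joint_term = math.sqrt(j / n)
    difference_term = 1 - abs(math.sqrt(f / n) - math.sqrt(m / n))
    return (joint_term + difference_term) / 2


# Father/mother example: j = 4, f = 3, m = 1, N = 20.
print(round(frequency_similarity(j=4, f=3, m=1, n=20), 2))  # 0.64
```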
Stage 3: MF overall similarity



       (1 | diff |)
     ½                + ¼ + ¼(j/N)  ¼|(f/N)(m/N)|
        j f m


 The MF measure results as the half-half midpoint of the score-
 similarity and the frequency-similarity




23                      Sales & Wakker, JOCLAD 2010
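Putting the three stages together for the father/mother example gives the sketch below. It is an illustration under stated assumptions (square-root transformation, scores divided by 7, N = 20), not the authors' software; all function and constant names are hypothetical.

```python
import math

MAX_SCORE = 7  # top of the 1-7 rating scale in the example
N = 20         # normalization factor chosen in the example

father = {"A": 7, "B": 6, "C": 1, "D": 1, "E": 3, "F": 2, "G": 2}
mother = {"A": 5, "B": 6, "C": 2, "D": 1, "H": 1}


def score_similarity(x, y, max_score=MAX_SCORE):
    """Stage 1: average of 1 - |diff| over all items raised by either person."""
    items = x.keys() | y.keys()  # assumes at least one item was raised
    sims = [1 - abs(x.get(i, 0) - y.get(i, 0)) / max_score for i in items]
    return sum(sims) / len(sims)


def frequency_similarity(x, y, n=N):
    """Stage 2: midpoint of the joint-items term and the count-difference term."""
    j = len(x.keys() & y.keys())
    f = len(x.keys() - y.keys())
    m = len(y.keys() - x.keys())
    joint_term = math.sqrt(j / n)
    difference_term = 1 - abs(math.sqrt(f / n) - math.sqrt(m / n))
    return (joint_term + difference_term) / 2


def mf_similarity(x, y, max_score=MAX_SCORE, n=N):
    """Stage 3: midpoint of score similarity and frequency similarity."""
    return 0.5 * score_similarity(x, y, max_score) + 0.5 * frequency_similarity(x, y, n)


print(round(mf_similarity(father, mother), 2))  # 0.72
```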
Pre-Treatment

     Pair                    w       f     Overall
     Zarastro - Mother      0.45    0.58    0.51
     Zarastro - José        0.33    0.55    0.44
     Mother - José          0.33    0.52    0.43
     Zarastro - Amadeus     0.29    0.49    0.39
     Mother - Amadeus       0.37    0.49    0.43
     José - Amadeus         0.33    0.45    0.39

     [Figure: tree diagram, leaf order Zarastro, Mother, José, Amadeus;
      Zarastro and Mother share a branch]
     Stress formula 1 = 0.0328; stress formula 2 = 0.1054;
     r(monotonic) squared = 0.9889; r-squared (p.v.a.f.) = 0.9319

Post-Treatment

     Pair                    w       f     Overall   Change
     Zarastro - Mother      0.81    0.55    0.68     (0.17)
     Zarastro - José        0.81    0.54    0.68     (0.24)
     Mother - José          0.75    0.58    0.67     (0.24)
     Zarastro - Amadeus     0.71    0.49    0.60     (0.21)
     Mother - Amadeus       0.72    0.50    0.61     (0.18)
     José - Amadeus         0.70    0.44    0.57     (0.18)

     [Figure: tree diagram, leaf order Zarastro, José, Mother, Amadeus;
      Zarastro and José share a branch]
     Stress formula 1 = 0.0000; stress formula 2 = 0.0000;
     r(monotonic) squared = 1.0000; r-squared (p.v.a.f.) = 0.9944

w: the score similarity; f: the frequency similarity; Overall: the overall similarity; Change (within
parentheses): the pre-post change, given by the difference in overall similarity between the two times.
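The reported values can be cross-checked against the midpoint rule and the definition of the pre-post change. The sketch below only re-uses the numbers reported above; the dictionary layout and the helper `max_midpoint_gap` are assumptions made for illustration.

```python
# (w, f, overall) per pair, copied from the pre- and post-treatment results.
pre = {
    ("Zarastro", "Mother"): (0.45, 0.58, 0.51),
    ("Zarastro", "José"): (0.33, 0.55, 0.44),
    ("Mother", "José"): (0.33, 0.52, 0.43),
    ("Zarastro", "Amadeus"): (0.29, 0.49, 0.39),
    ("Mother", "Amadeus"): (0.37, 0.49, 0.43),
    ("José", "Amadeus"): (0.33, 0.45, 0.39),
}
post = {
    ("Zarastro", "Mother"): (0.81, 0.55, 0.68),
    ("Zarastro", "José"): (0.81, 0.54, 0.68),
    ("Mother", "José"): (0.75, 0.58, 0.67),
    ("Zarastro", "Amadeus"): (0.71, 0.49, 0.60),
    ("Mother", "Amadeus"): (0.72, 0.50, 0.61),
    ("José", "Amadeus"): (0.70, 0.44, 0.57),
}


def max_midpoint_gap(table):
    """Largest |(w + f)/2 - overall|; two-decimal rounding allows about 0.005."""
    return max(abs((w + f) / 2 - overall) for w, f, overall in table.values())


# Pre-post change: difference in overall similarity between the two times.
changes = {pair: round(post[pair][2] - pre[pair][2], 2) for pair in pre}

print(max_midpoint_gap(pre) < 0.006, max_midpoint_gap(post) < 0.006)  # True True
print(changes[("Zarastro", "Mother")])  # 0.17
```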
Conclusion
    The MF is pragmatic and easily applicable to data sets
     with little structure
    In particular, there is no need to anticipate which variables
     will be observed, or how many, and the variables may be
     metric or qualitative




Next Steps
    New software for MF calculation, available on-line

    Applying MF in psychotherapy research:

        Comparing alternative data entering (categorized vs. raw items)

        Implementing the MF in software for tracking patient progress
         (Family Therapy and Group Therapy)

    Applying MF in other fields:

        Comparing agreements in open-ended judgments




More Related Content

More from Célia M. D. Sales

More from Célia M. D. Sales (17)

Qui quadrado
Qui quadradoQui quadrado
Qui quadrado
 
Anova spss
Anova spssAnova spss
Anova spss
 
Anova a 1 factor
Anova a 1 factorAnova a 1 factor
Anova a 1 factor
 
Teste t student
Teste t studentTeste t student
Teste t student
 
Testes hipoteses introducao
Testes hipoteses introducaoTestes hipoteses introducao
Testes hipoteses introducao
 
Testes hipot parametricos_pressupostos
Testes hipot parametricos_pressupostosTestes hipot parametricos_pressupostos
Testes hipot parametricos_pressupostos
 
Distrib probab
Distrib probabDistrib probab
Distrib probab
 
Estatistica descritivaunivariada
Estatistica descritivaunivariadaEstatistica descritivaunivariada
Estatistica descritivaunivariada
 
Definicao estatistica
Definicao estatisticaDefinicao estatistica
Definicao estatistica
 
Questionar 2010
Questionar 2010Questionar 2010
Questionar 2010
 
Da populacao a amostra
Da populacao a amostraDa populacao a amostra
Da populacao a amostra
 
Desenhos Ex Post Facto 2010
Desenhos Ex Post Facto 2010Desenhos Ex Post Facto 2010
Desenhos Ex Post Facto 2010
 
Desenhos Experimentais (MIP 6)
Desenhos Experimentais (MIP 6)Desenhos Experimentais (MIP 6)
Desenhos Experimentais (MIP 6)
 
Causalidade Aleatorizacao Validade Interna (MIP 5)
Causalidade Aleatorizacao Validade Interna (MIP 5)Causalidade Aleatorizacao Validade Interna (MIP 5)
Causalidade Aleatorizacao Validade Interna (MIP 5)
 
Principios Eticos Publicacao Apa (MIP 4)
Principios Eticos Publicacao Apa (MIP 4)Principios Eticos Publicacao Apa (MIP 4)
Principios Eticos Publicacao Apa (MIP 4)
 
Apa Artigo Empirico 2010 (MIP 2)
Apa Artigo Empirico 2010 (MIP 2)Apa Artigo Empirico 2010 (MIP 2)
Apa Artigo Empirico 2010 (MIP 2)
 
Delimitacao Tema Investigacao (MIP 1)
Delimitacao Tema Investigacao (MIP 1)Delimitacao Tema Investigacao (MIP 1)
Delimitacao Tema Investigacao (MIP 1)
 

Recently uploaded

ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptxmary850239
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxVanesaIglesias10
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxlancelewisportillo
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxMusic 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxleah joy valeriano
 
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...JojoEDelaCruz
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 

Recently uploaded (20)

ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxMusic 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
 
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
ENG 5 Q4 WEEk 1 DAY 1 Restate sentences heard in one’s own words. Use appropr...
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 

Combining metric and qualitative approach in a measure of similarity for ill-structured data sets

  • 1. Sales, C. M. D., & Wakker, P. P. (2010). Combining metric and qualitative approach in a measure of similarity for ill-structured data sets. Paper presented at the XVII Meeting of the Portuguese Association of Data Classification and Analysis JOCLAD, Lisbon. Combining metric and qualitative approach in a measure of similarity for ill-structured data sets Célia M. D. Sales & Peter P. Wakker JOCLAD 2010, Lisboa
  • 2. Metric-Frequency Similarity Measure (MF)  Sales, C.M.D. & Wakker, P. P. (2009). The metric- frequency measure of similarity for ill-structured data sets, with an application to family therapy. British Journal of Mathematical and Statistical Psychology, 62, 663-682.  Software for calculation available: http://people.few.eur.nl/wakker/miscella/mf.similarity.cal culate/mfsimexplanation.htm 2 Sales & Wakker, JOCLAD 2010
  • 3. The Team  Célia M. D. Sales  Dept. of Psychology and Sociology, Universidade Autónoma de Lisboa (CIP/UAL)  Center for Social Research and Intervention (CIS-ICSTE/IUL)  Peter P. Wakker  Dept. of Economics, H13-27, Erasmus University, Rotterdam, The Netherlands  Acknowledgements:  Angela Fragoeiro  Francisco Ortega Beviá  Sónia Noronha 3 Sales & Wakker, JOCLAD 2010
  • 4. Outline  Why developing MF  MF rationale  MF formula explained  Illustration in a real case  Next steps 4 Sales & Wakker, JOCLAD 2010
  • 5. The challenge  The development of the MF arose from a problem in psychotherapy research  Individualized change measures  Self-report measures  Patient elaborates a list of problems and rates how much each problem bothers him  Each questionnaire is unique as it varies in the number of items and in their content  Example: Zarastro 5 Sales & Wakker, JOCLAD 2010
  • 6. Mother’s complaints 1. There were death threats to the mother. 2. I had to leave the house many times. 3. Family was very upset because of the constant threats to me. 4. He was 16 years without relating to the outside, broke up with lifetime friends. 5. In the family we had continuous fights and arguments. 6. We called the police 5 times, in 2 years. 7. We went to a series of private psychiatrists. 8. He did very poorly in school. 9. We had short quiet times, going back to the threats, creating distress. 10. He was very strange and aggressive. 6 Sales & Wakker, JOCLAD 2010
  • 7. Zarastro’s Complaints 1. I’m very shy. 2. I don’t know how to keep a conversation going. 3. It’s hard to maintain my few friendships. 4. It bothers me to have eye-contact with people on the street. 5. I feel that people are watching me. 6. I’m not able to show my dislike or distress. 7. I’m worried about not having a job or schooling/studies/education. 8. I have a cold relationship with my younger brother. 9. It’s hard to talk and show my complaints at home. 10. I’m obsessed with the past. 7 Sales & Wakker, JOCLAD 2010
  • 8. His brother José 1. My brother Z. has a distorted perception of the relationships at home. 2. My brother Z. has difficulties relating to others. 3. My brother Z. has been losing his friends. 4. Lack of emotional communication. 5. My brother Amadeus is often impolite towards Zarastro 6. My brother Amadeus is very cold and reserved, doesn’t interact. 7. My brother Amadeus has lost interest in Zarastro’s problem. 8. My sister has a severe depression. 9. It affects my job because it decreases my attention. 10. It affects my relationships with others. 11. I feel down when I have to take care of Zarastro. 8 Sales & Wakker, JOCLAD 2010
  • 9. His brother Amadeus 1. I feel anxious sitting at the table between Zarastro and my mother. 2. My mother makes her children depend on her. 3. My mother wants her cubs surrounding her. 4. Zarastro is watching too much T.V. 5. We (extended family) share the same fate. 9 Sales & Wakker, JOCLAD 2010
  • 10. The challenge  To what extent members have a similar perception of the problems in the family? 10 Sales & Wakker, JOCLAD 2010
  • 11. ill-structured data set  All family members have a different questionnaire corresponding to their personal view of the existing problems;  There is no control over the content of the items that each family member can raise;  There is no limit to the number of items that are conceivable;  Each person is free to add new items or delete previous items in subsequent questionnaire administration;  Each item is weighted in a Likert scale. 11 Sales & Wakker, JOCLAD 2010
  • 12. MF: The rational  Similarity must consider:  numerical differences  presence or absence of features (number of items raised)  MF combines both metric and frequency components and is targeted towards situations in which the number of aspects is unpredictable. 12 Sales & Wakker, JOCLAD 2010
  • 13. MF: The rational  Amos Tversky (1977)  Assessment of similarity between stimuli described as a comparison of features, rather than by the computation of metric distance between points that represent objects  Tversky's formula applies to similarity measurements where items are either present or absent, but have no degrees of intensity  MF is an extension to the case where also intensities have been measured  The primary purpose of the MF is to be widely applicable to handle situations that are ill structured and complex 13 Sales & Wakker, JOCLAD 2010
  • 14. Imagine two members of a family (mother and father)  Each indicated a number of "problems" (items), j = number of (“joint”) and scored how serious they feel these items raised by both (items problems are (1-7 scale) A,B,C,D= 4);  Items not raised receive a score 0 f = number of items raised by the father and not by the mother (items E,F,G=3); FATHER MOTHER Items Scores Items Scores m = number of items raised A 7 A 5 by the mother and not by B 6 B 6 the father(item H=1) C 1 C 2 D 1 D 1 The total number of items E 3 H 1 is j + f + m (= 4 + 3 + 1 = F 2 8). G 2 1 14 Sales & Wakker, JOCLAD 2010
  • 15. MF: The formula explained
      ◦ Score similarity
      ◦ Frequency similarity
      ◦ MF overall similarity
  • 16. Stage 1: Score similarity

        Item   Normalized score   Normalized score   |diff|   Similarity
               of the father      of the mother               1 - |diff|
          A        7/7                5/7             2/7        5/7
          B        6/7                6/7             0          7/7
          C        1/7                2/7             1/7        6/7
          D        1/7                1/7             0          7/7
          E        3/7                0               3/7        4/7
          F        2/7                0               2/7        5/7
          G        2/7                0               2/7        5/7
          H        0                  1/7             1/7        6/7
                                                      Sum:      45/7

      ◦ Score similarity: $\frac{\sum (1 - |\mathrm{diff}|)}{j + f + m}$
  • 17. Score similarity: $\frac{\sum (1 - |\mathrm{diff}|)}{j + f + m}$
      ◦ A 0-1 scaled similarity measure based only on the average differences of the scores of the father and the mother.
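The Stage 1 arithmetic can be checked with a short sketch. It assumes, as in the example, that scores are normalized by the scale maximum of 7 and that an item a person did not raise counts as 0. The function name `score_similarity` and the dictionary layout are illustrative choices, not part of the authors' software.

```python
from fractions import Fraction

SCALE_MAX = 7  # top of the 1-7 rating scale used in the example

# Raw scores from the father/mother example; items not raised are simply absent.
father = {"A": 7, "B": 6, "C": 1, "D": 1, "E": 3, "F": 2, "G": 2}
mother = {"A": 5, "B": 6, "C": 2, "D": 1, "H": 1}


def score_similarity(x, y, scale_max=SCALE_MAX):
    """Average of 1 - |diff| over all items raised by either person."""
    items = sorted(set(x) | set(y))  # j + f + m items in total
    total = Fraction(0)
    for item in items:
        # An item that a person did not raise counts as score 0.
        diff = Fraction(abs(x.get(item, 0) - y.get(item, 0)), scale_max)
        total += 1 - diff
    return total / len(items)


result = score_similarity(father, mother)
print(result, float(result))  # 45/56, about 0.804
```

Exact fractions are used so that the intermediate sum 45/7 from the table is reproduced without rounding; dividing it by the 8 items gives 45/56.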
  • 18. Stage 2: Frequency similarity
      ◦ Similarity based on the number of items raised jointly by the father and the mother
      ◦ (Dis)similarity based on the difference in the number of items raised by the father and the mother
  • 19. Step 2.1 – Similarity based on the number of items raised jointly
      ◦ Reflected by a number j/N.
      ◦ N is a normalization factor that ensures that j/N, f/N, and m/N never exceed 1.
      ◦ N should be the same for all participants whose mutual similarity weights are calculated.
      ◦ Thus, it should exceed the maximum number of items raised by any single participant in the group considered. For instance, it can be the maximum number of conceivable items.
      ◦ In our example, N = 20 has been chosen, so that j/N = 4/20 = 0.2.
  • 20. Step 2.1 – Similarity based on the number of items raised jointly
      ◦ Instead of the number j/N, we will use the transformation $\sqrt{j/N}$.
      ◦ The transformation is curved downwards (concave): similarity increases less for high values of j (and j/N) than for low values.
      ◦ Thus, an increase from j = 0 to j = 1 has more impact on mother-father similarity than an increase from j = 17 to j = 18, which is plausible.
      ◦ In our example, the transformation yields $\sqrt{0.2} \approx 0.45$.
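The concavity argument can be made concrete with a few lines. This sketch assumes the transformation is the square root of j/N, which matches the slide's value of 0.45 for j/N = 0.2; `joint_term` is a hypothetical helper name.

```python
import math

N = 20  # normalization factor chosen in the example


def joint_term(j, n=N):
    """Square-root transform of the share of jointly raised items."""
    return math.sqrt(j / n)


gain_low = joint_term(1) - joint_term(0)     # going from j = 0 to j = 1
gain_high = joint_term(18) - joint_term(17)  # going from j = 17 to j = 18

print(round(joint_term(4), 2))                  # 0.45, as in the example
print(round(gain_low, 3), round(gain_high, 3))  # 0.224 0.027
```

The first shared item adds roughly eight times as much similarity as the eighteenth, which is the behaviour the slide describes as plausible.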
  • 21. Step 2.2 – (Dis)similarity based on the difference in the number of items

            FATHER              MOTHER
        Items  Scores       Items  Scores
          A      7            A      5
          B      6            B      6
          C      1            C      2
          D      1            D      1
          E      3            H      1
          F      2
          G      2

      ◦ f = number of items raised by the father and not by the mother (items E, F, G = 3)
      ◦ m = number of items raised by the mother and not by the father (item H = 1)
      ◦ From $|f - m|$ to $|\sqrt{f/N} - \sqrt{m/N}|$ to $1 - |\sqrt{f/N} - \sqrt{m/N}|$
  • 22. Score similarity, frequency similarity, MF overall similarity
      ◦ Score similarity: $\frac{\sum (1 - |\mathrm{diff}|)}{j + f + m}$
      ◦ Frequency similarity: $\frac{\sqrt{j/N} + 1 - |\sqrt{f/N} - \sqrt{m/N}|}{2}$
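As a rough check of Stage 2, the frequency similarity for the example counts (j = 4, f = 3, m = 1, N = 20) can be computed as below. The sketch assumes the square-root transformation from Step 2.1 applies to f/N and m/N as well; the function name and the validity check on N are illustrative additions.

```python
import math


def frequency_similarity(j, f, m, n):
    """Average of the joint-items term and 1 minus the count-imbalance term."""
    # N must be at least the number of items raised by either person.
    if n <= 0 or max(j + f, j + m) > n:
        raise ValueError("n must cover the items raised by each person")
    joint = math.sqrt(j / n)
    imbalance = abs(math.sqrt(f / n) - math.sqrt(m / n))
    return (joint + 1 - imbalance) / 2


value = frequency_similarity(j=4, f=3, m=1, n=20)
print(round(value, 3))  # 0.642
```

Two people who each raise all N conceivable items, and the same ones, reach the maximum of 1; two people with no joint items and equal counts of unique items get 0.5.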
  • 23. Stage 3: MF overall similarity
      ◦ $MF = \frac{1}{2} \cdot \frac{\sum (1 - |\mathrm{diff}|)}{j + f + m} + \frac{1}{4} + \frac{1}{4}\sqrt{j/N} - \frac{1}{4}|\sqrt{f/N} - \sqrt{m/N}|$
      ◦ The MF measure is the midpoint (equal-weight average) of the score similarity and the frequency similarity.
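Putting the three stages together, a minimal sketch of the overall measure for the father/mother example might look as follows. It assumes scores normalized by the scale maximum of 7, N = 20, and the square-root transformation of the frequency terms; `mf_similarity` and its parameters are hypothetical names and do not reproduce the authors' published software.

```python
import math


def mf_similarity(x, y, n, scale_max=7):
    """Midpoint of score similarity and frequency similarity for two raters."""
    items = set(x) | set(y)
    if not items:
        raise ValueError("at least one item must be raised")
    j = len(set(x) & set(y))  # items raised by both
    f = len(set(x) - set(y))  # items raised only by the first rater
    m = len(set(y) - set(x))  # items raised only by the second rater

    # Stage 1: average of 1 - |diff| on normalized scores (missing item = 0).
    score_sim = sum(
        1 - abs(x.get(i, 0) - y.get(i, 0)) / scale_max for i in items
    ) / len(items)

    # Stages 2 and 3: half of the frequency similarity, written out term by term.
    freq_half = (
        0.25
        + 0.25 * math.sqrt(j / n)
        - 0.25 * abs(math.sqrt(f / n) - math.sqrt(m / n))
    )
    return 0.5 * score_sim + freq_half


father = {"A": 7, "B": 6, "C": 1, "D": 1, "E": 3, "F": 2, "G": 2}
mother = {"A": 5, "B": 6, "C": 2, "D": 1, "H": 1}

mf = mf_similarity(father, mother, n=20)
print(round(mf, 3))  # 0.723
```

For these inputs the score similarity is 45/56 (about 0.804) and the frequency similarity is about 0.642, so their midpoint is about 0.723.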
  • 24. Illustration in a real case: pairwise similarities before and after treatment

        Pre-Treatment
        Pair                   w      f      MF
        Mother - Zarastro     0.45   0.58   0.51
        José - Zarastro       0.33   0.55   0.44
        José - Mother         0.33   0.52   0.43
        Amadeus - Zarastro    0.29   0.49   0.39
        Amadeus - Mother      0.37   0.49   0.43
        Amadeus - José        0.33   0.45   0.39

        [Tree diagram of Zarastro, Mother, José, and Amadeus. Fit: stress formula 1 = 0.0328; stress formula 2 = 0.1054; r(monotonic) squared = 0.9889; r-squared (p.v.a.f.) = 0.9319]

        Post-Treatment
        Pair                   w      f      MF     Change
        Mother - Zarastro     0.81   0.55   0.68   (0.17)
        José - Zarastro       0.81   0.54   0.68   (0.24)
        José - Mother         0.75   0.58   0.67   (0.24)
        Amadeus - Zarastro    0.71   0.49   0.60   (0.21)
        Amadeus - Mother      0.72   0.50   0.61   (0.18)
        Amadeus - José        0.70   0.44   0.57   (0.18)

        [Tree diagram of Zarastro, José, Mother, and Amadeus. Fit: stress formula 1 = 0.0000; stress formula 2 = 0.0000; r(monotonic) squared = 1.0000; r-squared (p.v.a.f.) = 0.9944]

      ◦ w: the score similarity; f: the frequency similarity; MF: the overall similarity (printed in bold on the slide); within parentheses: the pre-post change, given by the difference in overall similarity between those two times.
  • 25. Conclusion
      ◦ The MF is pragmatic and easily applicable to data sets with little structure.
      ◦ In particular, it need not be anticipated which variables, or how many, will be observed, and the variables may be metric or qualitative.
  • 26. Next Steps
      ◦ New software for MF calculation, available online
      ◦ Applying MF in psychotherapy research:
          ◦ Comparing alternative ways of entering data (categorized vs. raw items)
          ◦ Implementing the MF in software for tracking patient progress (Family Therapy and Group Therapy)
      ◦ Applying MF in other fields:
          ◦ Comparing agreement in open-ended judgments