Analysing Online Social Network Data with Biclustering and Triclustering
Dmitry Gnatyshak (1), Dmitry Ignatov (1), Jonas Poelmans (1, 2), Alexander Semenov (1)
(1) NRU HSE, Moscow, Russia
(2) KU Leuven, Belgium

BIR 2012, Nizhny Novgorod
Motivation I
• There is a large amount of network data that can
  be represented as bipartite and tripartite graphs
• Standard techniques like maximal biclique
  search result in a huge number of patterns (in the
  worst case exponential w.r.t. the input size)
• Therefore we need some relaxation of this notion
  and good measures of interestingness of biclique
  communities
Motivation II
• Applied lattice theory provides us with the notion of a formal concept, which is
  almost the same thing as a biclique
• L. C. Freeman, D. R. White. Using Galois Lattices to Represent Network
  Data. Sociological Methodology 23, 1993.
• Social Networks 18(3), 1996
    – L. C. Freeman. Cliques, Galois Lattices, and the Structure of Human Social
      Groups.
    – V. Duquenne. Lattice analysis and the representation of handicap
      associations.
    – D. R. White. Statistical entailments and the Galois lattice.
• J. W. Mohr, V. Duquenne. The duality of culture and practice: Poverty relief in
  New York City, 1888-1917. Theory and Society, 1997
• Camille Roth et al. Towards Concise Representation for Taxonomies of
  Epistemic Communities. CLA, 4th Intl Conf on Concept Lattices and their
  Applications, 2006
• And many other papers on applications of FCA to social network analysis
Motivation III
• A concept-based bicluster (Ignatov et al., 2010) is a
  scalable approximation of a formal concept
  (biclique)
   – Fewer patterns to analyze
   – Less computational time (polynomial vs. exponential)
   – Manual tuning of the bicluster (community) density
     threshold
   – Tolerance to missing (object, attribute) pairs
• For analyzing three-way network data like
  folksonomies we proposed triclustering
  (Ignatov et al., 2011)
Formal Concept Analysis
           [Wille, 1982; Ganter & Wille, 1999]

Definition 1. A formal context is a triple (G, M, I), where G is a
 set of (formal) objects, M is a set of (formal) attributes, and
 I ⊆ G × M is the incidence relation: (g, m) ∈ I (written gIm) means
 that object g ∈ G possesses attribute m ∈ M.

Example
             Car           House         Laptop        Bicycle
Kate         x                                         x
Mike         x                           x
Alex                       x             x
David                      x             x             x
Formal Concept Analysis
 Definition 2. Derivation operators (defining a Galois connection)
 A^I := { m ∈ M | gIm for all g ∈ A } is the set of attributes common to all
      objects in A
 B^I := { g ∈ G | gIm for all m ∈ B } is the set of objects that have all
      attributes from B

Example

         Car    House    Laptop    Bicycle
Kate     x                         x
Mike     x               x
Alex            x        x
David           x        x         x

{Kate, Mike}^I = {Car}
{Laptop}^I = {Mike, Alex, David}
{Car, House}^I = ∅ (the empty subset of G)
∅^I = M (for ∅ taken as a subset of G)
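The two derivation operators can be sketched in a few lines of Python. This is only an illustration on the example context; the dictionary layout and the function names `common_attributes` and `common_objects` are assumptions made for this sketch and do not come from the slides.

```python
# Example context from the slides: each object maps to the attributes it has.
CONTEXT = {
    "Kate": {"Car", "Bicycle"},
    "Mike": {"Car", "Laptop"},
    "Alex": {"House", "Laptop"},
    "David": {"House", "Laptop", "Bicycle"},
}
ALL_OBJECTS = set(CONTEXT)
ALL_ATTRIBUTES = set().union(*CONTEXT.values())


def common_attributes(objects):
    """A^I: attributes shared by every object in `objects`."""
    result = set(ALL_ATTRIBUTES)  # the empty object set yields all of M
    for g in objects:
        result &= CONTEXT[g]
    return result


def common_objects(attributes):
    """B^I: objects that have every attribute in `attributes`."""
    wanted = set(attributes)
    return {g for g in ALL_OBJECTS if wanted <= CONTEXT[g]}


print(sorted(common_attributes({"Kate", "Mike"})))
print(sorted(common_objects({"Laptop"})))
print(sorted(common_objects({"Car", "House"})))
print(sorted(common_attributes(set())))
```

The four printed results correspond to the four derivations listed in the example.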
Formal Concept Analysis
Definition 3. (A, B) is a formal concept of (G, M, I) iff
                    A ⊆ G, B ⊆ M, A^I = B, and B^I = A.
A is the extent and B is the intent of the concept (A, B).
𝔅(G, M, I) is the set of all concepts of the context (G, M, I).


Example

         Car    House    Laptop    Bicycle
Kate     x                         x
Mike     x               x
Alex            x        x
David           x        x         x

•   The pair ({Kate, Mike}, {Car}) is a formal concept
•   ({Alex, David}, {Laptop}) is not a formal concept, because
    {Laptop}^I ≠ {Alex, David}
•   ({Alex, David}, {House, Laptop}) is a formal concept
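Definition 3 translates directly into a check that both derivations agree. The sketch below is illustrative only; the helper names and the dictionary representation of the context are assumptions for this example.

```python
# Example context from the slides: each object maps to the attributes it has.
CONTEXT = {
    "Kate": {"Car", "Bicycle"},
    "Mike": {"Car", "Laptop"},
    "Alex": {"House", "Laptop"},
    "David": {"House", "Laptop", "Bicycle"},
}
ALL_ATTRIBUTES = set().union(*CONTEXT.values())


def common_attributes(objects):
    """A^I: attributes shared by every object in `objects`."""
    result = set(ALL_ATTRIBUTES)
    for g in objects:
        result &= CONTEXT[g]
    return result


def common_objects(attributes):
    """B^I: objects that have every attribute in `attributes`."""
    wanted = set(attributes)
    return {g for g, attrs in CONTEXT.items() if wanted <= attrs}


def is_formal_concept(extent, intent):
    """(A, B) is a formal concept iff A^I = B and B^I = A."""
    return (common_attributes(extent) == set(intent)
            and common_objects(intent) == set(extent))


print(is_formal_concept({"Kate", "Mike"}, {"Car"}))
print(is_formal_concept({"Alex", "David"}, {"Laptop"}))
print(is_formal_concept({"Alex", "David"}, {"House", "Laptop"}))
```

The second pair fails because Mike also has a Laptop, so the extent is not closed.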
FCA and Graphs
        a    b    c    d
Kate    x              x
Mike    x         x
Alex         x    x
David        x    x    x

   Formal context                       ↔  Bipartite graph
   Formal concept (maximal rectangle)   ↔  Biclique
Formal Concept Analysis
 Definition 4. A formal concept (A, B) is said to be more general than (C, D), that
    is (A, B) ≥ (C, D), iff A ⊇ C (equivalently D ⊇ B).
 The set of all concepts of the context (G, M, I) ordered by this relation forms a
   complete lattice 𝔅(G, M, I), called the concept lattice (Galois lattice).


Example

         Car    House    Laptop    Bicycle
Kate     x                         x
Mike     x               x
Alex            x        x
David           x        x         x

({Alex, David, Mike}, {Laptop}) is more general than the concept
({Alex, David}, {House, Laptop})
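For a context this small, all concepts can be enumerated by closing every attribute subset, and the generality order can be checked on extents. This is a naive sketch, exponential in the number of attributes and not one of the efficient lattice algorithms from the FCA literature; all names are assumptions for illustration.

```python
from itertools import combinations

# Example context from the slides: each object maps to the attributes it has.
CONTEXT = {
    "Kate": {"Car", "Bicycle"},
    "Mike": {"Car", "Laptop"},
    "Alex": {"House", "Laptop"},
    "David": {"House", "Laptop", "Bicycle"},
}
ATTRIBUTES = sorted(set().union(*CONTEXT.values()))


def common_objects(attributes):
    """B^I: objects that have every attribute in `attributes`."""
    wanted = set(attributes)
    return frozenset(g for g, attrs in CONTEXT.items() if wanted <= attrs)


def common_attributes(objects):
    """A^I: attributes shared by every object in `objects`."""
    result = set(ATTRIBUTES)
    for g in objects:
        result &= CONTEXT[g]
    return frozenset(result)


def all_concepts():
    """Close every attribute subset B into the concept (B^I, B^II)."""
    concepts = set()
    for size in range(len(ATTRIBUTES) + 1):
        for candidate in combinations(ATTRIBUTES, size):
            extent = common_objects(candidate)
            concepts.add((extent, common_attributes(extent)))
    return concepts


def is_more_general(first, second):
    """(A, B) >= (C, D) iff A is a superset of C."""
    return first[0] >= second[0]


concepts = all_concepts()
general = (frozenset({"Alex", "David", "Mike"}), frozenset({"Laptop"}))
specific = (frozenset({"Alex", "David"}), frozenset({"House", "Laptop"}))
print(len(concepts))
print(is_more_general(general, specific))
```

For the example context the enumeration yields nine concepts, including the top concept (G, ∅) and the bottom concept (∅, M).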
Concept Lattice Diagram




          a   b   c   d
1 Kate    x           x
2 Mike    x       x
3 Alex        x   x
4 David       x   x   x

a - Car, b - House, c - Laptop, d - Bicycle
Biclustering


Geometric interpretation
Biclustering Example
                           Car   House    Laptop Bicycle
                   Kate    x                       x
                   Mike    x              x
                   Alex          x        x
                   David         x        x        x



Since David has House, i.e. (David, House) ∈ I, this pair generates the bicluster
(House^I, David^I) = ({Alex, David}, {House, Laptop, Bicycle})
with density ρ(House^I, David^I) = 5/6
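The example can be reproduced with a short sketch. The density formula used here, the number of filled cells divided by the size of the rectangle, is an assumption inferred from the 5/6 value in the example, since the defining formula did not survive in the slide text; the function names are likewise hypothetical.

```python
from fractions import Fraction

# Example context from the slides: each object maps to the attributes it has.
CONTEXT = {
    "Kate": {"Car", "Bicycle"},
    "Mike": {"Car", "Laptop"},
    "Alex": {"House", "Laptop"},
    "David": {"House", "Laptop", "Bicycle"},
}


def bicluster(g, m):
    """Bicluster generated by the pair (g, m) in I: (m^I, g^I)."""
    if m not in CONTEXT[g]:
        raise ValueError(f"({g}, {m}) is not in the incidence relation")
    extent = {h for h, attrs in CONTEXT.items() if m in attrs}  # m^I
    intent = set(CONTEXT[g])                                    # g^I
    return extent, intent


def density(extent, intent):
    """Assumed definition: |I ∩ (extent × intent)| / (|extent| * |intent|)."""
    filled = sum(len(CONTEXT[g] & intent) for g in extent)
    return Fraction(filled, len(extent) * len(intent))


extent, intent = bicluster("David", "House")
print(sorted(extent), sorted(intent), density(extent, intent))
```

Only the cell (Alex, Bicycle) is empty in the 2 × 3 rectangle, which gives the density 5/6.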
Biclustering properties
• The number of biclusters for a context (G, M, I) is
  not greater than |I|, versus up to 2^min{|G|,|M|} formal
  concepts. Usually |I| ≪ 2^min{|G|,|M|}, especially
  for sparse contexts.
• Dense biclusters (ρ(bicluster) ≥ ρ_min) are probably
  a good representation of communities,
  because all users in the extent of every
  dense bicluster have almost all interests from
  its intent.
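Both properties can be observed on the small example context: one bicluster is generated per pair in I, and a density threshold keeps only the dense ones. The sketch below is illustrative; the threshold value 0.8 and all function names are assumptions, and the density formula is the filled-cell ratio assumed from the 5/6 example.

```python
from fractions import Fraction

# Example context from the slides: each object maps to the attributes it has.
CONTEXT = {
    "Kate": {"Car", "Bicycle"},
    "Mike": {"Car", "Laptop"},
    "Alex": {"House", "Laptop"},
    "David": {"House", "Laptop", "Bicycle"},
}


def density(extent, intent):
    """Assumed definition: share of filled cells in extent × intent."""
    filled = sum(len(CONTEXT[g] & intent) for g in extent)
    return Fraction(filled, len(extent) * len(intent))


def all_biclusters():
    """One bicluster (m^I, g^I) per pair (g, m) in I, duplicates merged."""
    found = set()
    for g, attrs in CONTEXT.items():
        for m in attrs:
            extent = frozenset(h for h in CONTEXT if m in CONTEXT[h])
            found.add((extent, frozenset(attrs)))
    return found


RHO_MIN = Fraction(4, 5)  # hypothetical density threshold
biclusters = all_biclusters()
dense = [b for b in biclusters if density(*b) >= RHO_MIN]
num_pairs = sum(len(attrs) for attrs in CONTEXT.values())
print(len(biclusters), num_pairs, len(dense))
```

Because each pair generates exactly one bicluster, the count can never exceed |I|, whatever the context.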
Triadic FCA and Folksonomies
 Definition 1. A triadic formal context is a quadruple (G, M, B, Y), where G is a
   set of (formal) objects, M is a set of (formal) attributes, B is a set of conditions,
   and Y ⊆ G × M × B is the incidence relation: (g, m, b) ∈ Y means that object g ∈ G
   possesses attribute m ∈ M under condition b ∈ B.


Example. A folksonomy as a triadic context (U, T, R, Y), where
U is a set of users
T is a set of tags
R is a set of resources
Concept forming operators in triadic case




To define triclusters we propose box operators
OAC-Triclustering
              [Ignatov et al., 2011]




One dense tricluster vs. 3^3 = 27 formal triconcepts
Pseudo Triclustering for Social Networks
Algorithm
Formal context (U, I, X)                  Formal context (U, G, Y)
(minimal density ρ_min)                   (minimal density ρ_min)
          ↓                                         ↓
Set of biclusters Bicl_UI                 Set of biclusters Bicl_UG
                    ↘                   ↙
          Biclusters' extent similarity μ ≥ C
                          ↓
              Set of pseudo-triclusters
          (possible constraint: ρ ≥ ρ_min)
Algorithm
Let Bicl_UI be a set of user-interest biclusters and Bicl_UG be a set of
user-group biclusters.
For each (U₁, I) ∈ Bicl_UI and (U₂, G) ∈ Bicl_UG, the triple
(U₁ ∩ U₂, I, G) is added to the set of triclusters if U₁ ∩ U₂ ≠ ∅ and
    μ = |U₁ ∩ U₂| / |U₁ ∪ U₂| ≥ C, 0 ≤ C ≤ 1.
Thus, μ is used as a measure of quality of these pseudo-triclusters.
Another measure is the average density of the two biclusters:
    ρ = (ρ(U₁, I) + ρ(U₂, G)) / 2.
Test setting: Intel Core i7-2600 system with 3.4 GHz and 8 GB RAM
Constraints for the formal contexts used: 𝜌 ≥ 0.5.
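The merging step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the tuple layout and the toy user, interest and group identifiers are all hypothetical, and the average-density measure is left out because it would also need the two underlying contexts.

```python
from fractions import Fraction


def pseudo_triclusters(bicl_ui, bicl_ug, c):
    """Pair user-interest and user-group biclusters with similar user sets.

    bicl_ui: iterable of (users, interests) pairs of sets
    bicl_ug: iterable of (users, groups) pairs of sets
    c:       similarity threshold C, 0 <= C <= 1
    Returns a list of (common_users, interests, groups, mu).
    """
    result = []
    for users_i, interests in bicl_ui:
        for users_g, groups in bicl_ug:
            common = users_i & users_g
            if not common:
                continue  # the user sets must overlap
            # mu is the Jaccard similarity of the two user sets
            mu = Fraction(len(common), len(users_i | users_g))
            if mu >= c:
                result.append((common, interests, groups, mu))
    return result


# Hypothetical toy input, not taken from the Vkontakte data.
bicl_ui = [({1, 2, 3}, {"music"}), ({4, 5}, {"chess"})]
bicl_ug = [({1, 2}, {"group_10"}), ({5, 6, 7}, {"group_20"})]

strict = pseudo_triclusters(bicl_ui, bicl_ug, Fraction(1, 2))
loose = pseudo_triclusters(bicl_ui, bicl_ug, 0)
print(strict)
print(len(loose))
```

With C = 1/2 only the first pairing survives (μ = 2/3); with C = 0 every pairing with a non-empty intersection is kept. The nested loop compares every pair of biclusters, so the cost grows with the product of the two set sizes.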
Data
The pseudo-triclustering algorithm was tested on data from
Vkontakte, a Russian social networking site. Students of two major
technical universities and two universities for humanities and sociology
were considered:


                               Bauman   MIPT    RSUH    RSSU
        # users                18542    4786    10266   12281
        # interests             8118    2593    5892    3733
        # groups               153985   46312   95619   102046
Biclustering results
(UI = user-interest context, UG = user-group context, # = number of biclusters)

     Bauman                       MIPT                         RSUH                         RSSU
        UI            UG             UI            UG             UI            UG             UI            UG
ρ    Time     #     Time       #  Time     #     Time       #  Time     #     Time       #  Time     #     Time       #
0.0  9188  8863  1874458  248077   863  2492   109012   46873  3958  5293   519772  116882  2588  4014   693658  145086
0.1  8882  8331  1296056  173786   827  2401    91187   38226  3763  4925   419145   93219  2450  3785   527135  110964
0.2  8497  6960   966000  120075   780  2015    74498   28391  3656  4003   330371   68709  2369  3220   402159   79802
0.3  8006  5513   788008   85227   761  1600    63888   21152  3361  3123   275394   50650  2284  2612   332523   58321
0.4  7700  4308   676733   59179   705  1270    56365   15306  3252  2399   232154   35434  2184  2037   281164   40657
0.5  7536  3777   654047   53877   668  1091    54868   13828  3189  2087   224808   32578  2179  1782   270605   37244
0.6  7324  2718   522110   18586   670   775    44850    5279  3075  1367   174657   10877  2159  1264   211897   12908
0.7  7250  2409   511711   15577   743   676    43854    4399  3007  1224   171554    9171  2084  1109   208632   10957
0.8  7217  2326   508368   14855   663   654    43526    4215  3032  1188   170984    8742  2121  1081   209084   10503
0.9  7246  2314   507983   14691   669   647    43216    4157  2985  1180   174781    8649  2096  1072   206902   10422
1.0  7236  2309   511466   14654   669   647    43434    4148  3057  1177   173240    8635  2086  1068   207198   10408
Pseudo triclustering results
     Bauman            MIPT              RSUH              RSSU
μ    Time, ms   Count  Time, ms   Count  Time, ms   Count  Time, ms   Count
0.0   3353426  230161     77562   24852    256801   35275    183595   55338
0.1     76758   10928     35137    5969     62736    5679     18725    5582
0.2     80647    8539     31231    4908     58695    5089     16466    3641
0.3     77956    6107     27859    3770     53789    3865     17448    2772
0.4     60929      31      2060      12      9890      14     13585      12
0.5     66709      24      2327      10      9353      14     12776      10
0.6     57803      22      2147       8     11352      14     12268      10
0.7     68361      18      2333       8     10778      12     13819       4
0.8     70948      18      2256       8      9489      12     13725       4
0.9     65527      18      1942       8     10769      12     11705       4
1.0     65991      18      1971       8     10763      12     13263       4
Pseudo triclustering results
[Figure: number of pseudo-triclusters for different values of μ, one panel each for Bauman, MIPT, RSUH and RSSU; logarithmic vertical axis, μ from 0 to 1 on the horizontal axis]
Examples. Biclusters
• =83,33%        Gen. pair: {3609, home}
G: {3609, 4566} M: {family, work, home}
•  =83,33%      Gen. pair: {30568, orthodox
  church}
G: {25092, 30568}      M: {music, monastery,
orthodox church}
• =100% Gen. pair: {4220, beauty}
G: {1269, 4220, 5337, 20787} M: {love, beauty}
Examples: Tricluster
• Measures:
  μ: 100%
  Average ρ: 54.92%
Users: {16313, 24835}
Interests: {sleeping, painting, walking, tattoo, hamster, impressions}
Groups: {365, 457, 624, …, 17357688, 17365092}
Conclusion
• It is possible to use the pseudo-triclustering method
  for tagging groups by interests in social
  networking sites and for finding tricommunities. E.g.,
  if we have found a dense pseudo-tricluster
  (Users, Groups, Interests), we can mark Groups by
  user interests from Interests.
• It also makes sense to use biclusters and triclusters
  for making recommendations. Missing pairs and
  triples seem to be good candidates to
  recommend potentially interesting users, groups
  and interests.
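The recommendation idea can be illustrated on the small Kate/Mike/Alex/David context from the FCA slides: the empty cells of a dense bicluster are the candidate pairs. This is only a sketch of the idea under that toy data; the function name `missing_pairs` is hypothetical, and no ranking or evaluation of the candidates is attempted.

```python
# Example context from the FCA slides: each object maps to its attributes.
CONTEXT = {
    "Kate": {"Car", "Bicycle"},
    "Mike": {"Car", "Laptop"},
    "Alex": {"House", "Laptop"},
    "David": {"House", "Laptop", "Bicycle"},
}


def missing_pairs(extent, intent):
    """(object, attribute) cells of the bicluster that are absent from I."""
    return sorted(
        (g, m) for g in extent for m in intent if m not in CONTEXT[g]
    )


# Bicluster generated by the pair (David, House), density 5/6.
candidates = missing_pairs({"Alex", "David"}, {"House", "Laptop", "Bicycle"})
print(candidates)
```

Here the single missing cell suggests recommending Bicycle to Alex, who already shares House and Laptop with David.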
Conclusion
• The approach needs some improvements and fine
  tuning in order to increase the scalability and quality of
  communities
   – Strategies for approximate density calculation
   – Choosing good thresholds for n-clusters density and
     communities similarity
   – More sophisticated quality measures like recall and
     precision in Information Retrieval
• It needs comparison with other approaches like iceberg
  lattices (Stumme), stable concepts (Kuznetsov), fault-
  tolerant concepts (Boulicaut) and different n-clustering
  techniques from bioinformatics (Zaki, Mirkin, etc.)
• The current version also requires an expert's feedback on the
  output data analysis and interpretation
Questions?

 
Simulation Watch Window Part I.pdf
Simulation Watch Window Part I.pdfSimulation Watch Window Part I.pdf
Simulation Watch Window Part I.pdf
 
User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...
User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...
User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...
 
AI Final report 1.pdf
AI Final report 1.pdfAI Final report 1.pdf
AI Final report 1.pdf
 
Evaluation Initiatives for Entity-oriented Search
Evaluation Initiatives for Entity-oriented SearchEvaluation Initiatives for Entity-oriented Search
Evaluation Initiatives for Entity-oriented Search
 
Action and content based Community Detection in Social Networks
Action and content based Community Detection in Social NetworksAction and content based Community Detection in Social Networks
Action and content based Community Detection in Social Networks
 
Generative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural NetworksGenerative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural Networks
 
Tatyana Makhalova - 2015 - News clustering approach based on discourse text s...
Tatyana Makhalova - 2015 - News clustering approach based on discourse text s...Tatyana Makhalova - 2015 - News clustering approach based on discourse text s...
Tatyana Makhalova - 2015 - News clustering approach based on discourse text s...
 
probabilistic ranking
probabilistic rankingprobabilistic ranking
probabilistic ranking
 
DWDM-AG-day-1-2023-SEC A plus Half B--.pdf
DWDM-AG-day-1-2023-SEC A plus Half B--.pdfDWDM-AG-day-1-2023-SEC A plus Half B--.pdf
DWDM-AG-day-1-2023-SEC A plus Half B--.pdf
 
Recommendation Systems in banking and Financial Services
Recommendation Systems in banking and Financial ServicesRecommendation Systems in banking and Financial Services
Recommendation Systems in banking and Financial Services
 
Generative Adversarial Networks
Generative Adversarial NetworksGenerative Adversarial Networks
Generative Adversarial Networks
 
Searching for optimal patterns in Boolean tensors
Searching for optimal patterns in Boolean tensorsSearching for optimal patterns in Boolean tensors
Searching for optimal patterns in Boolean tensors
 

More from Andzhey Arshavskiy

More from Andzhey Arshavskiy (12)

dsl & bigdata
dsl & bigdatadsl & bigdata
dsl & bigdata
 
BigData in Banking
BigData in BankingBigData in Banking
BigData in Banking
 
Dsl public
Dsl publicDsl public
Dsl public
 
Digital Society Lab (about)
Digital Society Lab (about)Digital Society Lab (about)
Digital Society Lab (about)
 
Digital Society Laboratory (DSL)
Digital Society Laboratory (DSL)Digital Society Laboratory (DSL)
Digital Society Laboratory (DSL)
 
WHAT IS BIG DATA? AND HOW IT APPLIED IN MODERN MARKETING
WHAT IS BIG DATA? AND HOW IT APPLIED IN MODERN MARKETINGWHAT IS BIG DATA? AND HOW IT APPLIED IN MODERN MARKETING
WHAT IS BIG DATA? AND HOW IT APPLIED IN MODERN MARKETING
 
Ispras (трудаков, коршунов)
Ispras (трудаков, коршунов)Ispras (трудаков, коршунов)
Ispras (трудаков, коршунов)
 
Dmitry Gubanov presentation for ФИSNA
Dmitry Gubanov presentation for ФИSNADmitry Gubanov presentation for ФИSNA
Dmitry Gubanov presentation for ФИSNA
 
Digital Society Laboratory (Аршавский)
Digital Society Laboratory (Аршавский)Digital Society Laboratory (Аршавский)
Digital Society Laboratory (Аршавский)
 
мосты
мостымосты
мосты
 
Japan creativity.pps
Japan creativity.ppsJapan creativity.pps
Japan creativity.pps
 
Big data, Clouds & HPC
Big data, Clouds & HPCBig data, Clouds & HPC
Big data, Clouds & HPC
 

Recently uploaded

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfjimielynbastida
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 

Recently uploaded (20)

Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Science&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdfScience&tech:THE INFORMATION AGE STS.pdf
Science&tech:THE INFORMATION AGE STS.pdf
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 

Dmitry Ignatov for ФИSNA

  • 1. Analysing Online Social Network Data with Biclustering and Triclustering
      Dmitry Gnatyshak (1), Dmitry Ignatov (1), Jonas Poelmans (1, 2), Alexander Semenov (1)
      (1) NRU HSE, Moscow, Russia; (2) KU Leuven, Belgium
      BIR 2012, Nizhny Novgorod
  • 2. Motivation I
      • There is a large amount of network data that can be represented as bipartite and tripartite graphs
      • Standard techniques such as maximal biclique search produce a huge number of patterns (in the worst case, exponential in the input size)
      • Therefore we need some relaxation of this notion and good measures of interestingness for biclique communities
  • 3. Motivation II
      • Applied lattice theory provides the notion of a formal concept, which is almost the same thing as a biclique
      • L. C. Freeman, D. R. White. Using Galois Lattices to Represent Network Data. Sociological Methodology 23, 1993.
      • Social Networks 18(3), 1996:
          – L. C. Freeman. Cliques, Galois Lattices, and the Structure of Human Social Groups.
          – V. Duquenne. Lattice analysis and the representation of handicap associations.
          – D. R. White. Statistical entailments and the Galois lattice.
      • J. W. Mohr, V. Duquenne. The duality of culture and practice: Poverty relief in New York City, 1888-1917. Theory and Society, 1997.
      • Camille Roth et al. Towards Concise Representation for Taxonomies of Epistemic Communities. CLA, 4th Intl. Conf. on Concept Lattices and their Applications, 2006.
      • And many other papers on applying FCA to social network analysis
  • 4. Motivation III
      • A concept-based bicluster (Ignatov et al., 2010) is a scalable approximation of a formal concept (biclique)
          – Fewer patterns to analyze
          – Less computational time (polynomial vs. exponential)
          – Manual tuning of the bicluster (community) density threshold
          – Tolerance to missing (object, attribute) pairs
      • For analyzing three-way network data such as folksonomies we proposed triclustering (Ignatov et al., 2011)
  • 5. Formal Concept Analysis [Wille, 1982; Ganter & Wille, 1999]
      Definition 1. A formal context is a triple (G, M, I), where G is a set of (formal) objects, M is a set of (formal) attributes, and I ⊆ G × M is the incidence relation: (g, m) ∈ I means that object g ∈ G possesses attribute m ∈ M.
      Example:
                 Car   House   Laptop   Bicycle
        Kate      x                        x
        Mike      x              x
        Alex            x        x
        David           x        x         x
  • 6. Formal Concept Analysis
      Definition 2. Derivation operators (defining a Galois connection):
        A^I := { m ∈ M | gIm for all g ∈ A } is the set of attributes common to all objects in A
        B^I := { g ∈ G | gIm for all m ∈ B } is the set of objects that have all attributes from B
      Example (context of slide 5):
        {Kate, Mike}^I = {Car}
        {Laptop}^I = {Mike, Alex, David}
        {Car, House}^I = ∅_G
        (∅_G)^I = M, where ∅_G is the empty set of objects
  • 7. Formal Concept Analysis
      Definition 3. (A, B) is a formal concept of (G, M, I) iff A ⊆ G, B ⊆ M, A^I = B, and B^I = A. A is the extent and B is the intent of the concept (A, B). 𝔅(G, M, I) is the set of all concepts of the context (G, M, I).
      Example (context of slide 5):
      • ({Kate, Mike}, {Car}) is a formal concept
      • ({Alex, David}, {Laptop}) is not a formal concept, because {Laptop}^I ≠ {Alex, David}
      • ({Alex, David}, {House, Laptop}) is a formal concept
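The derivation operators and the concept test from slides 6 and 7 can be sketched in a few lines of Python. This is only an illustrative sketch: the dictionary `CONTEXT` and the function names `intent`, `extent` and `is_formal_concept` are hypothetical, and the crosses of the example context are an assumption inferred from the derivations stated on the slides.

```python
# Example context as object -> set of attributes.
# Assumption: cross positions are inferred from the slide examples.
CONTEXT = {
    "Kate": {"Car", "Bicycle"},
    "Mike": {"Car", "Laptop"},
    "Alex": {"House", "Laptop"},
    "David": {"House", "Laptop", "Bicycle"},
}
OBJECTS = set(CONTEXT)
ATTRIBUTES = set().union(*CONTEXT.values())


def intent(objs):
    """A^I: attributes shared by every object in objs (all of M if objs is empty)."""
    shared = set(ATTRIBUTES)
    for g in objs:
        shared &= CONTEXT[g]
    return shared


def extent(attrs):
    """B^I: objects that have every attribute in attrs (all of G if attrs is empty)."""
    return {g for g in OBJECTS if set(attrs) <= CONTEXT[g]}


def is_formal_concept(objs, attrs):
    """(A, B) is a formal concept iff A^I = B and B^I = A."""
    return intent(objs) == set(attrs) and extent(attrs) == set(objs)


print(intent({"Kate", "Mike"}))                                   # attributes common to Kate and Mike
print(is_formal_concept({"Alex", "David"}, {"Laptop"}))           # not closed: Mike also has Laptop
print(is_formal_concept({"Alex", "David"}, {"House", "Laptop"}))  # closed pair
```

Representing the context as a mapping from objects to attribute sets keeps both operators as plain set operations; for large sparse contexts an inverted index from attributes to objects would avoid scanning all objects in `extent`.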
  • 8. FCA and Graphs
      [Figure: the example context (attributes a to d; objects Kate, Mike, Alex, David) shown next to its bipartite graph]
      Formal context ↔ bipartite graph
      Formal concept ↔ biclique (maximal rectangle)
  • 9. Formal Concept Analysis
      Definition 4. A formal concept (A, B) is said to be more general than (C, D), written (A, B) ≥ (C, D), iff A ⊇ C (equivalently D ⊇ B). The set of all concepts of the context (G, M, I) ordered by this relation forms a complete lattice 𝔅(G, M, I), called the concept lattice (Galois lattice).
      Example (context of slide 5): ({Alex, David, Mike}, {Laptop}) is more general than ({Alex, David}, {House, Laptop}).
  • 10. Concept Lattice Diagram
      [Figure: line diagram of the concept lattice of the example context]
      Attributes: a = Car, b = House, c = Laptop, d = Bicycle
      Objects: 1 = Kate, 2 = Mike, 3 = Alex, 4 = David
  • 12. Biclustering
      Example (context of slide 5): since (David, House) ∈ I, this pair generates the bicluster
        (House^I, David^I) = ({Alex, David}, {House, Laptop, Bicycle})
      with density ρ(House^I, David^I) = 5/6.
  • 13. Biclustering properties
      • The number of biclusters of a context (G, M, I) is at most |I|, versus up to 2^min{|G|,|M|} formal concepts. Usually |I| ≪ 2^min{|G|,|M|}, especially for sparse contexts.
      • Dense biclusters (ρ(bicluster) ≥ ρ_min) are probably a good representation of communities, because all users in the extent of a dense bicluster have almost all interests from its intent.
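The example on slide 12 and the |I| bound on slide 13 can be checked with a short Python sketch. It assumes, following the slide 12 example, that each pair (g, m) ∈ I generates the bicluster (m^I, g^I) and that density is the share of filled cells inside the bicluster; the names `CONTEXT`, `density` and `biclusters` are hypothetical, and the cross positions are inferred from the slide examples.

```python
from fractions import Fraction

# Example context as object -> set of attributes.
# Assumption: cross positions are inferred from the slide examples.
CONTEXT = {
    "Kate": {"Car", "Bicycle"},
    "Mike": {"Car", "Laptop"},
    "Alex": {"House", "Laptop"},
    "David": {"House", "Laptop", "Bicycle"},
}


def extent(attr):
    """m^I: all objects that have attribute attr."""
    return frozenset(g for g, attrs in CONTEXT.items() if attr in attrs)


def intent(obj):
    """g^I: all attributes of object obj."""
    return frozenset(CONTEXT[obj])


def density(objs, attrs):
    """Share of (object, attribute) cells of the bicluster that belong to I."""
    filled = sum(1 for g in objs for m in attrs if m in CONTEXT[g])
    return Fraction(filled, len(objs) * len(attrs))


def biclusters(min_density=0):
    """One bicluster (m^I, g^I) per pair (g, m) in I; duplicates collapse in the dict."""
    found = {}
    for g, attrs in CONTEXT.items():
        for m in attrs:
            pair = (extent(m), intent(g))
            rho = density(*pair)
            if rho >= min_density:
                found[pair] = rho
    return found


print(density(extent("House"), intent("David")))  # density of the slide 12 bicluster
print(len(biclusters()), "biclusters from", sum(len(a) for a in CONTEXT.values()), "pairs in I")
```

Because every bicluster is generated by a pair from I, the loop visits |I| pairs and can never produce more than |I| distinct biclusters, which is the bound stated on slide 13.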
  • 14. Triadic FCA and Folksonomies
      Definition 1. A triadic formal context is a quadruple (G, M, B, Y), where G is a set of (formal) objects, M is a set of (formal) attributes, B is a set of conditions, and Y ⊆ G × M × B is the incidence relation: (g, m, b) ∈ Y means that object g ∈ G possesses attribute m ∈ M under condition b ∈ B.
      Example. A folksonomy as a triadic context (U, T, R, Y), where
      • U is a set of users
      • T is a set of tags
      • R is a set of resources
  • 15. Concept-forming operators in the triadic case
      To define triclusters we propose box operators
  • 16. OAC-Triclustering [Ignatov et al., 2011]
      One dense tricluster vs. 3^3 = 27 formal triconcepts
  • 17. Pseudo Triclustering for Social Networks
  • 18. Algorithm
      1. Input: formal contexts (U, I, X) and (U, G, Y), each with a minimal density ρ_min
      2. Compute the sets of biclusters Bicl_UI and Bicl_UG
      3. Pair biclusters whose extent similarity satisfies μ ≥ C
      4. Output: the set of pseudo-triclusters (possible constraint: ρ ≥ ρ_min)
  • 19. Algorithm
      Let Bicl_UI be a set of user-interest biclusters and Bicl_UG be a set of user-group biclusters. For each (U_1, I) ∈ Bicl_UI and (U_2, G) ∈ Bicl_UG, the triple (U_1 ∩ U_2, I, G) is added to the set of triclusters if U_1 ∩ U_2 ≠ ∅ and
        μ = |U_1 ∩ U_2| / |U_1 ∪ U_2| ≥ C, 0 ≤ C ≤ 1.
      Thus μ is used as a quality measure for these pseudo-triclusters. Another measure is the average density of the two biclusters: (ρ(U_1, I) + ρ(U_2, G)) / 2.
      Test setting: Intel Core i7-2600 system at 3.4 GHz with 8 GB RAM.
      Constraint used for the formal contexts: ρ ≥ 0.5.
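The merging step on slide 19 can be sketched as a nested loop over the two bicluster sets. This is a minimal Python sketch, not the authors' implementation: the function names and the toy data in `BICL_UI` and `BICL_UG` are hypothetical, and μ is taken to be the ratio of intersection size to union size of the two extents.

```python
def extent_similarity(users_a, users_b):
    """mu = |U_1 ∩ U_2| / |U_1 ∪ U_2| for two bicluster extents."""
    union = users_a | users_b
    return len(users_a & users_b) / len(union) if union else 0.0


def pseudo_triclusters(bicl_ui, bicl_ug, c):
    """Combine user-interest and user-group biclusters whose extents overlap enough.

    bicl_ui: iterable of (users, interests) pairs of sets
    bicl_ug: iterable of (users, groups) pairs of sets
    c:       similarity threshold C, 0 <= C <= 1
    """
    result = []
    for users_i, interests in bicl_ui:
        for users_g, groups in bicl_ug:
            common = users_i & users_g
            if common and extent_similarity(users_i, users_g) >= c:
                result.append((common, interests, groups))
    return result


# Hypothetical toy input, not data from the slides.
BICL_UI = [({1, 2, 3}, {"music"}), ({4}, {"chess"})]
BICL_UG = [({2, 3}, {"group_10"}), ({5}, {"group_20"})]

for users, interests, groups in pseudo_triclusters(BICL_UI, BICL_UG, 0.5):
    print(sorted(users), sorted(interests), sorted(groups))
```

The nested loop compares every user-interest bicluster with every user-group bicluster, so its cost grows with the product of the two set sizes; filtering both sets by density beforehand, as in the flow on slide 18, reduces that product.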
  • 20. Data
      The pseudo-triclustering algorithm was tested on data from Vkontakte, a Russian social networking site. Students of two major technical universities and two universities for humanities and sociology were considered:
                      Bauman    MIPT    RSUH     RSSU
        # users        18542    4786   10266    12281
        # interests     8118    2593    5892     3733
        # groups      153985   46312   95619   102046
  • 21. Biclustering results (UI: user-interest context, UG: user-group context; Time and # of biclusters for each density threshold ρ)
              Bauman                            MIPT                              RSUH                              RSSU
          ρ | UI time   UI #  UG time    UG # | UI time   UI #  UG time    UG # | UI time   UI #  UG time    UG # | UI time   UI #  UG time    UG #
        0.0 |    9188   8863  1874458  248077 |     863   2492   109012   46873 |    3958   5293   519772  116882 |    2588   4014   693658  145086
        0.1 |    8882   8331  1296056  173786 |     827   2401    91187   38226 |    3763   4925   419145   93219 |    2450   3785   527135  110964
        0.2 |    8497   6960   966000  120075 |     780   2015    74498   28391 |    3656   4003   330371   68709 |    2369   3220   402159   79802
        0.3 |    8006   5513   788008   85227 |     761   1600    63888   21152 |    3361   3123   275394   50650 |    2284   2612   332523   58321
        0.4 |    7700   4308   676733   59179 |     705   1270    56365   15306 |    3252   2399   232154   35434 |    2184   2037   281164   40657
        0.5 |    7536   3777   654047   53877 |     668   1091    54868   13828 |    3189   2087   224808   32578 |    2179   1782   270605   37244
        0.6 |    7324   2718   522110   18586 |     670    775    44850    5279 |    3075   1367   174657   10877 |    2159   1264   211897   12908
        0.7 |    7250   2409   511711   15577 |     743    676    43854    4399 |    3007   1224   171554    9171 |    2084   1109   208632   10957
        0.8 |    7217   2326   508368   14855 |     663    654    43526    4215 |    3032   1188   170984    8742 |    2121   1081   209084   10503
        0.9 |    7246   2314   507983   14691 |     669    647    43216    4157 |    2985   1180   174781    8649 |    2096   1072   206902   10422
        1.0 |    7236   2309   511466   14654 |     669    647    43434    4148 |    3057   1177   173240    8635 |    2086   1068   207198   10408
  • 22. Pseudo triclustering results
              Bauman             MIPT               RSUH               RSSU
          μ | Time, ms   Count | Time, ms   Count | Time, ms   Count | Time, ms   Count
        0.0 |  3353426  230161 |    77562   24852 |   256801   35275 |   183595   55338
        0.1 |    76758   10928 |    35137    5969 |    62736    5679 |    18725    5582
        0.2 |    80647    8539 |    31231    4908 |    58695    5089 |    16466    3641
        0.3 |    77956    6107 |    27859    3770 |    53789    3865 |    17448    2772
        0.4 |    60929      31 |     2060      12 |     9890      14 |    13585      12
        0.5 |    66709      24 |     2327      10 |     9353      14 |    12776      10
        0.6 |    57803      22 |     2147       8 |    11352      14 |    12268      10
        0.7 |    68361      18 |     2333       8 |    10778      12 |    13819       4
        0.8 |    70948      18 |     2256       8 |     9489      12 |    13725       4
        0.9 |    65527      18 |     1942       8 |    10769      12 |    11705       4
        1.0 |    65991      18 |     1971       8 |    10763      12 |    13263       4
  • 23. Pseudo triclustering results
      [Figure: number of pseudo-triclusters for different values of μ (0 to 1), one panel each for Bauman, MIPT, RSUH and RSSU, with a logarithmic count axis]
  • 24. Examples. Biclusters
      • ρ = 83.33%. Generating pair: {3609, home}. G: {3609, 4566}. M: {family, work, home}
      • ρ = 83.33%. Generating pair: {30568, orthodox church}. G: {25092, 30568}. M: {music, monastery, orthodox church}
      • ρ = 100%. Generating pair: {4220, beauty}. G: {1269, 4220, 5337, 20787}. M: {love, beauty}
  • 25. Examples: Tricluster
      • Measures: μ = 100%; average ρ = 54.92%
      • Users: {16313, 24835}
      • Interests: {sleeping, painting, walking, tattoo, hamster, impressions}
      • Groups: {365, 457, 624, …, 17357688, 17365092}
  • 26. Conclusion
      • The pseudo-triclustering method can be used for tagging groups by interests in social networking sites and for finding tricommunities. E.g., if we have found a dense pseudo-tricluster (Users, Groups, Interests), we can mark Groups with the user interests from Interests.
      • It also makes sense to use biclusters and triclusters for making recommendations. Missing pairs and triples seem to be good candidates for recommending potentially interesting users, groups and interests.
  • 27. Conclusion
      • The approach needs some improvements and fine-tuning to increase scalability and the quality of communities
          – Strategies for approximate density calculation
          – Choosing good thresholds for n-cluster density and community similarity
          – More sophisticated quality measures, such as recall and precision in Information Retrieval
      • It needs comparison with other approaches such as iceberg lattices (Stumme), stable concepts (Kuznetsov), fault-tolerant concepts (Boulicaut) and different n-clustering techniques from bioinformatics (Zaki, Mirkin, etc.)
      • The current version also requires expert feedback on the analysis and interpretation of the output data