Dmitry V. Gnatyshak, Dmitry I. Ignatov*,
Sergei O. Kuznetsov
School of Applied Mathematics and Information Science & Intelligence Systems and Structural
Analysis Lab
NRU Higher School of Economics, Moscow, Russia
LORIA Orpailleur meeting, Nancy, France, 2013
Outline
1. Motivation and problem setting
2. FCA basic definitions
3. Triclustering methods
4. Experiments
5. Conclusion
Motivation
• Large amounts of structured and unstructured data give rise to triadic data.
• Example: a folksonomy is a set of triples (user, object, tag)
• Examples:
  ◦ Bibsonomy.org: (user, bookmark, tag)
  ◦ Social networking sites: (user, group, interest)
  ◦ Delicious: (user, link, tag)
Main goals
1. Comparison of some triclustering methods
2. Development of a toolbox for triclustering experiments
3. New, possibly better, methods
4. Possible applications
FCA: basic definitions
Biology Mathematics Computer Science Chemistry
Kate x x
Mike x x x
Alex x x
Pete x x x
(R. Wille, 1982; B. Ganter, R. Wille, 1999)
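The derivation (prime) operators and formal concepts of FCA can be sketched on a small students-by-subjects context like the one above. This is a minimal illustration only: the positions of the crosses in `CONTEXT` are assumed (only the number of crosses per row matches the slide), and the names `intent`, `extent`, and `concepts` are hypothetical helpers, not part of any toolbox mentioned in the talk.

```python
from itertools import combinations

# Hypothetical incidence relation: the cross positions below are assumed
# for illustration and are not taken from the slide.
CONTEXT = {
    "Kate": {"Biology", "Chemistry"},
    "Mike": {"Mathematics", "Computer Science", "Chemistry"},
    "Alex": {"Mathematics", "Computer Science"},
    "Pete": {"Biology", "Mathematics", "Computer Science"},
}
ATTRIBUTES = {"Biology", "Mathematics", "Computer Science", "Chemistry"}


def intent(objects):
    """A': the attributes shared by every object in the given set."""
    shared = set(ATTRIBUTES)
    for g in objects:
        shared &= CONTEXT[g]
    return shared


def extent(attributes):
    """B': the objects that have every attribute in the given set."""
    return {g for g, attrs in CONTEXT.items() if attributes <= attrs}


def concepts():
    """Enumerate all formal concepts (A, B), i.e. pairs with A' = B and B' = A.

    Brute force over object subsets; fine for a toy context only.
    """
    found = set()
    objs = sorted(CONTEXT)
    for r in range(len(objs) + 1):
        for subset in combinations(objs, r):
            b = intent(set(subset))   # close the subset: B = A'
            a = extent(b)             # A = B' = A''
            found.add((frozenset(a), frozenset(b)))
    return found
```

In this assumed context, the students taking both Mathematics and Computer Science form the extent of one concept, and applying `intent` to that extent returns exactly those two subjects.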
Triadic FCA: basic definitions
(F. Lehmann, R. Wille, 1995)
OAC-triclusters
(based on box operators)
Box operators
(D. Ignatov et al., 2011)
OAC-triclusters
(based on prime operators)
Prime operators of singletons
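A minimal sketch of how prime operators of singletons could generate OAC-triclusters, assuming the usual formulation: for a triple (g, m, b) of the triadic context, (m, b)' collects the objects related to both m and b, (g, b)' the attributes, and (g, m)' the conditions, and the tricluster is the triple of these three sets. The function name `oac_prime_triclusters` and the toy folksonomy are hypothetical, and any minimal-density filtering is left out.

```python
from collections import defaultdict


def oac_prime_triclusters(triples):
    """Build one tricluster per triple from the three prime sets.

    `triples` is a set of (object, attribute, condition) tuples,
    e.g. (user, bookmark, tag) in a folksonomy.
    """
    by_mb = defaultdict(set)  # (m, b) -> objects g with (g, m, b) in the context
    by_gb = defaultdict(set)  # (g, b) -> attributes m with (g, m, b) in the context
    by_gm = defaultdict(set)  # (g, m) -> conditions b with (g, m, b) in the context
    for g, m, b in triples:
        by_mb[(m, b)].add(g)
        by_gb[(g, b)].add(m)
        by_gm[(g, m)].add(b)

    result = set()
    for g, m, b in triples:
        result.add((
            frozenset(by_mb[(m, b)]),
            frozenset(by_gb[(g, b)]),
            frozenset(by_gm[(g, m)]),
        ))
    return result


# Toy folksonomy (assumed data, for illustration only).
TRIPLES = {
    ("u1", "o1", "t1"),
    ("u1", "o1", "t2"),
    ("u1", "o2", "t1"),
    ("u2", "o1", "t1"),
}
TRICLUSTERS = oac_prime_triclusters(TRIPLES)
```

Identical triclusters generated from different triples collapse in the result set, so the output can be smaller than the number of triples.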
OAC-triclustering
TriBox
(A. Kramarenko & B. Mirkin, 2011)
Spectral Triclustering: SpecTric
(D. Ignatov & Z. Sekinaeva, 2011; Ignatov et al., 2013)
TRIAS
(R. Jäschke, 2006)
Experiments
• Main goals:
  ◦ Fault-tolerance test
  ◦ Comparison by criteria: time, quantity, mean density, coverage, and diversity
• We implemented parallel versions of TriBox and OAC-triclustering
• These were included in the comparison
Data
Comparison Criteria
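The quality criteria named in this talk (density, coverage, diversity) could be computed roughly as follows. This sketch rests on assumptions, since the slide's formulas are not available here: density is taken as the share of cells of the tricluster's cuboid that are triples of the context, coverage as the share of context triples lying in at least one tricluster, and diversity as one minus the share of tricluster pairs that overlap in all three components. The function names are hypothetical.

```python
from itertools import combinations, product


def density(tricluster, triples):
    """Share of cells in X x Y x Z that are triples of the context."""
    xs, ys, zs = tricluster
    volume = len(xs) * len(ys) * len(zs)
    if volume == 0:
        return 0.0
    hits = sum(1 for cell in product(xs, ys, zs) if cell in triples)
    return hits / volume


def coverage(triclusters, triples):
    """Share of context triples covered by at least one tricluster."""
    if not triples:
        return 0.0
    covered = sum(
        1
        for g, m, b in triples
        if any(g in xs and m in ys and b in zs for xs, ys, zs in triclusters)
    )
    return covered / len(triples)


def diversity(triclusters):
    """One minus the share of tricluster pairs overlapping in all three sets."""
    pairs = list(combinations(triclusters, 2))
    if not pairs:
        return 1.0
    overlapping = sum(
        1
        for (x1, y1, z1), (x2, y2, z2) in pairs
        if x1 & x2 and y1 & y2 and z1 & z2
    )
    return 1 - overlapping / len(pairs)


# Toy data (assumed, for illustration only).
TRIPLES = {("u1", "o1", "t1"), ("u1", "o1", "t2"), ("u1", "o2", "t1"), ("u2", "o1", "t1")}
T1 = ({"u1", "u2"}, {"o1", "o2"}, {"t1", "t2"})
T2 = ({"u3"}, {"o3"}, {"t3"})
T3 = ({"u1"}, {"o1"}, {"t1", "t2"})
```

Under these definitions a set of pairwise disjoint triclusters always has diversity 1, while a single large, sparse tricluster can reach full coverage at low density.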
Results (fault-tolerance)
OAC-prime triclustering example
• IMDB
Results (time, quantity, average density, coverage, diversity)

Method       T, ms    #      Density, %  Coverage, %  Diversity, %
OAC (box)    407      73     9.88        100.00       0.00    0.00    0.00    0.00
OAC (prime)  312      2659   32.23       100.00       92.51   60.07   59.80   59.45
SpecTric     277      5      8.74        8.84         100.00  100.00  100.00  100.00
TriBox       6218     1011   74.00       96.02        97.42   66.25   79.53   84.80
TRIAS        29367    38356  100.00      100.00       99.99   99.93   4.07    3.51
IMDB
OAC (box)    2314     1500   1.84        100.00       15.65   9.67    0.70    7.87
OAC (prime)  547      1274   53.85       100.00       96.55   94.56   92.14   28.52
SpecTric     98799    21     17.07       20.88        100.00  100.00  100.00  100.00
TriBox       197136   328    91.65       98.90        98.89   98.46   95.21   30.94
TRIAS        102554   1956   100.00      100.00       99.89   99.69   52.52   26.18
BibSonomy
OAC (box)    19297    398    4.16        100.00       79.59   67.28   42.83   79.54
OAC (prime)  13556    1289   94.66       100.00       99.74   88.58   99.51   99.53
SpecTric     5906563  2      50.00       100.00       100.00  100.00  100.00  100.00
TriBox       Time > 24 hours
TRIAS        110554   1305   100.00      100.00       99.98   91.70   99.78   99.92
Results (time, quantity, average density, coverage, diversity)

Method       Time                      Quantity    Average density  Coverage       Diversity         Efficiency of parallel version
OAC (box)    average                   large       low              high~very low  very low~average  high
OAC (prime)  small                     large       average          high~average   average~high      low
SpecTric     small for small contexts  small       low              average~high   1                 –
TriBox       high                      average     high             high           high              high
TRIAS                                  very large  1                high~low       high~low          –
Conclusion
• There is no clear winner according to the comparison criteria
• TriBox shows the best results but takes a huge amount of computational time
• OAC-triclustering based on prime operators gives the second-best results and is sufficiently fast
Details by method:
• TRIAS
  ◦ High elapsed time
  ◦ Too many small, well-interpreted triclusters (triconcepts)
• OAC (box operators)
  ◦ Large triclusters of low density
  ◦ High coverage, small diversity
  ◦ Efficient parallelization
• OAC (prime operators)
  ◦ High computational speed
  ◦ Large number of dense, well-interpreted triclusters
  ◦ Low efficiency of parallelization
• Spectral Triclustering
  ◦ High computational speed on small contexts
  ◦ Well-interpreted triclusters, but of low density
  ◦ Diversity always equals 1, but at the cost of very low coverage
• TriBox
  ◦ A moderate number of well-interpreted triclusters
  ◦ High elapsed time
  ◦ Efficient parallelization
  ◦ Reasonably high coverage and diversity
Thank you very much!
Questions?

Orpailleur -- triclustering talk

  • 21. Results (time, quantity, average density, coverage, diversity)

      Method         T, ms      #  Density Coverage  Diversity (four columns; sub-headers not preserved)

      (dataset name not preserved)
      OAC (box)        407     73     9,88   100,00     0,00     0,00     0,00     0,00
      OAC (prime)      312   2659    32,23   100,00    92,51    60,07    59,80    59,45
      SpecTric         277      5     8,74     8,84   100,00   100,00   100,00   100,00
      TriBox          6218   1011    74,00    96,02    97,42    66,25    79,53    84,80
      TRIAS          29367  38356   100,00   100,00    99,99    99,93     4,07     3,51

      IMDB
      OAC (box)       2314   1500     1,84   100,00    15,65     9,67     0,70     7,87
      OAC (prime)      547   1274    53,85   100,00    96,55    94,56    92,14    28,52
      SpecTric       98799     21    17,07    20,88   100,00   100,00   100,00   100,00
      TriBox        197136    328    91,65    98,90    98,89    98,46    95,21    30,94
      TRIAS         102554   1956   100,00    100,0    99,89    99,69    52,52    26,18

      BibSonomy
      OAC (box)      19297    398     4,16   100,00    79,59    67,28    42,83    79,54
      OAC (prime)    13556   1289    94,66   100,00    99,74    88,58    99,51    99,53
      SpecTric     5906563      2    50,00   100,00   100,00   100,00   100,00   100,00
      TriBox        Time > 24 hours
      TRIAS         110554   1305   100,00   100,00    99,98    91,70    99,78    99,92
  • 22. Results (time, quantity, average density, coverage, diversity)

      Columns: Method: Time; Quantity; Average density; Coverage; Diversity; Efficiency of parallel version

      OAC (box): average; large; low; high~very low; very low~average; high
      OAC (prime): small; large; average; high~average; average~high; low
      SpecTric: small for small contexts; small; low; average~high; 1; –
      TriBox: high; average; high; high; high; high
      TRIAS: very large; 1; high~low; high~low; – (only five of the six values are preserved)
  • 23. Conclusion
      • No single method wins on all comparison criteria.
      • TriBox shows the best results but requires a huge amount of computation time.
      • OAC-triclustering based on prime operators gives the second-best results and is sufficiently fast.
  • 24. Conclusion
      • No single method wins on all comparison criteria.
      • Details by method:
          • TRIAS
              • High elapsed time
              • Too many small, well-interpreted triclusters (triconcepts)
  • 25. Conclusion
      • OAC (box operators)
          • Large triclusters of low density
          • High density, small diversity
          • An efficient parallelization
      • OAC (prime operators)
          • High speed of computations
          • Large number of dense well-interpreted triclusters
          • Low efficiency of parallelization
  • 26. Conclusion
      • Spectral Triclustering
          • High computational speed on small contexts
          • Well-interpreted triclusters, but of low density
          • Diversity always equals 1, but this results in too low coverage
      • TriBox
          • A moderate number of well-interpreted triclusters
          • High elapsed time
          • Efficient parallelization
          • Reasonably high coverage and diversity
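
The comparison criteria named in the results (average density, coverage, diversity) are not defined in the surviving slide text, so the following Python sketch rests on assumed definitions. Density is taken as the share of cells of a tricluster (X, Y, Z) that are triples of the context. Coverage is the share of context triples lying in at least one tricluster. Diversity is one minus the share of tricluster pairs that overlap in all three components. The function names and the toy data are hypothetical and do not come from the authors' toolbox.

```python
from itertools import combinations, product


def density(tricluster, triples):
    """Share of cells in X x Y x Z that are triples of the context (assumed definition)."""
    X, Y, Z = tricluster
    volume = len(X) * len(Y) * len(Z)
    hits = sum((x, y, z) in triples for x, y, z in product(X, Y, Z))
    return hits / volume


def coverage(triclusters, triples):
    """Share of context triples that fall inside at least one tricluster (assumed definition)."""
    covered = sum(
        any(x in X and y in Y and z in Z for X, Y, Z in triclusters)
        for x, y, z in triples
    )
    return covered / len(triples)


def diversity(triclusters):
    """One minus the share of tricluster pairs overlapping in all three components (assumed definition)."""
    pairs = list(combinations(triclusters, 2))
    if not pairs:
        return 1.0
    overlapping = sum(
        1
        for (X1, Y1, Z1), (X2, Y2, Z2) in pairs
        if X1 & X2 and Y1 & Y2 and Z1 & Z2
    )
    return 1 - overlapping / len(pairs)


# Toy folksonomy-style context of (user, object, tag) triples; hypothetical data.
triples = {
    ("u1", "o1", "t1"),
    ("u1", "o2", "t1"),
    ("u2", "o1", "t1"),
    ("u3", "o3", "t2"),
}

# Hypothetical triclusters, each a tuple of three sets.
A = ({"u1", "u2"}, {"o1", "o2"}, {"t1"})
B = ({"u3"}, {"o3"}, {"t2"})
C = ({"u1"}, {"o1"}, {"t1"})

if __name__ == "__main__":
    print(density(A, triples))
    print(coverage([A, B], triples))
    print(diversity([A, B, C]))
```

Under these assumptions, a method that returns few non-overlapping triclusters gets diversity 1 even when its coverage is low, which is the pattern the conclusion reports for spectral triclustering.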