
Using Ontology-based Data Summarization to Develop Semantics-aware Recommender Systems. ESWC 2018

Within content-based recommendation systems, Linked Data has already been proposed as a valuable source of information to enhance the predictive power of recommender systems, not only in terms of accuracy but also of diversity and novelty of results. In this direction, one of the main open issues in using Linked Data to feed a recommendation engine is feature selection: how to select only the most relevant subset of the original Linked Data, thus avoiding both useless processing of data and the so-called "curse of dimensionality" problem. In this paper, we show how ABSTAT, an ontology-based (linked) data summarization framework, can drive the selection of properties/features useful to a recommender system. In particular, we compare a fully automated feature selection method based on ontology-based data summaries with more classical ones, and we evaluate the performance of these methods in terms of accuracy and aggregate diversity of a recommender system exploiting the top-k selected features. We set up an experimental testbed relying on datasets related to different knowledge domains. Results show the feasibility of a feature selection process driven by ontology-based data summaries for Linked Data-enabled recommender systems.

This is joint work between the SisInf Lab at the Polytechnic University of Bari and the INSID&ES Lab at the University of Milano-Bicocca, presented in the ESWC 2018 Research Track.

The paper can be found at https://link.springer.com/chapter/10.1007%2F978-3-319-93417-4_9



  1. 1. Using Ontology-based Data Summarization to Develop Semantics-aware Recommender Systems Tommaso Di Noia*, Corrado Magarelli*, Andrea Maurino°, Matteo Palmonari°, Anisa Rula°** *Polytechnic University of Bari °University of Milano-Bicocca **SDA, University of Bonn This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreements n. 732003 and n. 732590
  2. 2. Outline • Feature Selection for Semantics-aware Recommender Systems • Ontology-based Data Summarization with ABSTAT • Feature Selection (ABSTAT vs. Information Gain) • Experiments • Conclusions and Future Work 2
  4. 4. Recommender Systems • Help users in dealing with information/choice overload • Help to match users with items 4
  5. 5. Why do we need content? Several recommender systems work perfectly without using any content (e.g., Amazon)! Collaborative Filtering and Matrix Factorization are state-of-the-art techniques for implementing Recommender Systems (ACM RecSys 2009, by the Netflix Challenge winners). Content can tackle some issues of collaborative filtering 5
  6. 6. Why do we need content? Collaborative Filtering issues: sparsity 6
  7. 7. Why do we need content? Collaborative Filtering issues: the new item problem 7
  8. 8. Why do we need content? Who knows the «customers who bought…»? Collaborative Filtering issues: poor explanations! 8
  9. 9. Content-based Semantic Recommendations • Basic item-KNN recommender system • Given a user u and a non-rated item i, the rating of i is predicted from a similarity-weighted aggregation over its neighborhood, where: • N(i) = the neighbors of the non-rated item i • r(u) = the items rated by the user u • r(u,j) = the rating value given to the item j by the user u • Similarity functions: Jaccard, graph kernels, cosine similarity in a vector space, … several variants … all based on subgraphs built using certain properties 9
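The prediction step described on this slide can be sketched as the standard item-based KNN weighted average (the exact formula appears as an image in the original deck, so function names and the zero fallback below are illustrative, not the authors' code):

```python
def predict_rating(u_ratings, neighbors, sim):
    """Predict a rating for an unrated item i from the user's rated neighbors.

    u_ratings: dict item -> rating r(u, j) given by user u
    neighbors: iterable with the items in N(i), the neighborhood of item i
    sim:       dict item j -> similarity sim(i, j)
    """
    # Only neighbors the user has actually rated contribute to the prediction.
    overlap = [j for j in neighbors if j in u_ratings]
    denom = sum(sim[j] for j in overlap)
    if denom == 0:
        return 0.0  # no overlapping rated neighbors: no prediction possible
    return sum(sim[j] * u_ratings[j] for j in overlap) / denom
```

For example, with two rated neighbors of equal similarity rated 4 and 2, the prediction is their average, 3.0.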
  11. 11. The Feature Selection Problem • Features = properties for similarity evaluation • Ontological properties? • Categorical properties? • Frequent properties? • Feature selection • Usually performed manually (ex-post) • With statistical measures [Musto&al.UMAP2016] • With ontology-based data summaries (this paper) • Fully automatic feature selection with ABSTAT profiles • (Manual pre-processing + frequency-based ranking with ABSTAT profiles + graph kernel similarity [Ragone&al.SAC2017]) The curse of dimensionality 11
  12. 12. Ontology-based Data Summarization vs. Statistical Techniques • Statistical measures • Download the full dataset • Compute statistical measures over the full dataset • Keep only the data of interest • Run the algorithm • Profiles (efficiently accessible via web) • Ask for top-k most useful properties • e.g., via API • Download only the relevant data • Run the algorithm 12
  13. 13. Outline • Feature Selection for Semantics-aware Recommender Systems • Ontology-based Data Summarization with ABSTAT • Feature Selection (ABSTAT vs. Information Gain) • Experiments • Conclusions and Future Work 13
  14. 14. Ontology-driven Knowledge Graph Summarization: Profiling with ABSTAT • Minimal type patterns: there exist entities that have Company as minimal type, which are linked by the property foundingYear to literals that have gYear as minimal type • Occurrence of types and properties — frequency and instances: how many times this pattern occurs as a minimal type pattern and as a pattern; the instances count considers pattern inference • Cardinality descriptors: max/avg/min number of different subjects associated with a same object (and vice versa) For more details: abstat.disco.unimib.it and [ESWC2016-demo, SUMPRE2016, ESWC2018-demo] 14
  16. 16. ABSTAT: Cardinality Descriptors For each pattern π • MinS, AvgS, MaxS • Min/Avg/Max number of distinct subjects associated with unique objects in the triples represented by π • MinO, AvgO, MaxO • Min/Avg/Max number of distinct objects associated with unique subjects in the triples represented by π • Local vs. global • Local: for patterns • Global: for properties, i.e., all triples with a property [Figure: MinS(π) = 1, AvgS(π) = 1.4 ≈ 1, MaxS(π) = 3; MinO(π) = 1, AvgO(π) = 1.7 ≈ 2] 16
  22. 22. ABSTAT: Cardinality Descriptors For each pattern π • MinS, AvgS, MaxS • Min/Avg/Max number of distinct subjects associated with unique objects in the triples represented by π • MinO, AvgO, MaxO • Min/Avg/Max number of distinct objects associated with unique subjects in the triples represented by π • Local vs. global • Local: for patterns • Global: for properties, i.e., all triples with a property [Figure: MinS(π) = 1, AvgS(π) = 1.4 ≈ 1, MaxS(π) = 3; MinO(π) = 1, AvgO(π) = 1.7 ≈ 2] 22
  23. 23. ABSTAT: Cardinality Descriptors For each pattern π • MinS, AvgS, MaxS • Min/Avg/Max number of distinct subjects associated with unique objects in the triples represented by π • MinO, AvgO, MaxO • Min/Avg/Max number of distinct objects associated with unique subjects in the triples represented by π • Local vs. global • Local: for patterns • Global: for properties, i.e., all triples with a property [Example, with descriptors written as [minS,avgS,maxS] and [minO,avgO,maxO] — global, for the property cinematography (Thing → Thing): [1,5,249] [1,1,13]; local, for the pattern (Film, cinematography, Person): [1,14,249] [1,1,7]] 23
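A minimal sketch of how the local subject/object cardinality descriptors defined above could be computed from the (subject, object) pairs of the triples matching a pattern π — function and variable names are illustrative, not ABSTAT's actual implementation:

```python
from collections import defaultdict

def cardinality_descriptors(pairs):
    """Compute (MinS, AvgS, MaxS) and (MinO, AvgO, MaxO) for one pattern.

    pairs: iterable of (subject, object) tuples from triples matching the pattern.
    MinS/AvgS/MaxS: min/avg/max number of distinct subjects per unique object.
    MinO/AvgO/MaxO: min/avg/max number of distinct objects per unique subject.
    """
    subs_per_obj = defaultdict(set)
    objs_per_sub = defaultdict(set)
    for s, o in pairs:
        subs_per_obj[o].add(s)
        objs_per_sub[s].add(o)
    s_counts = [len(subs) for subs in subs_per_obj.values()]
    o_counts = [len(objs) for objs in objs_per_sub.values()]
    stats = lambda counts: (min(counts), sum(counts) / len(counts), max(counts))
    return stats(s_counts), stats(o_counts)
```

For instance, if one object is linked by three distinct subjects and another by one, MaxS is 3 and MinS is 1.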
  25. 25. Cardinality Descriptors for Feature Selection For each pattern π • MinS, AvgS, MaxS • Min/Avg/Max number of distinct subjects associated with unique objects in the triples represented by π • MinO, AvgO, MaxO • Min/Avg/Max number of distinct objects associated with unique subjects in the triples represented by π • Local vs. global • Local: for patterns • Global: for properties, i.e., all triples with a property + frequency! 25
  26. 26. Outline • Feature Selection for Semantics-aware Recommender Systems • Ontology-based Data Summarization with ABSTAT • Feature Selection (ABSTAT vs. Information Gain) • Experiments • Conclusions and Future Work 26
  27. 27. Feature Selection with ABSTAT Pipeline over PATTERNS: FILTERING (by a local cardinality descriptor, e.g., avgS > 1) → PROJECTION (property, MAX(value*)) → RANKING (by value*, e.g., DESC(frequency)) → SELECTION (top-k properties, e.g., k=2) *value = pattern frequency, a local cardinality descriptor, or a combination of the two. 27
  28. 28. Feature Selection with ABSTAT Same pipeline with no filtering, projection (property, MAX(frequency*maxS)), ranking DESC(frequency*maxS), and selection with k=5. *value = pattern frequency, a local cardinality descriptor, or a combination of the two. 28
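The filter → project → rank → select pipeline on these two slides can be sketched as follows (a hypothetical in-memory version over a list of pattern records; in practice the ABSTAT profiles would be queried via its API):

```python
def select_features(patterns, k,
                    value=lambda p: p["frequency"],
                    keep=lambda p: p["avgS"] > 1):
    """Feature selection over ABSTAT patterns (sketch).

    patterns: list of dicts with keys 'property', 'frequency', 'avgS', 'maxS'.
    1. FILTERING:  keep patterns passing a local cardinality filter (default avgS > 1)
    2. PROJECTION: collapse patterns to properties, keeping each property's MAX value
    3. RANKING:    order properties by that value, descending
    4. SELECTION:  return the top-k properties
    """
    best = {}
    for p in filter(keep, patterns):
        v = value(p)
        if v > best.get(p["property"], float("-inf")):
            best[p["property"]] = v
    ranked = sorted(best, key=best.get, reverse=True)
    return ranked[:k]
```

The second configuration on the slide (no filter, frequency*maxS ranking) corresponds to calling it with `keep=lambda p: True` and `value=lambda p: p["frequency"] * p["maxS"]`.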
  29. 29. Feature Selection with IG • Different statistical measures tested: Information Gain, Information Gain Ratio, Chi-squared test • Information Gain: the expected reduction in entropy obtained by splitting the data on the values of a feature • For a feature fi, IG is defined as IG(fi) = E(I) − Σv (|Iv| / |I|) · E(Iv), where E(I) is the entropy of the data, Iv is the set of items in which the feature fi (e.g., director for movies) has value v (e.g., F. F. Coppola in the movie domain), and E(Iv) is the entropy computed on the data where the feature fi assumes value v. 29
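The IG definition above can be sketched directly (names are illustrative; the labels here stand for whatever target the entropy is computed over, e.g., liked vs. not-liked items):

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy E(.) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(items, labels, feature):
    """IG(f) = E(I) - sum_v |I_v|/|I| * E(I_v).

    items:   list of dicts mapping feature -> value
    labels:  class label of each item, aligned with `items`
    feature: the feature f whose gain is computed
    """
    total = entropy(labels)
    n = len(items)
    by_value = {}  # v -> labels of the items where feature == v (the set I_v)
    for item, lab in zip(items, labels):
        by_value.setdefault(item.get(feature), []).append(lab)
    remainder = sum(len(ls) / n * entropy(ls) for ls in by_value.values())
    return total - remainder
```

A feature whose values perfectly separate the labels yields IG equal to the full entropy of the data; a feature independent of the labels yields IG ≈ 0.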
  30. 30. Feature Selection with IG: preprocessing • Manual pre-processing is required • Reduce redundant or irrelevant features that are expected to bring little value to the recommendation task but, at the same time, pose scalability issues

  Dataset      | # features before pre-processing | # features after pre-processing
  Movielens    | 148                              | 34
  LastFM       | 271                              | 25
  LibraryThing | 201                              | 22

  30
  31. 31. Outline • Feature Selection for Semantics-aware Recommender Systems • Ontology-based Data Summarization with ABSTAT • Feature Selection (ABSTAT vs. Information Gain) • Experiments • Conclusions and Future Work 31
  32. 32. Experimental Settings: Recommendation Method • Content-based, using an item-based nearest-neighbors algorithm [Di Noia & al. TIST2016] • Given the set of items rated by the user (= user profile), • the rating is predicted using only the k nearest neighbors of the rated items • Jaccard item similarity • Rating prediction on the k items most similar to the items rated by the user 32
  33. 33. Experimental Settings: Recommendation Method • Jaccard similarity • Measures how many values of the selected features are shared between two items 33
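The Jaccard item similarity can be sketched over the feature-value sets of two items (a minimal sketch; representing an item as a set of (property, value) pairs is an assumption of this example):

```python
def jaccard(features_i, features_j):
    """Jaccard similarity between two items' feature-value sets:
    |A ∩ B| / |A ∪ B|, in [0, 1]."""
    a, b = set(features_i), set(features_j)
    if not a and not b:
        return 0.0  # two items with no features: treat as dissimilar
    return len(a & b) / len(a | b)
```

Two movies sharing one of three distinct (property, value) pairs, e.g. the same director but different genres, get similarity 1/3.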
  34. 34. Experimental Setting: Datasets & Measures Datasets • One-to-one mapping between RecSys benchmarks and DBpedia [Di Noia & al. TIST2016]: • MovieLens → DBpedia (3883 movies) • Last.fm → DBpedia (17632 artists) • LibraryThing → DBpedia (37231 books) • DBpedia-2015-10, including infoboxes (392M triples) Metrics • Accuracy: • Precision@N: fraction of relevant items in the top-N recommendations • MRR@N: average reciprocal rank of the first relevant recommended item • Diversity: • catalog coverage: percentage of items in the catalog recommended at least once • aggregate diversity: aggregate entropy 34
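The two accuracy metrics listed above can be sketched as follows (helper names are illustrative):

```python
def precision_at_n(recommended, relevant, n):
    """Fraction of the top-N recommended items that are relevant."""
    return sum(1 for item in recommended[:n] if item in relevant) / n

def mrr_at_n(recs_per_user, relevant_per_user, n):
    """Mean reciprocal rank of the first relevant item in each user's top-N list.

    recs_per_user:     dict user -> ranked list of recommended items
    relevant_per_user: dict user -> set of relevant items
    """
    reciprocal_ranks = []
    for user, recs in recs_per_user.items():
        rank = next((pos + 1 for pos, item in enumerate(recs[:n])
                     if item in relevant_per_user.get(user, set())), None)
        reciprocal_ranks.append(1.0 / rank if rank else 0.0)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)
```

For example, a top-4 list containing 2 relevant items has Precision@4 = 0.5, and a user whose first relevant item sits at rank 2 contributes a reciprocal rank of 0.5.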
  35. 35. Experimental Setting: Datasets & Measures Is it all about precision? • Novelty • Recommend items in the long tail • Diversity • Avoid recommending only items from a small subset of the catalog • Suggest diverse items in the recommendation list • Serendipity • Suggest unexpected but interesting items 35
  36. 36. Experimental Settings: dbo vs. dbp properties • Number of features/properties: 5, 20 • Which DBpedia? dbo (DBpedia Ontology) vs. dbp (infobox) properties • noRep: best-ranked property between dbo and dbp • withRep: keep duplicates • onlyDbp: in case of duplicates, only the dbp property • onlyDbo: in case of duplicates, only the dbo property • IG vs. ABSTAT configurations:

  Name              | Filter by | Ranking              | Intuition
  AbsFreqAvgS       | avgS > 1  | frequency            | only properties that map at least two distinct subjects to one object (on average), ranked by frequency
  AbsFreq*MaxS      | no filter | frequency*maxS       | favors properties that are more frequent and map a higher number of distinct subjects to one object
  AbsMaxS           | no filter | maxS                 | favors properties that map a higher number of distinct subjects to one object
  Tf-idf (baseline) | no filter | tf-idf over patterns | favors properties that are more peculiar to the domain type

  36
  37. 37. MovieLens • ABSTAT-based FS almost always "statistically better" than statistical measures • Few "local" exceptions • Different configurations optimize different measures (AbsOccAvgS vs. AbsMaxS) • Tf-idf ABSTAT baseline good for aggregate entropy

  Columns: Precision@10, MRR@10, catalogCoverage@10, aggrEntropy@10, each for top-k = 5 / 20 features ("local" and "global" best values are highlighted in the original slide):

  withrep.IG           | .0658 / .1078 | .2192 / .3417 | .3829 / .5280 | 7.56 / 8.50
  withrep.AbsFreqAvgS  | .1059 / .1081 | .3380 / .3477 | .5398 / .5253 | 8.70 / 8.53
  withrep.AbsFreq*MaxS | .0967 / .1074 | .3274 / .3541 | .5962 / .5247 | 8.87 / 8.54
  withrep.AbsMaxS      | .0919 / .1030 | .3065 / .3400 | .6016 / .5698 | 8.96 / 8.66
  withrep.TfIdf        | .0565 / .0851 | .2267 / .3326 | .4347 / .3360 | 8.36 / 7.80
  norep.IG             | .0841 / .1076 | .2961 / .3390 | .3372 / .5226 | 7.94 / 8.44
  norep.AbsFreqAvgS    | .1066 / .1076 | .3388 / .3400 | .5344 / .5208 | 8.68 / 8.45
  norep.AbsMaxS        | .0885 / .1063 | .3075 / .3467 | .6234 / .5550 | 8.99 / 8.60
  norep.TfIdf          | .0823 / .0856 | .2994 / .3123 | .3520 / .3908 | 7.83 / 7.99
  dbo.IG               | .0841 / .1076 | .2961 / .3390 | .3372 / .5226 | 7.94 / 8.44
  dbo.AbsFreqAvgS      | .1066 / .1067 | .3388 / .3402 | .5344 / .5208 | 8.68 / 8.51
  dbo.AbsMaxS          | .0885 / .1059 | .3075 / .3464 | .6234 / .5535 | 8.99 / 8.60
  dbo.TfIdf            | .0823 / .0856 | .2994 / .3123 | .3520 / .3908 | 7.83 / 7.99
  dbp.IG               | .0688 / .1046 | .2134 / .3336 | .2799 / .5065 | 6.54 / 8.31
  dbp.AbsFreqAvgS      | .1065 / .1059 | .3408 / .3360 | .5426 / .5105 | 8.64 / 8.31
  dbp.AbsMaxS          | .0908 / .1030 | .3124 / .3396 | .6219 / .5395 | 8.98 / 8.52
  dbp.TfIdf            | .0549 / .0745 | .1924 / .2687 | .2530 / .3575 | 6.33 / 7.41

  37
  38. 38. LastFM • ABSTAT-based FS better or comparable • In most cases, not "statistically better" • Reminder: still an advantage from not running statistical measures on the full dataset and from needing no manual preprocessing • Tf-idf ABSTAT baseline good for aggregate entropy

  Columns: Precision@10, MRR@10, catalogCoverage@10, aggrEntropy@10, each for top-k = 5 / 20 features ("local" and "global" best values are highlighted in the original slide):

  withrep.IG           | .0501 / .1325 | .2283 / .4102 | 0.4290 / 0.5051 | 11 / 11.18
  withrep.AbsFreqAvgS  | .1330 / .1320 | .4047 / .4105 | 0.4812 / 0.5036 | 11.1 / 11.18
  withrep.AbsFreq*MaxS | .1102 / .1227 | .3649 / .3749 | 0.5500 / 0.5332 | 11.4 / 11.36
  withrep.AbsMaxS      | .0371 / .1156 | .1249 / .3691 | 0.1680 / 0.5440 | 9.79 / 11.39
  withrep.TfIdf        | .1017 / .1158 | .2960 / .3584 | 0.4210 / 0.4602 | 10.86 / 10.97
  norep.IG             | .0501 / .1311 | .2283 / .4040 | 0.429 / 0.5018  | 11 / 11.17
  norep.AbsFreqAvgS    | .1305 / .1307 | .3994 / .4074 | 0.489 / 0.5019  | 11.11 / 11.15
  norep.AbsFreq*MaxS   | .1062 / .1228 | .3546 / .3708 | 0.5362 / 0.5161 | 11.4 / 11.29
  norep.AbsMaxS        | .0392 / .1227 | .1952 / .3715 | 0.452 / 0.5344  | 11.09 / 11.34
  norep.TfIdf          | .1024 / .1132 | .3064 / .3554 | 0.4026 / 0.4508 | 10.76 / 10.96
  dbo.IG               | .0411 / .1319 | .1989 / .4083 | 0.4425 / 0.5053 | 11.06 / 11.2
  dbo.AbsFreqAvgS      | .1283 / .1292 | .3986 / .4063 | 0.4915 / 0.4949 | 11.14 / 11.14
  dbo.AbsFreq*MaxS     | .1062 / .1214 | .3546 / .3710 | 0.5362 / 0.5109 | 11.4 / 11.27
  dbo.AbsMaxS          | .0381 / .1211 | .1927 / .3727 | 0.4291 / 0.5222 | 10.97 / 11.31
  dbo.TfIdf            | .1024 / .1132 | .3064 / .3554 | 0.4026 / 0.4508 | 10.76 / 10.96
  dbp.IG               | .0678 / .1319 | .2553 / .4083 | 0.4364 / 0.5053 | 10.83 / 11.2
  dbp.AbsFreqAvgS      | .1319 / .1316 | .4026 / .4113 | 0.4926 / 0.5055 | 11.14 / 11.2
  dbp.AbsFreq*MaxS     | .1065 / .1239 | .3580 / .3773 | 0.5444 / 0.527  | 11.42 / 11.36
  dbp.AbsMaxS          | .0401 / .1105 | .1969 / .3553 | 0.4528 / 0.5447 | 11.08 / 11.42
  dbp.TfIdf            | .079 / .1170  | .2371 / .3572 | 0.3894 / 0.4698 | 10.69 / 11.04

  38
  39. 39. The Library Thing • ABSTAT-based FS obtains globally better results, but • IG better for catalog coverage with 20 features • IG better for some duplicate-property management strategies • Differences are statistically significant only in some cases • Tf-idf ABSTAT baseline good for aggregate entropy

  Columns: Precision@10, MRR@10, catalogCoverage@10, aggrEntropy@10, each for top-k = 5 / 20 features ("local" and "global" best values are highlighted in the original slide):

  withrep.IG          | .0576 / .0588 | .2348 / .2273 | .3983 / .4034 | 10.47 / 10.44
  withrep.AbsOccAvgS  | .0458 / .0568 | .2003 / .2343 | .3670 / .4014 | 10.28 / 10.50
  withrep.AbsOcc*MaxS | .0457 / .0560 | .2116 / .2355 | .3854 / .3826 | 10.54 / 10.21
  withrep.AbsMaxS     | .0571 / .0567 | .2319 / .2360 | .3689 / .4011 | 10.24 / 10.29
  withrep.TfIdf       | .0215 / .0145 | .1607 / .1202 | .1314 / .2349 | 8.81 / 9.75
  norep.IG            | .0571 / .0579 | .2346 / .2274 | .3988 / .4037 | 10.47 / 10.44
  norep.AbsOccAvgS    | .0561 / .0593 | .2328 / .2329 | .3982 / .4030 | 10.54 / 10.48
  norep.AbsOcc*MaxS   | .0459 / .0570 | .2119 / .2372 | .3852 / .3809 | 10.54 / 10.15
  norep.AbsMaxS       | .0541 / .0567 | .2301 / .2365 | .3653 / .4008 | 10.24 / 10.29
  norep.TfIdf         | .0215 / .0138 | .1608 / .1211 | .1314 / .2877 | 8.81 / 10.26
  dbo.IG              | .0571 / .0579 | .2346 / .2274 | .3988 / .4037 | 10.47 / 10.44
  dbo.AbsOccAvgS      | .0561 / .0593 | .2328 / .2329 | .3982 / .4030 | 10.54 / 10.48
  dbo.AbsOcc*MaxS     | .0459 / .0570 | .2119 / .2372 | .3852 / .3809 | 10.54 / 10.15
  dbo.AbsMaxS         | .0541 / .0567 | .2301 / .2365 | .3653 / .4008 | 10.24 / 10.29
  dbo.TfIdf           | .0579 / .0605 | .2374 / .2477 | .4086 / .3991 | 10.55 / 10.20
  dbp.IG              | .0586 / .0586 | .2350 / .2299 | .4027 / .4043 | 10.49 / 10.4
  dbp.AbsOccAvgS      | .0623 / .0612 | .2467 / .2342 | .3943 / .4043 | 10.42 / 10.45
  dbp.AbsOcc*MaxS     | .0464 / .0606 | .2126 / .2504 | .3862 / .3797 | 10.53 / 10.07
  dbp.AbsMaxS         | .0571 / .0592 | .2318 / .2398 | .3689 / .4002 | 10.24 / 10.22
  dbp.TfIdf           | .0215 / .0132 | .1608 / .1218 | .1314 / .2696 | 8.81 / 9.96

  39
  40. 40. Outline • Feature Selection for Semantics-aware Recommender Systems • Ontology-based Data Summarization with ABSTAT • Feature Selection (ABSTAT vs. Information Gain) • Experiments • Conclusions and Future Work 40
  41. 41. Conclusions & Future Work • Conclusions • Fully automatic feature selection method based on ontology-based knowledge graph summaries (ABSTAT) • Better or, in some cases, comparable to statistical measures, but without requiring computation over the full dataset • Additional evidence of the informativeness of ABSTAT summaries • Future work • Add tf-idf to the ABSTAT statistics • Experiments with additional measures (e.g., graph-based measures with paths longer than 1) • API-based suggestion of the most salient properties for an input entity type 41
  42. 42. Contacts: palmonari@disco.unimib.it - tommaso.dinoia@poliba.it This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreements n. 732003 and n. 732590 Supporting Event and Weather-based Data Analytics and Marketing along the Shopper Journey: www.ew-shopp.eu Enabling the European Business Graph for Innovative Data Products and Services: www.eubusinessgraph.eu/ Experiments & code: https://zenodo.org/record/1205712#.WrRCypPwa3U and http://ow.ly/zAA530d0wu0 ABSTAT (open source) code: https://bitbucket.org/disco_unimib/abstat-core ABSTAT home: abstat.disco.unimib.it 42
  43. 43. Appendix: Explanations for Better/Worse Performance

  Domain | Type       | # Minimal Patterns | Avg # Triples | Variance
  Movies | dbo:Film   | 57757              | 74.02         | 549.31
  Books  | dbo:Book   | 41684              | 44.97         | 169.48
  Music  | dbo:Artist | 40491              | 80.50         | 981.51

  43
  44. 44. Appendix: Explanations for Better/Worse Performance 44 Top 20 selected features for the MovieLens dataset by using the different configurations of IG and AbsFreqAvgS.
