In this work we propose and study an approach to collaborative filtering that is based on Boolean matrix factorisation and exploits additional (context) information about users and items. To avoid the loss of similarity information inherent in a Boolean representation, we use an adjusted projection of the target user onto the obtained factor space.
We compared the proposed method with an SVD-based approach on the MovieLens dataset. The experiments demonstrate that the proposed method achieves better MAE and Precision, and comparable Recall and F-measure. We also report an increase in quality when context information is present.
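As a rough illustration of the kind of adjusted projection the abstract mentions, here is a hedged Python sketch that scores unseen items for a target user by graded overlap with Boolean factors. The factor representation (user-set, item-set pairs) and the weighting rule are assumptions for illustration, not the paper's exact method.

```python
# Hedged sketch: score items for a target user by projecting the user's
# binary ratings onto Boolean factors (user-set, item-set pairs).
# The weighting scheme here is illustrative, not the paper's exact rule.

def score_items(user_items, factors):
    """user_items: set of item ids the target user liked (binary ratings).
    factors: list of (user_set, item_set) pairs from a Boolean factorisation."""
    scores = {}
    for _, item_set in factors:
        overlap = len(user_items & item_set)
        if overlap == 0:
            continue
        weight = overlap / len(item_set)  # graded (adjusted) membership
        for item in item_set - user_items:
            scores[item] = scores.get(item, 0.0) + weight
    return scores

factors = [({1, 2}, {10, 11, 12}), ({2, 3}, {12, 13})]
print(score_items({10, 11}, factors))  # item 12 recommended via factor 1
```

The graded weight avoids the all-or-nothing matching that a strict Boolean projection would impose, which is the similarity loss the abstract alludes to.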
Hybridisation Techniques for Cold-Starting Context-Aware Recommender Systems (Matthias Braunhofer)
Context-Aware Recommender Systems (CARSs) suffer from the cold-start problem, i.e., the inability to provide accurate recommendations for new users, items or contextual situations. In this research, we aim at solving this problem by exploiting various hybridisation techniques, from simple heuristic-based solutions to complex adaptive solutions, in order to take advantage of the strengths of different CARS algorithms while avoiding their weaknesses in a given (cold-start) situation. Our initial research based on offline experiments using various contextually-tagged rating datasets has shown that basic CARS algorithms perform very differently in different recommendation scenarios, and that they can be effectively hybridised to achieve an overall optimal performance. Further research is now required to find the optimal method for hybridisation.
Parsimonious and Adaptive Contextual Information Acquisition in Recommender S... (Matthias Braunhofer)
Context-Aware Recommender System (CARS) models are trained on datasets of context-dependent user preferences (ratings and context information). Since the number of context-dependent preferences increases exponentially with the number of contextual factors, and certain contextual information is still hard to acquire automatically (e.g., the user's mood or for whom the user is buying the searched item), it is fundamental to identify and acquire those factors that truly influence the user preferences and the ratings. In particular, this ensures that (i) the user effort in specifying contextual information is kept to a minimum, and (ii) the system's performance is not negatively impacted by irrelevant contextual information. In this paper, we propose a novel method which, unlike existing ones, directly estimates the impact of context on rating predictions and adaptively identifies the contextual factors that are worth eliciting from the users. Our experimental evaluation shows that it compares favourably to various state-of-the-art context selection methods.
Cold-Start Management with Cross-Domain Collaborative Filtering and Tags (Matthias Braunhofer)
Recommender systems suffer from the new user problem, i.e., the difficulty of making accurate predictions for users who have rated only a few items. Moreover, they usually compute recommendations for items in just one domain, such as movies, music, or books. In this paper we deal with such a cold-start situation by exploiting cross-domain recommendation techniques, i.e., we suggest items to a user in one target domain by using ratings of other users in a completely disjoint auxiliary domain. We present three rating prediction models that make use of information about how users tag items in an auxiliary domain, and how these tags correlate with the ratings, to improve the rating prediction task in a different target domain. We show that the proposed techniques can effectively deal with the considered cold-start situation, provided that the tags used in the two domains overlap.
Recommendation algorithm using reinforcement learning (Arithmer Inc.)
Slides for a study session given by Lu Juanjuan at Arithmer Inc.
They summarise recent methods for recommender systems using reinforcement learning.
Arithmer began at the University of Tokyo Graduate School of Mathematical Sciences. Today, our research in modern mathematics and AI systems enables us to provide solutions to tough, complex issues. At Arithmer we believe it is our job to realize the potential of AI by improving work efficiency and producing more useful results for society.
These slides were used at an Arithmer seminar given by Dr. Masaaki Uesaka at Arithmer Inc.
They summarise recent methods for quality assurance of machine learning models.
The Arithmer Seminar is held weekly; professionals from within our company give lectures on their respective areas of expertise.
In this presentation we present a novel context-aware mobile recommender system for places of interest (POIs). Unlike existing systems, which learn users' preferences solely from their past ratings, it also considers their personality, using the Five Factor Model. Personality is acquired by asking users to complete a brief and entertaining questionnaire as part of the registration process, and is then exploited in: (1) an active learning module that actively acquires ratings-in-context for POIs that users are likely to have experienced, hence reducing the stress and annoyance of rating (or skipping) items that the users don't know; and (2) a recommendation model that builds on matrix factorization and can therefore be trained even if a user hasn't rated any items yet.
Recommender systems aim to predict the content that a user would like based on observations of the online behaviour of its users. Research in the Information Access group addresses different aspects of this problem, varying from how to measure recommendation results, how recommender systems relate to information retrieval models, and how to build effective recommender systems (note: last Friday, we won the ACM RecSys 2013 News Recommender Systems challenge). We would like to develop a general methodology to diagnose weaknesses and strengths of recommender systems. In this talk, I discuss the initial results of an analysis of the core component of collaborative filtering recommenders: the similarity metric used to find the most similar users (neighbours) that will provide the basis for the recommendation to be made. The purpose is to shed light on the question why certain user similarity metrics have been found to perform better than others. We have studied statistics computed over the distance distribution in the neighbourhood as well as properties of the nearest neighbour graph. The features identified correlate strongly with measured prediction performance - however, we have not yet discovered how to deploy this knowledge to actually improve recommendations made.
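The core component the talk analyses, a user similarity metric feeding a nearest-neighbour selection, can be sketched as follows. This is a hedged illustration using cosine similarity over sparse rating dictionaries; the talk studies several metrics, not this one specifically, and the toy data is made up.

```python
import math

# Hedged sketch of the CF core under analysis: cosine similarity over
# sparse user rating vectors, then k nearest neighbours.

def cosine_sim(ratings_a, ratings_b):
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    na = math.sqrt(sum(v * v for v in ratings_a.values()))
    nb = math.sqrt(sum(v * v for v in ratings_b.values()))
    return dot / (na * nb)

def neighbours(target, users, k=2):
    sims = [(cosine_sim(users[target], users[u]), u)
            for u in users if u != target]
    return [u for _, u in sorted(sims, reverse=True)[:k]]

users = {
    "a": {"x": 5, "y": 3},
    "b": {"x": 4, "y": 2},
    "c": {"z": 1},
}
print(neighbours("a", users))  # b shares rated items with a; c does not
```

The statistics the talk examines (distance distributions, nearest-neighbour graph properties) would be computed over the `sims` values and the induced neighbour graph.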
Alleviating cold-user start problem with users' social network data in recomm... (Eduardo Castillejo Gil)
This work explores the possibility of using relevant data from users' social networks to alleviate the cold-user problem in a recommender system domain. The proposed solution extracts the most valuable node in the graph generated by check-ins at venues with an Android application using the Foursquare API. By obtaining the recommendations for this node, we estimate the probability of some categories being similar to users' tastes...
Workload-aware materialization for efficient variable elimination on Bayesian... (Cigdem Aslay)
Bayesian networks are general, well-studied probabilistic models that capture dependencies among a set of variables. Variable Elimination is a fundamental algorithm for probabilistic inference over Bayesian networks. In this paper, we propose a novel materialization method, which can lead to significant efficiency gains when processing inference queries using the Variable Elimination algorithm. In particular, we address the problem of choosing a set of intermediate results to precompute and materialize, so as to maximize the expected efficiency gain over a given query workload. For the problem we consider, we provide an optimal polynomial-time algorithm and discuss alternative methods. We validate our technique using real-world Bayesian networks. Our experimental results confirm that a modest amount of materialization can lead to significant improvements in the running time of queries, with an average gain of 70%, and reaching up to a gain of 99%, for a uniform workload of queries. Moreover, in comparison with existing junction tree methods that also rely on materialization, our approach achieves competitive efficiency during inference using significantly lighter materialization.
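The base algorithm whose intermediate results the paper proposes to materialize can be illustrated on a toy network. This is a hedged sketch of plain Variable Elimination on a two-variable chain A → B with hypothetical probabilities, not the paper's materialization method.

```python
from itertools import product

# Hedged sketch of Variable Elimination on a toy network A -> B,
# with factors stored as {assignment_tuple: prob} over named binary variables.

def multiply(f1, vars1, f2, vars2):
    """Pointwise product of two factors; returns (factor, var list)."""
    out_vars = vars1 + [v for v in vars2 if v not in vars1]
    out = {}
    for assign in product([0, 1], repeat=len(out_vars)):
        a = dict(zip(out_vars, assign))
        out[assign] = (f1[tuple(a[v] for v in vars1)]
                       * f2[tuple(a[v] for v in vars2)])
    return out, out_vars

def sum_out(f, vars_, var):
    """Eliminate `var` by summing it out of factor f."""
    idx = vars_.index(var)
    out = {}
    for assign, p in f.items():
        key = assign[:idx] + assign[idx + 1:]
        out[key] = out.get(key, 0.0) + p
    return out, vars_[:idx] + vars_[idx + 1:]

p_a = {(0,): 0.6, (1,): 0.4}                # P(A)
p_b_a = {(0, 0): 0.9, (0, 1): 0.1,          # P(B|A), key (a, b)
         (1, 0): 0.2, (1, 1): 0.8}
joint, jvars = multiply(p_a, ["A"], p_b_a, ["A", "B"])
p_b, _ = sum_out(joint, jvars, "A")
print(p_b)  # P(B): {(0,): 0.62, (1,): 0.38}
```

The intermediate factors produced by `multiply` and `sum_out` are exactly the kind of results that the paper's workload-aware method would choose to precompute and cache.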
Learning to Rank for Recommender Systems - ACM RecSys 2013 tutorial (Alexandros Karatzoglou)
The slides from the Learning to Rank for Recommender Systems tutorial given at ACM RecSys 2013 in Hong Kong by Alexandros Karatzoglou, Linas Baltrunas and Yue Shi.
A Probabilistic U-Net for Segmentation of Ambiguous Images (Seunghyun Hwang)
Review: A Probabilistic U-Net for Segmentation of Ambiguous Images, by Seunghyun Hwang (Yonsei University, Severance Hospital, Center for Clinical Data Science)
This is the first lecture of a course on Applied Machine Learning. The course focuses on the emerging and modern aspects of this subject, such as Deep Learning, Recurrent and Recursive Neural Networks (RNNs), Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNNs), and Hidden Markov Models (HMMs). It deals with several application areas, such as Natural Language Processing and Image Understanding. This presentation provides the landscape.
Generative Adversarial Networks: Basic architecture and variants (ananth)
In this presentation we review the fundamentals behind GANs and look at different variants. We briefly review the theory (cost functions, training procedure, challenges) and then look at variants such as CycleGAN and SAGAN.
Experimental Economics and Machine Learning workshop (Dmitrii Ignatov)
This presentation summarises recent activities around the organisation of the EEML workshop, a successful event which attracts economists and computer scientists who would like to use recent advances in machine learning and data mining to understand human behavior in different domains related to Economics and Social Science.
NIPS 2016, Tensor-Learn@NIPS, and IEEE ICDM 2016 (Dmitrii Ignatov)
Some photo impressions from NIPS & ICDM 2016 in Barcelona, mixed with workshops such as Learning with Tensors (http://tensor-learn.org/) and related events.
Pattern-based classification of demographic sequences (Dmitrii Ignatov)
We have proposed prefix-based gapless sequential patterns for the classification of demographic sequences. In comparison to black-box machine learning techniques, this approach provides interpretable patterns suitable for treatment by professional demographers. As the pattern language, we use Pattern Structures, an extension of Formal Concept Analysis to complex data such as sequences, graphs, and intervals.
A short introduction to Sequential Pattern Mining, in Russian. We consider frequent and frequent closed sequences along with two algorithms (SPADE and PrefixSpan). A demographic case study is provided as well. One can find links and references to relevant literature and software. We mainly follow the Han & Kamber Data Mining book (2nd edition, Chapter 8.3).
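The prefix-projection idea behind PrefixSpan can be sketched compactly. This is a hedged, simplified version for sequences of single items (no itemset elements), with made-up data; the real algorithm handles itemset sequences and uses pseudo-projection for efficiency.

```python
# Hedged sketch of PrefixSpan-style mining over sequences of single items.

def prefixspan(sequences, min_sup, prefix=()):
    """Return all frequent sequential patterns with support >= min_sup."""
    # Count items occurring in the (projected) sequence database
    counts = {}
    for seq in sequences:
        for item in set(seq):
            counts[item] = counts.get(item, 0) + 1
    patterns = []
    for item, sup in sorted(counts.items()):
        if sup < min_sup:
            continue
        new_prefix = prefix + (item,)
        patterns.append((new_prefix, sup))
        # Project each sequence on the first occurrence of `item`
        projected = [seq[seq.index(item) + 1:]
                     for seq in sequences if item in seq]
        patterns += prefixspan(projected, min_sup, new_prefix)
    return patterns

db = [["a", "b", "c"], ["a", "c"], ["b", "c"]]
for pat, sup in prefixspan(db, min_sup=2):
    print(pat, sup)  # e.g. ('a', 'c') with support 2
```

Each recursive call mines the database projected on the current prefix, so pattern growth never enumerates candidates that cannot occur, which is the key advantage over Apriori-style candidate generation.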
Frequent itemset (product set) mining and association rules (Dmitrii Ignatov)
A brief introduction to association rule analysis in terms of Formal Concept Analysis. Example applications: near-duplicate document detection, website visit analysis, contextual advertising.
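The support/confidence framework that association rule analysis rests on can be shown in a few lines. This is a hedged brute-force sketch on made-up transactions, purely for illustration; practical miners (Apriori, FP-growth, or FCA-based methods) prune the search space instead of enumerating it.

```python
from itertools import combinations

# Hedged brute-force sketch of association rule mining with
# support and confidence thresholds.

def support(itemset, transactions):
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def rules(transactions, min_sup=0.5, min_conf=0.7):
    items = set().union(*transactions)
    out = []
    for size in range(2, len(items) + 1):
        for combo in combinations(sorted(items), size):
            s = support(set(combo), transactions)
            if s < min_sup:
                continue
            for i in range(1, size):
                for lhs in combinations(combo, i):
                    conf = s / support(set(lhs), transactions)
                    if conf >= min_conf:
                        out.append((set(lhs), set(combo) - set(lhs), conf))
    return out

tx = [{"bread", "milk"}, {"bread", "milk", "eggs"}, {"bread"}]
for lhs, rhs, conf in rules(tx):
    print(lhs, "->", rhs, round(conf, 2))  # {'milk'} -> {'bread'} 1.0
```

In FCA terms, rules with confidence 1 correspond to implications that hold in the concept lattice of the transaction context.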
On the Family of Concept Forming Operators in Polyadic FCA (Dmitrii Ignatov)
Triadic Formal Concept Analysis (3FCA) was introduced by Lehmann and Wille almost two decades ago, and many researchers work in Data Mining and Formal Concept Analysis using the notions of closed sets, Galois and closure operators, and closure systems. However, to date, even though different researchers actively work on mining triadic and n-ary relations, a proper closure operator for the enumeration of triconcepts, i.e. maximal triadic cliques of tripartite hypergraphs, has not been introduced. In this talk we show that the previously introduced operators for obtaining triconcepts are not always consistent, describe their family, and study their properties. We also introduce the notion of a maximal switching generator to explain why such concept-forming operators are not closure operators: they violate the monotonicity property.
Pattern Mining and Machine Learning for Demographic Sequences (Dmitrii Ignatov)
In this talk, we present the results of our first studies in application of pattern mining and machine learning to analysis of demographic sequences in Russia based on data of 11 generations from 1930 till 1984. The main goal is not prediction and data mining methods themselves, but rather extraction of interesting patterns and knowledge acquisition from substantial datasets of demographic data. We use decision trees as techniques for demographic events prediction and emerging patterns for searching significant and potentially useful sequences.
AIST is a scientific conference on Analysis of Images, Social Networks, and Texts. The conference is intended for computer scientists and practitioners whose research interests involve Internet mathematics and other related fields of data science. Similar to the previous year, the conference will be focused on applications of data mining and machine learning techniques to various problem domains: image processing, analysis of social networks, and natural language processing. We hope that the participants will benefit from the interdisciplinary nature of the conference and exchange experience.
In our previous work, an efficient one-pass online algorithm for the triclustering of binary data (triadic formal contexts) was proposed. This algorithm is a modified version of the basic algorithm for the OAC-triclustering approach; it has linear time and memory complexities. In this paper we parallelise it via the MapReduce framework in order to make it suitable for big datasets. The results of computer experiments show the efficiency of the proposed algorithm; for example, it outperforms the online counterpart on the Bibsonomy dataset with ≈800,000 triples.
This paper presents an interesting idea: computing a consensus of several k-partitions of a set by finding an antichain in the concept lattice of an appropriate formal context.
Searching for optimal patterns in Boolean tensors (Dmitrii Ignatov)
These are our slides for a spotlight talk at the Learning with Tensors workshop at NIPS 2016. We briefly summarise a comparison of five different triclustering algorithms (TRIAS, TriBox, OACPrime, OACBox, and SpecTric).
RAPS: A Recommender Algorithm Based on Pattern Structures (Dmitrii Ignatov)
We propose a new algorithm for recommender systems with numeric ratings which is based on Pattern Structures (RAPS). As input, the algorithm takes a rating matrix, e.g., one that contains movies rated by users. For a target user, the algorithm returns a ranked list of items (movies) based on the user's previous ratings and the ratings of other users. We compare the results of the proposed algorithm, in terms of precision and recall, with Slope One, one of the state-of-the-art item-based algorithms, on the MovieLens dataset; RAPS demonstrates the best or comparable quality.
A One-Pass Triclustering Approach: Is There any Room for Big Data? (Dmitrii Ignatov)
An efficient one-pass online algorithm for triclustering of binary data (triadic formal contexts) is proposed. This algorithm is a modified version of the basic algorithm for the OAC-triclustering approach, but it has linear time and memory complexities with respect to the cardinality of the underlying ternary relation and can be easily parallelized in order to be applied for the analysis of big datasets. The results of computer experiments show the efficiency of the proposed algorithm.
Boolean matrix factorisation for collaborative filtering (Dmitrii Ignatov)
We propose a new approach to collaborative filtering which is based on Boolean Matrix Factorisation (BMF) and Formal Concept Analysis. In a series of experiments on real data (the MovieLens dataset) we compare the approach with an SVD-based one in terms of Mean Absolute Error (MAE). One of the experimental findings is that binary-scaled rating data is enough for BMF to obtain almost the same MAE as the SVD-based algorithm achieves on non-scaled data.
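The evaluation setup described, binary scaling of ratings plus MAE measurement, can be sketched in a few lines. The threshold value and data are illustrative assumptions, not the paper's exact protocol.

```python
# Hedged sketch of the evaluation setup: binary-scale ratings at a
# threshold, then measure Mean Absolute Error (MAE) of predictions.

def binarise(ratings, threshold=3.5):
    """Map numeric ratings to {0, 1} at a (hypothetical) like-threshold."""
    return {item: int(r >= threshold) for item, r in ratings.items()}

def mae(predictions, actual):
    common = set(predictions) & set(actual)
    return sum(abs(predictions[i] - actual[i]) for i in common) / len(common)

actual = {"m1": 4.0, "m2": 2.0, "m3": 5.0}
print(binarise(actual))                                # {'m1': 1, 'm2': 0, 'm3': 1}
print(mae({"m1": 3.5, "m2": 2.5, "m3": 4.0}, actual))  # 0.666...
```

The abstract's point is that running BMF on the binarised matrix loses little MAE compared to running SVD on the raw numeric matrix.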
Recommendation and Information Retrieval: Two Sides of the Same Coin? (Arjen de Vries)
A status update on our current understanding of how collaborative filtering relates far more closely to information retrieval than usually thought. Includes work by Jun Wang and Alejandro Bellogín. This presentation was given at the SIKS PhD student course on computational intelligence, May 24th, 2013.
Online Recommender System for Radio Station Hosting: Experimental Results Rev... (Dmitrii Ignatov)
We present a new recommender system developed for the Russian interactive radio network FMhost, based on a previously proposed model. The underlying model combines a collaborative user-based approach with information from the tags of listened tracks in order to match user and radio station profiles. It follows an adaptive online learning strategy based on the user history. We compare the proposed algorithms with an industry-standard technique based on singular value decomposition (SVD) in terms of precision, recall, and NDCG measures; experiments show that in our case the fusion-based approach shows the best results.
Contextual Information Elicitation in Travel Recommender Systems (Matthias Braunhofer)
Context-Aware Recommender Systems are advisory applications that exploit users’ preference knowledge contained in datasets of context-dependent user ratings, i.e., ratings augmented with the description of the contextual situation detected when the user experienced the item and rated it. Since the space of context-dependent ratings increases exponentially in size with the number of contextual factors, and because certain contextual information is still hard to acquire automatically (e.g., the user’s mood or the travellers’ group composition), it is fundamental to identify and acquire only those factors that truly influence the user preferences and consequently the ratings and the recommendations. In this paper, we propose a novel method that estimates the impact of a contextual factor on rating predictions and adaptively elicits from the users only the relevant ones. Our experimental evaluation, on two travel-related datasets, shows that our method compares favorably to other state-of-the-art context selection methods.
Splay Method of Model Acquisition Assessment (ijtsrd)
The study of an object under investigation by means of a mathematical model is called modeling. The purpose of modeling is to determine the properties of the object under study, to model it, and to assess its condition. Turapov U. U | Isroilov U. B | Raxmatov A. SH | Egamov S. M | Isabekov B. I, "Splay-Method of Model Acquisition Assessment", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5, Issue-1, December 2020. URL: https://www.ijtsrd.com/papers/ijtsrd38141.pdf ; Paper URL: https://www.ijtsrd.com/other-scientific-research-area/other/38141/splaymethod-of-model-acquisition-assessment/turapov-u-u
Recommender systems analyze patterns of user interest in products to provide personalized recommendations. They seek to predict the rating or preference that a user would give to an item. Some of the most successful realizations of latent factor models are based on matrix factorization...
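The latent factor idea the excerpt mentions can be sketched as a tiny matrix factorisation trained by stochastic gradient descent. This is a hedged illustration; the data, names, and hyperparameters are made up, and production systems use biases, better optimisers, and far more data.

```python
import random

# Hedged sketch of a latent factor model trained by SGD: learn user and
# item factor vectors whose dot product approximates observed ratings.

def train_mf(ratings, n_factors=2, lr=0.05, reg=0.02, epochs=1000, seed=0):
    rng = random.Random(seed)
    P = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)]
         for u in {u for u, _, _ in ratings}}
    Q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)]
         for i in {i for _, i, _ in ratings}}
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)  # gradient step + L2 reg
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

data = [("u1", "i1", 5.0), ("u1", "i2", 1.0), ("u2", "i1", 4.0)]
P, Q = train_mf(data)
# predict an unseen (user, item) pair from the learned factors
print(sum(a * b for a, b in zip(P["u2"], Q["i2"])))
```

The prediction for the unseen pair ("u2", "i2") is the whole point: the shared factor space generalises beyond the observed entries.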
Selecting scholarship recipients among students is one of the responsibilities held by stakeholders at the high-school leadership level. The decision-making stage consists of checking compliance with the terms and criteria set by the government as the scholarship provider. Implementing decision-support methods for the selection of scholarship recipients is therefore required, and can help the leadership make a better selection. Many decision-support methods can solve such problems, including the preference selection index. Using the preference selection index in a decision support system results in more effective decisions.
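One common formulation of the Preference Selection Index (normalise, derive criterion weights from preference variation, then score alternatives) can be sketched as follows. The criteria, values, and benefit/cost split here are hypothetical, and the exact PSI variant used by the paper may differ.

```python
# Hedged sketch of one common Preference Selection Index (PSI) formulation
# for ranking candidates; all data below is hypothetical.

def psi_rank(matrix, benefit):
    """matrix[i][j]: candidate i's value on criterion j;
    benefit[j]: True if larger is better on criterion j."""
    n, m = len(matrix), len(matrix[0])
    # 1. Normalise each criterion to (0, 1]
    R = [[0.0] * m for _ in range(n)]
    for j in range(m):
        col = [row[j] for row in matrix]
        for i in range(n):
            R[i][j] = (matrix[i][j] / max(col) if benefit[j]
                       else min(col) / matrix[i][j])
    # 2. Criterion weights from preference variation (low variation -> high weight)
    weights = []
    for j in range(m):
        mean = sum(R[i][j] for i in range(n)) / n
        pv = sum((R[i][j] - mean) ** 2 for i in range(n))
        weights.append(1 - pv)
    total = sum(weights)
    weights = [w / total for w in weights]
    # 3. Overall preference index per candidate
    return [sum(weights[j] * R[i][j] for j in range(m)) for i in range(n)]

# grade point average (benefit criterion), family income (cost criterion)
scores = psi_rank([[3.8, 200], [3.2, 120]], benefit=[True, False])
print(scores)  # higher score = preferred candidate
```

The candidate with the highest overall index is selected, so the method needs no externally supplied criterion weights, which is PSI's selling point over methods like AHP.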
Introduction to SEM (Structural Equation Models) - invited talk at the seminar "Analyzing and Interpreting Data" organized by the Finnish Doctoral Programme in Education and Learning (15 May 2013) in Vuosaari, Helsinki, Finland. Acknowledgements to Barbara Byrne for an excellent intro book of SEM.
The Comparative Study of Gray Model and Markov Model in Pavement Performance ... (IJERA Editor)
Pavement performance prediction is an essential component of a pavement maintenance management system, as it directly affects the choice of maintenance measures and funds. First, this paper uses the Grey theoretical model to predict the damage status of a certain highway pavement. Second, the Grey theory and the Markov prediction method are combined to forecast it. Finally, a comparison of the results of the two approaches analyses their similarities and differences. The results show that the Grey theoretical model is more suitable for short-term forecasting, while the combined method is suitable for longer-term forecasting.
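The Grey model in question is typically the classic GM(1,1): accumulate the series, fit the grey differential equation by least squares, and difference the fitted curve back. This is a hedged sketch with made-up pavement condition values, not the paper's data or exact variant.

```python
import math

# Hedged sketch of the classic GM(1,1) grey prediction model.

def gm11(x0, steps=1):
    n = len(x0)
    x1 = [sum(x0[:k + 1]) for k in range(n)]              # accumulated series
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]  # background values
    y = x0[1:]
    # Least squares for y_k = -a * z_k + b via the normal equations
    m = n - 1
    szz, sz, sy = sum(v * v for v in z), sum(z), sum(y)
    szy = sum(v * w for v, w in zip(z, y))
    det = szz * m - sz * sz
    a = -(m * szy - sz * sy) / det
    b = (szz * sy - sz * szy) / det

    def x1_hat(k):  # fitted accumulated value at 0-based index k
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Forecast by differencing the fitted accumulated series
    return [x1_hat(n + s) - x1_hat(n + s - 1) for s in range(steps)]

pci = [92.0, 90.5, 88.9, 87.2]  # hypothetical pavement condition index
print(gm11(pci, steps=2))       # next two forecast values
```

Because GM(1,1) fits an exponential trend, it tracks smooth short-term deterioration well; coupling it with a Markov chain over residual states is what the paper uses to extend the forecasting horizon.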
Interpretable Concept-Based Classification with Shapley ValuesDmitrii Ignatov
The slides contain our talk on Shapley values as an interpretable Machine learning technique for JSM-method, a rule-based classification and reasoning technique, for ranking particular attributes of an undetermined example under classification.
https://doi.org/10.1007/978-3-030-57855-8_7
These are opening slides of the 8th International Conference on Analysis of Images, Social Networks and Texts (AIST 2019). We summarise general facts on AIST conf. series. See http://aistconf.org website for more details.
Turning Krimp into a Triclustering Technique on Sets of Attribute-Condition P...Dmitrii Ignatov
Mining ternary relations or triadic Boolean tensors is one of the recent trends in knowledge discovery that allows one to take into account various modalities of input object-attribute data.
For example, in movie databases like IMBD, an analyst may find not only movies grouped by specific genres but see their common keywords. In the so called folksonomies, users can be grouped according to their shared resources and used tags. In gene expression analysis, genes can be grouped along with samples of tissues and time intervals providing comprehensible patterns. However, pattern explosion effects even with one more dimension are seriously aggravated. In this paper, we continue our previous study on searching for a smaller collection of ``optimal'' patterns in triadic data with respect to a set of quality criteria such as patterns' cardinality, density, diversity, coverage, etc. We show how a simple data preprocessing has enabled us to use the frequent itemset mining algorithm.
Social Learning in Networks: Extraction Deterministic RulesDmitrii Ignatov
In this talk, we want to introduce experimental
economics to the field of data mining and vice versa. It continues
related work on mining deterministic behavior rules of human
subjects in data gathered from experiments. Game-theoretic
predictions partially fail to work with this data. Equilibria also
known as game-theoretic predictions solely succeed with experienced
subjects in specific games – conditions, which are rarely
given. Contemporary experimental economics offers a number of
alternative models apart from game theory. In relevant literature,
these models are always biased by philosophical plausibility
considerations and are claimed to fit the data. An agnostic
data mining approach to the problem is introduced in this
paper – the philosophical plausibility considerations follow after
the correlations are found. No other biases are regarded apart
from determinism. The dataset of the paper “Social Learning in
Networks” by Choi et al 2012 is taken for evaluation. As a result,
we come up with new findings. As future work, the design of a
new infrastructure is discussed.
Professional air quality monitoring systems provide immediate, on-site data for analysis, compliance, and decision-making.
Monitor common gases, weather parameters, particulates.
The increased availability of biomedical data, particularly in the public domain, offers the opportunity to better understand human health and to develop effective therapeutics for a wide range of unmet medical needs. However, data scientists remain stymied by the fact that data remain hard to find and to productively reuse because data and their metadata i) are wholly inaccessible, ii) are in non-standard or incompatible representations, iii) do not conform to community standards, and iv) have unclear or highly restricted terms and conditions that preclude legitimate reuse. These limitations require a rethink on data can be made machine and AI-ready - the key motivation behind the FAIR Guiding Principles. Concurrently, while recent efforts have explored the use of deep learning to fuse disparate data into predictive models for a wide range of biomedical applications, these models often fail even when the correct answer is already known, and fail to explain individual predictions in terms that data scientists can appreciate. These limitations suggest that new methods to produce practical artificial intelligence are still needed.
In this talk, I will discuss our work in (1) building an integrative knowledge infrastructure to prepare FAIR and "AI-ready" data and services along with (2) neurosymbolic AI methods to improve the quality of predictions and to generate plausible explanations. Attention is given to standards, platforms, and methods to wrangle knowledge into simple, but effective semantic and latent representations, and to make these available into standards-compliant and discoverable interfaces that can be used in model building, validation, and explanation. Our work, and those of others in the field, creates a baseline for building trustworthy and easy to deploy AI models in biomedicine.
Bio
Dr. Michel Dumontier is the Distinguished Professor of Data Science at Maastricht University, founder and executive director of the Institute of Data Science, and co-founder of the FAIR (Findable, Accessible, Interoperable and Reusable) data principles. His research explores socio-technological approaches for responsible discovery science, which includes collaborative multi-modal knowledge graphs, privacy-preserving distributed data mining, and AI methods for drug discovery and personalized medicine. His work is supported through the Dutch National Research Agenda, the Netherlands Organisation for Scientific Research, Horizon Europe, the European Open Science Cloud, the US National Institutes of Health, and a Marie-Curie Innovative Training Network. He is the editor-in-chief for the journal Data Science and is internationally recognized for his contributions in bioinformatics, biomedical informatics, and semantic technologies including ontologies and linked data.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.Sérgio Sacani
The return of a sample of near-surface atmosphere from Mars would facilitate answers to several first-order science questions surrounding the formation and evolution of the planet. One of the important aspects of terrestrial planet formation in general is the role that primary atmospheres played in influencing the chemistry and structure of the planets and their antecedents. Studies of the martian atmosphere can be used to investigate the role of a primary atmosphere in its history. Atmosphere samples would also inform our understanding of the near-surface chemistry of the planet, and ultimately the prospects for life. High-precision isotopic analyses of constituent gases are needed to address these questions, requiring that the analyses are made on returned samples rather than in situ.
Multi-source connectivity as the driver of solar wind variability in the heli...Sérgio Sacani
The ambient solar wind that flls the heliosphere originates from multiple
sources in the solar corona and is highly structured. It is often described
as high-speed, relatively homogeneous, plasma streams from coronal
holes and slow-speed, highly variable, streams whose source regions are
under debate. A key goal of ESA/NASA’s Solar Orbiter mission is to identify
solar wind sources and understand what drives the complexity seen in the
heliosphere. By combining magnetic feld modelling and spectroscopic
techniques with high-resolution observations and measurements, we show
that the solar wind variability detected in situ by Solar Orbiter in March
2022 is driven by spatio-temporal changes in the magnetic connectivity to
multiple sources in the solar atmosphere. The magnetic feld footpoints
connected to the spacecraft moved from the boundaries of a coronal hole
to one active region (12961) and then across to another region (12957). This
is refected in the in situ measurements, which show the transition from fast
to highly Alfvénic then to slow solar wind that is disrupted by the arrival of
a coronal mass ejection. Our results describe solar wind variability at 0.5 au
but are applicable to near-Earth observatories.
Slide 1: Title Slide
Extrachromosomal Inheritance
Slide 2: Introduction to Extrachromosomal Inheritance
Definition: Extrachromosomal inheritance refers to the transmission of genetic material that is not found within the nucleus.
Key Components: Involves genes located in mitochondria, chloroplasts, and plasmids.
Slide 3: Mitochondrial Inheritance
Mitochondria: Organelles responsible for energy production.
Mitochondrial DNA (mtDNA): Circular DNA molecule found in mitochondria.
Inheritance Pattern: Maternally inherited, meaning it is passed from mothers to all their offspring.
Diseases: Examples include Leber’s hereditary optic neuropathy (LHON) and mitochondrial myopathy.
Slide 4: Chloroplast Inheritance
Chloroplasts: Organelles responsible for photosynthesis in plants.
Chloroplast DNA (cpDNA): Circular DNA molecule found in chloroplasts.
Inheritance Pattern: Often maternally inherited in most plants, but can vary in some species.
Examples: Variegation in plants, where leaf color patterns are determined by chloroplast DNA.
Slide 5: Plasmid Inheritance
Plasmids: Small, circular DNA molecules found in bacteria and some eukaryotes.
Features: Can carry antibiotic resistance genes and can be transferred between cells through processes like conjugation.
Significance: Important in biotechnology for gene cloning and genetic engineering.
Slide 6: Mechanisms of Extrachromosomal Inheritance
Non-Mendelian Patterns: Do not follow Mendel’s laws of inheritance.
Cytoplasmic Segregation: During cell division, organelles like mitochondria and chloroplasts are randomly distributed to daughter cells.
Heteroplasmy: Presence of more than one type of organellar genome within a cell, leading to variation in expression.
Slide 7: Examples of Extrachromosomal Inheritance
Four O’clock Plant (Mirabilis jalapa): Shows variegated leaves due to different cpDNA in leaf cells.
Petite Mutants in Yeast: Result from mutations in mitochondrial DNA affecting respiration.
Slide 8: Importance of Extrachromosomal Inheritance
Evolution: Provides insight into the evolution of eukaryotic cells.
Medicine: Understanding mitochondrial inheritance helps in diagnosing and treating mitochondrial diseases.
Agriculture: Chloroplast inheritance can be used in plant breeding and genetic modification.
Slide 9: Recent Research and Advances
Gene Editing: Techniques like CRISPR-Cas9 are being used to edit mitochondrial and chloroplast DNA.
Therapies: Development of mitochondrial replacement therapy (MRT) for preventing mitochondrial diseases.
Slide 10: Conclusion
Summary: Extrachromosomal inheritance involves the transmission of genetic material outside the nucleus and plays a crucial role in genetics, medicine, and biotechnology.
Future Directions: Continued research and technological advancements hold promise for new treatments and applications.
Slide 11: Questions and Discussion
Invite Audience: Open the floor for any questions or further discussion on the topic.
Seminar of U.V. Spectroscopy by SAMIR PANDASAMIR PANDA
Spectroscopy is a branch of science dealing the study of interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflect spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that can measure the amount of light received by the analyte.
1. Context-Aware Recommender System Based on
Boolean Matrix Factorisation
Marat Akhmatnurov and Dmitry I. Ignatov
National Research University Higher School of Economics, Moscow, Russia
Faculty of Computer Science
October 13–16, CLA 2015
Clermont-Ferrand, France
2. Outline
Problem Statement
Contextual information
Singular Value Decomposition
Related work
Boolean Matrix Factorisation
Quality evaluation
Conclusion and future work
Akhmatnurov & Ignatov Higher School of Economics CLA 2015 2 / 29
4. Contextual information
[Adomavicius & Tuzhilin, 2005]
I = \begin{pmatrix} R & C_{user} \\ C_{item} & O \end{pmatrix},
[Table: a toy joint context. Rows u1–u7 hold 5-star ratings for movies m1–m6 together with user context (Sex: M/F; Age: 0–20, 21–45, 46+); rows Drama, Action and Comedy mark each movie's genres.]
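The block structure of the joint context can be sketched with NumPy; the blocks below are made-up toy data (2 users, 3 movies, 2 user attributes, 1 genre), not the MovieLens context from the example table.

```python
import numpy as np

# R: binarised user-movie ratings; C_user / C_item: context blocks;
# O: empty genre-vs-user-attribute block.
R = np.array([[1, 0, 1],
              [0, 1, 1]])
C_user = np.array([[1, 0],
                   [0, 1]])        # e.g. Sex: M/F
C_item = np.array([[1, 1, 0]])     # e.g. genre: Drama
O = np.zeros((1, 2), dtype=int)

# Assemble I = [[R, C_user], [C_item, O]] as one Boolean matrix.
I = np.block([[R, C_user],
              [C_item, O]])
print(I.shape)  # (3, 5)
```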
5. Singular Value Decomposition
SVD is the de facto standard in the RS domain [Koren et al., 2009]
Singular Value Decomposition (SVD) is a decomposition of a rectangular matrix A ∈ R^{m×n} (m > n) into a product of three matrices

A = U \begin{pmatrix} \Sigma \\ 0 \end{pmatrix} V^T,
where U ∈ R^{m×m} and V ∈ R^{n×n} are orthogonal matrices, and Σ ∈ R^{n×n} is a diagonal matrix such that Σ = diag(σ_1, …, σ_n) and σ_1 ≥ σ_2 ≥ … ≥ σ_n ≥ 0. The columns of U and V are called singular vectors, and the numbers σ_i are the singular values.
2mn^2 + 2n^3 floating-point operations [Trefethen et al., 1997]
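A minimal NumPy sketch of the decomposition and its rank-k truncation, on a made-up toy rating matrix (not the SVD baseline used in the experiments):

```python
import numpy as np

# A toy user-item rating matrix (rows: users, columns: items).
A = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.],
              [1., 0., 0., 4.]])

# Thin SVD: A = U diag(s) Vt, singular values in non-increasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
assert np.all(s[:-1] >= s[1:])  # sigma_1 >= sigma_2 >= ... >= 0

# Rank-k truncation keeps the k largest singular values (latent factors).
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(np.round(A_k, 2))
```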
6. Related work
What about Formal Concept Analysis?
• du Boucher-Ryan et al., Collaborative recommending using
Formal Concept Analysis. Knowledge-Based Systems (2006)
• Jäschke et al., Folksonomy (BibSonomy) recommendations and
mining, since 2006
• Ignatov et al., Concept-based recommendations for internet
advertisement. CLA 2008
• Symeonidis et al., Nearest-biclusters collaborative filtering
based on constant and coherent values. Information Retrieval
(2008)
• Ignatov et al., Concept-Based Biclustering for Internet
Advertisement. IEEE ICDMW 2012
7. Related work
What about Formal Concept Analysis?
• Jelassi et al., A personalized recommender system based on
users’ information in folksonomies. WWW 2013
• Alqadah et al., Biclustering neighborhood-based collaborative
filtering method for top-n recommender systems. Knowledge
and Information Systems (2014)
• Ignatov et al. Boolean Matrix Factorisation for Collaborative
Filtering: An FCA-Based Approach. AIMSA 2014 (FCA meets
IR @ ECIR 2013)
• Ignatov et al. RAPS: A recommender algorithm based on
pattern structures. FCA4AI@IJCAI 2015
8. Boolean Matrix Factorisation
Formal Concept Analysis [Wille, 1982; Ganter & Wille, 1999]
A formal context K is a triple (G, M, I), where G is a set of
objects, M is a set of attributes, I ⊆ G × M is an incidence
relation. We write gIm when the object g ∈ G has the attribute
m ∈ M
Derivation (Galois) operators:
For A ⊆ G and B ⊆ M we have

A′ = {m ∈ M | gIm for all g ∈ A},
B′ = {g ∈ G | gIm for all m ∈ B}.

A formal concept of the formal context K = (G, M, I) is a pair (A, B) such that A ⊆ G, B ⊆ M, A′ = B and B′ = A.
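The derivation operators can be sketched in a few lines of Python; the tiny context below is a made-up example, not data from the talk.

```python
# A toy formal context K = (G, M, I): users as objects, movies as attributes.
G = {'u1', 'u2'}
M = {'m1', 'm2', 'm3'}
I = {('u1', 'm1'), ('u1', 'm2'), ('u2', 'm2'), ('u2', 'm3')}

def prime_objects(A):
    """A' = attributes shared by all objects in A."""
    return {m for m in M if all((g, m) in I for g in A)}

def prime_attributes(B):
    """B' = objects having all attributes in B."""
    return {g for g in G if all((g, m) in I for m in B)}

# (A, B) is a formal concept iff A' = B and B' = A.
A, B = {'u1', 'u2'}, {'m2'}
print(prime_objects(A) == B and prime_attributes(B) == A)  # True
```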
9. Boolean Matrix Factorisation
Formal Concept Analysis
B(G, M, I) is the set of all formal concepts of a context
K = (G, M, I).
F = {(A_1, B_1), …, (A_k, B_k)} ⊆ B(G, M, I)

(P_F)_{il} = \begin{cases} 1, & i \in A_l \\ 0, & \text{otherwise} \end{cases}, \quad l = 1, …, k,

(Q_F)_{lj} = \begin{cases} 1, & j \in B_l \\ 0, & \text{otherwise} \end{cases}, \quad l = 1, …, k.
10. Boolean Matrix Factorisation
[Belohlavek & Vychodil, 2010]
Boolean matrix factorisation is a decomposition of the input binary
matrix I ∈ {0, 1}^{m×n} into a product of two binary matrices
P ∈ {0, 1}^{m×k} and Q ∈ {0, 1}^{k×n} by the following rule:

(P ◦ Q)_{ij} = \bigvee_{l=1}^{k} P_{il} ∧ Q_{lj}
Theorem 1 (Universality of formal concepts as factors)
For every binary matrix I there is F ⊆ B(G, M, I) such that I = P_F ◦ Q_F.
Theorem 2 (Optimality of formal concepts as factors)
Let I = P ◦ Q be a decomposition of I ∈ {0, 1}^{m×n}, where P ∈ {0, 1}^{m×k} and Q ∈ {0, 1}^{k×n}. Then there exists F ⊆ B(G, M, I) such that |F| ≤ k and I = P_F ◦ Q_F.
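The Boolean product ∘ replaces arithmetic sums with logical OR; a minimal sketch with hand-picked toy factors P and Q:

```python
import numpy as np

# Hand-picked toy factors: 3 objects x 2 factors, 2 factors x 3 attributes.
P = np.array([[1, 0],
              [1, 1],
              [0, 1]], dtype=bool)
Q = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=bool)

# (P o Q)_ij = OR over l of (P_il AND Q_lj): accumulate one outer
# product per factor with logical OR.
I = np.zeros((P.shape[0], Q.shape[1]), dtype=bool)
for l in range(P.shape[1]):
    I |= np.outer(P[:, l], Q[l, :])

print(I.astype(int))
```

Each factor l contributes the "rectangle" A_l × B_l of ones, so the product covers I exactly when the chosen concepts cover all its crosses.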
11. Boolean Matrix Factorisation
Searching for formal concepts as factors
• Greedy algorithm (Belohlavek & Vychodil, 2010);
O(k|G||M|^3), where k is the number of found factors.
• Close-by-One (CbO) algorithm (Kuznetsov S.O., 1993);
O(|G||M|^2|L|)
• CbO modification with balanced factors (concepts):

W = \frac{2|A||B|}{|A|^2 + |B|^2}, where (A, B) ∈ B(G, M, I)
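The balance weight is maximal (W = 1) for "square" concepts with |A| = |B| and decreases for very tall or very wide ones; a one-function sketch:

```python
# W = 2|A||B| / (|A|^2 + |B|^2) for a concept (A, B); toy sets for illustration.
def balance_weight(extent, intent):
    a, b = len(extent), len(intent)
    return 2 * a * b / (a ** 2 + b ** 2)

print(balance_weight({'u1', 'u2'}, {'m1', 'm2'}))             # 1.0, |A| = |B|
print(round(balance_weight({'u1'}, {'m1', 'm2', 'm3'}), 2))   # 0.6, unbalanced
```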
16. Quality evaluation of the proposed approach
Data
MovieLens-100k dataset
• 100 000 ratings (5-star scale)
• 943 users
• Gender
• Age
• Occupation (21 categories)
• ZIP
• 1682 movies
• 19 genres
Five-star ratings are converted to the Boolean scale:

I_{um} = \begin{cases} 1, & r_{um} > 3, \\ 0, & \text{otherwise} \end{cases}
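The binarisation rule is a one-liner in NumPy; the ratings below are made up (0 marks a missing rating), not MovieLens data:

```python
import numpy as np

# Toy 5-star rating matrix; threshold at r > 3 to get the Boolean context.
R = np.array([[5, 3, 0, 1],
              [4, 0, 4, 2],
              [1, 5, 0, 5]])

I = (R > 3).astype(int)
print(I)
```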
17. Quality evaluation
Criteria
• Mean Absolute Error
MAE = \frac{\sum_{(u,m) \in (U×M) ∩ I_{test}} |\hat{r}_{um} − r_{um}|}{|I_{test}|}
• Precision
P = \frac{TP}{TP + FP}
• Recall
R = \frac{TP}{TP + FN}
• F-measure (F1-measure)
F = \frac{2PR}{P + R}
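These criteria can be sketched on made-up labels and ratings (not results from the evaluation):

```python
import numpy as np

# Toy Boolean relevance labels for a test set.
y_true = np.array([1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Toy true vs. predicted ratings for MAE.
r_true = np.array([5, 4, 2, 4, 1, 3])
r_pred = np.array([4, 4, 3, 5, 2, 3])
mae = np.mean(np.abs(r_pred - r_true))

print(round(precision, 3), round(recall, 3), round(f1, 3), round(mae, 3))
```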
25. Quality evaluation
Comparison with SVD
[Four plots: MAE, F-measure, Precision and Recall vs. the number of neighbours (0–100), comparing BMF80 + context, SVD85 + context and SVD85.]
26. Conclusion
• We have proposed a rather straightforward BMF-based approach to recommendations with contextual information.
• The MAE of our BMF-based approach is substantially lower than that of the SVD-based approach for almost the same number of factors at a fixed coverage level.
• The Precision of the BMF-based approach is slightly lower when the number of neighbours is about a couple of dozen, and comparable over the remaining observed range.
• The Recall is lower, which results in a lower F-measure.
27. Conclusion
• The greedy algorithm of Belohlavek & Vychodil results in more factors with larger extents, but it is faster and shows almost the same quality as the balanced-factor search by CbO.
• Our weighted projection alleviates the information loss of the Boolean projection and results in a substantial quality gain.
• Contextual information yields a small quality increase (about 1–2%) in terms of MAE and Precision, but not in Recall and F-measure.
28. Future work
• Incorporation of time and location as contextual information
• Treatment of contextual information by means of Triadic FCA
and triclustering
• Comparison, usage, extension of the following works:
• [Jelassi et al., 2013] Recommendations in personalised folksonomies
• [Belohlavek et al., 2013] Factorization of three-way binary data using triadic concepts
• [Trnecka et al., 2014] Multi-Relational Boolean Factor Analysis
• [Belohlavek et al., 2015; Metzler et al., 2015; Nourine et al., 2015] Efficient Boolean matrix factorisation algorithms