Large volumes of socio-economic data are collected about the developing world. However, licensing fees charged by publishers, along with a myriad of different ways of exchanging the data mean it is often unavailable to local researchers and citizens.
This presentation looks at ways in which country and regional level data can be made more accessible. It looks at the open data environment and some of the freely available sources of data on developing countries. It examines some emerging technologies and standards that facilitate the dissemination and exchange of socio-economic data, including Linked Data, the Statistical Data and Metadata Exchange (SDMX) initiative, and mobile devices.
A Scientist's Perspective on Open Access and Data Management, by Leigh Winowiecki, CIAT
The following is a presentation from CIAT's 2014 Annual Program Review in Cali, Colombia, by Leigh Winowiecki, a soil scientist. It discusses scientific data, open access publishing, and the crediting, sharing, and reuse of data among scientists and the greater scientific community.
This presentation was provided by Catherine Ahearn of PubPub, MIT Knowledge Futures Group, during the NISO event "Long Form Content: Ebooks, Print Volumes and the Concerns of Those Who Use Both," held on March 20, 2019.
This presentation was provided by Kevin Hawkins of The University of North Texas Libraries, during the NISO event "Long Form Content: Ebooks, Print Volumes and the Concerns of Those Who Use Both," held on March 20, 2019.
This presentation was provided by Melissa Milazzo and Gina Donato of Elsevier, during the NISO event "Long Form Content: Ebooks, Print Volumes and the Concerns of Those Who Use Both," held on March 20, 2019.
Advancing Patron Privacy on Vendor Systems with a Shared Understanding, by Peter Murray
Presentation given to the NISO Consensus Framework to Support Patron Privacy in Digital Library and Information Systems working group meeting on May 21, 2015.
Big data refers to the ongoing accumulation of massive, often complex, and always-changing data sets – for instance, machine-generated data from sensors or cell phone GPS signals, or data from social media sites.
Open data are data sets made available to the public to use and reuse. Those sets may come from big data, but they don't have to. The act of opening data is like extending an invitation to anyone to freely take the data and turn it into something useful.
Recommender systems analyze patterns of user interest in products to provide personalized recommendations. They seek to predict the rating or preference that a user would give to an item. Some of the most successful realizations of latent factor models are based on matrix factorization...
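As a sketch of the matrix factorization idea mentioned above (a toy illustration, not any presenter's implementation; the ratings and hyperparameters are invented):

```python
import numpy as np

def factorize(R, k=2, steps=1000, lr=0.01, reg=0.02, seed=0):
    """Learn latent user factors P and item factors Q so that P @ Q.T
    approximates the observed entries of R (0 means unrated)."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(scale=0.1, size=(n_users, k))
    Q = rng.normal(scale=0.1, size=(n_items, k))
    users, items = np.nonzero(R)            # indices of observed ratings
    for _ in range(steps):
        for u, i in zip(users, items):
            err = R[u, i] - P[u] @ Q[i]     # error on one observed rating
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

# toy user x item rating matrix; zeros are the gaps to predict
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4]], dtype=float)
P, Q = factorize(R)
pred = P @ Q.T   # dense predictions, including the previously unrated cells
```

The unrated cells of `pred` are the personalized predictions; the `reg` term keeps the factors from overfitting the few observed ratings.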
Recommendations are everywhere: music, movies, books, social media, e-commerce websites… The Web is leaving the era of search and entering one of discovery. This quick introduction will help you understand this vast topic and why you should use it.
Measuring Impact: Towards a data citation metric, by Edward Baker
How the ViBRANT and eMonocot projects are building tools, including a modified implementation of Bourne and Fink's 'Scholar Factor', the Biodiversity Data Journal, and Scratchpad's user metrics and statistics modules.
This was a presentation for the Connecticut Library Association 2016. It introduces how the Connecticut Digital Archive came to be, the challenges of the CTDA and how it is moving forward.
FAIRy stories: tales from building the FAIR Research Commons, by Carole Goble
Plenary Lecture Presented at INCF Neuroinformatics 2019 https://www.neuroinformatics2019.org
Findable, Accessible, Interoperable, Reusable. The "FAIR Principles" for research data, software, computational workflows, scripts, or any kind of Research Object are a mantra; a method; a meme; a myth; a mystery. For the past 15 years I have been working on FAIR in a range of projects and initiatives in the Life Sciences as we try to build the FAIR Research Commons. Some are top-down, like the European Research Infrastructures ELIXIR, ISBE and IBISBA, and the NIH Data Commons. Some are bottom-up, supporting FAIR for investigator-led projects (FAIRDOM), biodiversity analytics (BioVel), and FAIR drug discovery (Open PHACTS, FAIRplus). Some have become movements, like Bioschemas, the Common Workflow Language and Research Objects. Others focus on cross-cutting approaches in reproducibility, computational workflows, metadata representation and scholarly sharing & publication. In this talk I will relate a series of FAIRy tales. Some of them are Grimm. There are villains and heroes. Some have happy endings; all have morals.
PAARL's 1st Marina G. Dayrit Lecture Series held at UP's Melchor Hall, 5F, Proctor & Gamble Audiovisual Hall, College of Engineering, on 3 March 2017, with Albert Anthony D. Gavino of Smart Communications Inc. as resource speaker on the topic "Using Big Data to Enhance Library Services"
Credit scoring has been used to categorize customers based on various characteristics to evaluate their creditworthiness. Increasingly, machine learning techniques are being deployed for customer segmentation, classification and scoring. In this talk, we will discuss various machine learning techniques that can be used for credit risk applications. Through a case study built in R, we will illustrate the nuances of working with practical data sets, which include categorical and numerical data; different techniques that can be used to evaluate and explore customer profiles; visualizing high-dimensional data sets; and machine learning techniques for customer segmentation.
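The case study in the talk is built in R; purely as a sketch of the segmentation step, here is a plain k-means clustering of invented two-feature customer profiles in Python:

```python
import numpy as np

def kmeans(X, k=2, iters=50, seed=0):
    """Plain k-means: assign each customer to the nearest centroid, recompute."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # distance of every customer to every centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):          # avoid empty-cluster division
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# invented customer profiles: [normalized income, normalized credit utilization]
X = np.array([[0.9, 0.1], [0.8, 0.2], [0.85, 0.15],
              [0.2, 0.9], [0.1, 0.8], [0.15, 0.85]])
labels, centroids = kmeans(X, k=2)
```

With well-separated profiles like these, the two resulting segments can then be scored or classified separately.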
Slides accompanying a talk delivered by Dan Gillean at PASIG 2016, held at the Museum of Modern Art in New York, NY October 26-28, 2016.
These slides explore the roles that standards play in digital preservation and introduce some of the key standards that informed Archivematica's design, which the system uses to help you capture technical, preservation, and administrative metadata when generating Archival Information Packages (AIPs) and Dissemination Information Packages (DIPs).
For more information about Archivematica, see: https://www.archivematica.org
My presentation given at the Association of Subscription Agents annual conference, February 2013.
It was titled "Understanding how researchers and practitioners use STM information", but the specific theme was understanding how to design information products and services for researchers and practitioners against a background of information abundance (aka information overload).
Personalized Search – Building a prototype to infer the user's interest, by Tom Burgmans
In the world of search, understanding the intent of the user is often seen as the holy grail. When a user performs multiple search and click actions while having a conversation with the search engine, this behavior reveals a piece of her/his interest. A search engine that is aware of the user's interest is able to add a personal layer to its responses, and this could add a new dimension of accuracy and value to a search implementation. But what technology does it take to build it? What data is needed? How well does it really work? This presentation describes the journey to find a practical implementation of a recommendation engine. It answers all the questions above and more. We'll guide you through the lessons learned while creating an engine that generates potentially interesting items for the user based on collaborative filtering and anomaly detection. We'll demonstrate a prototype where even a minimal set of user actions could lead to a personalized search experience.
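One small piece of such an engine, inferring interest from click actions, can be sketched as follows (a hypothetical illustration, not the prototype described in the talk; the function names and data are invented):

```python
from collections import Counter

def update_profile(profile, clicked_doc_terms):
    """Each click adds the clicked document's terms to the interest profile."""
    profile.update(clicked_doc_terms)

def personalize(results, profile):
    """Re-rank search results by their overlap with the interest profile."""
    return sorted(results,
                  key=lambda doc: sum(profile[t] for t in doc["terms"]),
                  reverse=True)

profile = Counter()
update_profile(profile, ["solr", "ranking", "boost"])   # first click
update_profile(profile, ["solr", "facets"])             # second click

results = [
    {"id": "a", "terms": ["gardening", "tools"]},
    {"id": "b", "terms": ["solr", "boost"]},
]
ranked = personalize(results, profile)
```

Even after two clicks, the profile is enough to rank the search-related result above the unrelated one.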
"The greater promise of Big Data lies not in doing old things in slightly new ways. Instead, it lies in doing new things that were previously not possible. One major class of new things is adding intelligence to large-scale systems. In this session I will present a survey of how machine learning can be applied to real-life situations without having to get a PhD in advanced mathematics. These systems can be built today from open source components to increase business revenues by understanding what customers need and want. I will provide real world examples of best practices and pitfalls in machine learning including practical ways to build maintainable, high performance systems." - Ted Dunning
[WI 2014] Context Recommendation Using Multi-label Classification, by Yong Zheng
Context-aware recommender systems (CARS) are extensions of traditional recommenders that also take into account contextual condition of a user to whom a recommendation is made. The recommendation problem is, however, still focused on recommending a set of items to a target user. In this paper, we consider the problem of recommending to a user the appropriate contexts in which an item should be selected. We believe that context recommenders can be used as another set of tools to assist users' decision making. We formulate the context recommendation problem and discuss the motivation behind and possible applications of the concept. We identify two general classes of algorithms to solve this problem: direct context prediction and indirect context recommendation. Furthermore, we present and evaluate several direct context prediction algorithms based on multi-label classification (MLC). Our experiments demonstrate that the proposed approaches outperform the baseline methods, and also that personalization is required to enhance the effectiveness of context recommenders.
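As a sketch of direct context prediction via multi-label classification, here is a binary relevance baseline: one tiny logistic classifier per context label. The features, labels, and training scheme are invented and simpler than the paper's MLC algorithms:

```python
import numpy as np

def train_binary_relevance(X, Y, lr=0.5, steps=500):
    """Binary relevance: fit one independent logistic classifier per label."""
    n_labels = Y.shape[1]
    Xb = np.hstack([X, np.ones((len(X), 1))])     # append a bias feature
    W = np.zeros((n_labels, Xb.shape[1]))
    for j in range(n_labels):
        for _ in range(steps):
            p = 1.0 / (1.0 + np.exp(-Xb @ W[j]))  # per-example probabilities
            W[j] += lr * (Y[:, j] - p) @ Xb / len(X)
    return W

def predict_contexts(W, x):
    """Return the 0/1 context-label vector for one (user, item) feature row."""
    xb = np.append(x, 1.0)
    return (1.0 / (1.0 + np.exp(-W @ xb)) > 0.5).astype(int)

# invented (user, item) features and two context labels: [weekend, with_friends]
X = np.array([[1, 0], [1, 1], [0, 1], [0, 0]], dtype=float)
Y = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])
W = train_binary_relevance(X, Y)
```

The predicted label vector answers the paper's question directly: in which contexts should this item be consumed by this user?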
Replication of Recommender Systems Research, by Alan Said
Course held at the 2017 ACM RecSys Summer School at the Free University of Bozen-Bolzano by Alejandro Bellogin (@abellogin) and Alan Said (@alansaid).
http://recommenders.net/rsss2017/
Comparative Recommender System Evaluation: Benchmarking Recommendation Frame..., by Alan Said
Video available here http://www.youtube.com/watch?v=1jHxGCl8RXc
Recommender systems research is often based on comparisons of predictive accuracy: the better the evaluation scores, the better the recommender. However, it is difficult to compare results from different recommender systems due to the many options in design and implementation of an evaluation strategy. Additionally, algorithmic implementations can diverge from the standard formulation due to manual tuning and modifications that work better in some situations. In this work we compare common recommendation algorithms as implemented in three popular recommendation frameworks. To provide a fair comparison, we have complete control of the evaluation dimensions being benchmarked: dataset, data splitting, evaluation strategies, and metrics. We also include results using the internal evaluation mechanisms of these frameworks. Our analysis points to large differences in recommendation accuracy across frameworks and strategies, i.e. the same baselines may perform orders of magnitude better or worse across frameworks. Our results show the necessity of clear guidelines when reporting evaluation of recommender systems to ensure reproducibility and comparison of results.
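The point about controlling evaluation dimensions can be illustrated with a deterministic data split and an explicitly defined metric, so that two systems are scored on exactly the same held-out data (a sketch, not the benchmark code used in the paper):

```python
import math
import random

def deterministic_split(ratings, test_fraction=0.2, seed=42):
    """Split (user, item, rating) tuples reproducibly: same seed, same split."""
    rng = random.Random(seed)
    shuffled = ratings[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def rmse(predict, test):
    """Root-mean-square error of a predictor over the held-out ratings."""
    se = [(predict(u, i) - r) ** 2 for u, i, r in test]
    return math.sqrt(sum(se) / len(se))

# synthetic ratings on a 1..5 scale
ratings = [(u, i, float((u + i) % 5 + 1)) for u in range(20) for i in range(10)]
train, test = deterministic_split(ratings)
global_mean = sum(r for _, _, r in train) / len(train)
baseline_rmse = rmse(lambda u, i: global_mean, test)
```

Because the split is seeded, any framework evaluated against `test` with this `rmse` is directly comparable to any other.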
The Magic Barrier of Recommender Systems – No Magic, Just Ratings, by Alan Said
Recommender Systems need to deal with different types of users who represent their preferences in various ways. This difference in user behaviour has a deep impact on the final performance of the recommender system, where some users may receive either better or worse recommendations depending, mostly, on the quantity and the quality of the information the system knows about the user. Specifically, the inconsistencies of the user impose a lower bound on the error the system may achieve when predicting ratings for that particular user.
In this work, we analyse how the consistency of user ratings (coherence) may predict the performance of recommendation methods. More specifically, our results show that our definition of coherence is correlated with the so-called magic barrier of recommender systems, and thus, it could be used to discriminate between easy users (those with a low magic barrier) and difficult ones (those with a high magic barrier). We report experiments where the rating prediction error for the more coherent users is lower than that of the less coherent ones. We further validate these results by using a public dataset, where the magic barrier is not available, in which we obtain similar performance improvements.
A Top-N Recommender System Evaluation Protocol Inspired by Deployed Systems, by Alan Said
The evaluation of recommender systems is crucial for their development. In today's recommendation landscape there are many standardized recommendation algorithms and approaches; however, there exists no standardized method for experimental setup of evaluation -- not even for widely used measures such as precision and root-mean-squared error. This creates a setting where comparison of recommendation results using the same datasets becomes problematic. In this paper, we propose an evaluation protocol specifically developed with the recommendation use-case in mind, i.e. the recommendation of one or several items to an end user. The protocol attempts to closely mimic a scenario of a deployed (production) recommendation system, taking specific user aspects into consideration and allowing a comparison of small and large scale recommendation systems. The protocol is evaluated on common recommendation datasets and compared to traditional recommendation settings found in research literature. Our results show that the proposed model can better capture the quality of a recommender system than traditional evaluation does, and is not affected by characteristics of the data (e.g. size, sparsity, etc.).
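As the abstract notes, even widely used measures lack a standardized setup; pinning one down explicitly is straightforward (a sketch, not the proposed protocol itself):

```python
def precision_at_n(recommended, relevant, n=5):
    """Fraction of the top-n recommended items the user actually found relevant."""
    return sum(1 for item in recommended[:n] if item in relevant) / n

recommended = ["b1", "b7", "b3", "b9", "b4", "b2"]  # ranked recommendation list
relevant = {"b3", "b4", "b5"}                       # held-out items the user liked
p = precision_at_n(recommended, relevant, n=5)      # 2 of the top 5 are relevant
```

The ambiguity the paper targets lies not in this formula but in how `recommended` and `relevant` are constructed, which is exactly what an evaluation protocol must fix.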
Information Retrieval and User-centric Recommender System Evaluation, by Alan Said
Poster describing the ERCIM-funded project on IR- and user-centric recommender system evaluation currently being undertaken in the Information Access group at CWI.
Presented at UMAP 2013.
User-Centric Evaluation of a K-Furthest Neighbor Collaborative Filtering Reco..., by Alan Said
Collaborative filtering recommender systems often use nearest neighbor methods to identify candidate items. In this paper we present an inverted neighborhood model, k-Furthest Neighbors, to identify less ordinary neighborhoods for the purpose of creating more diverse recommendations. The approach is evaluated two-fold, once in a traditional information retrieval evaluation setting where the model is trained and validated on a split train/test set, and once through an online user study (N=132) to identify users' perceived quality of the recommender. A standard k-nearest neighbor recommender is used as a baseline in both evaluation settings. Our evaluation shows that even though the proposed furthest neighbor model is outperformed in the traditional evaluation setting, the perceived usefulness of the algorithm shows no significant difference in the results of the user study.
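One simple reading of the inverted-neighborhood idea, offered only as a sketch (the paper's actual scoring may differ): take the k least similar users and invert their ratings, so items they dislike score highly.

```python
import numpy as np

def k_furthest_recommend(R, user, k=2, n=1):
    """Score items by how strongly the k *least* similar users dislike them:
    inverting a 1..5 rating turns their 1 into our 5 (0 means unrated)."""
    norms = np.linalg.norm(R, axis=1) + 1e-9
    sims = R @ R[user] / (norms * norms[user])   # cosine similarity to all users
    sims[user] = np.inf                          # never pick the user themselves
    furthest = np.argsort(sims)[:k]              # the k least similar users
    inverted = np.where(R[furthest] > 0, 6 - R[furthest], 0)
    scores = inverted.mean(axis=0)
    scores[R[user] > 0] = -np.inf                # drop items already rated
    return list(np.argsort(-scores)[:n])

R = np.array([[5, 4, 0, 0],    # target user: likes items 0 and 1
              [5, 5, 0, 0],    # a very similar user
              [1, 1, 5, 1]],   # an opposite-taste user
             dtype=float)
recs = k_furthest_recommend(R, user=0, k=1, n=1)
```

Here the opposite-taste user is the furthest neighbour, so the item that user dislikes most (item 3) is surfaced, producing a less ordinary recommendation than nearest-neighbour CF would.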
A 3D Approach to Recommender System Evaluation, by Alan Said
In this work we describe an approach to multi-objective recommender system evaluation based on a previously introduced 3D benchmarking model. The benchmarking model takes user-centric, business-centric and technical constraints into consideration in order to provide a means of comparison of recommender algorithms in similar scenarios. We present a comparison of three recommendation algorithms deployed in a user study using this 3D model and compare to standard evaluation methods. The proposed approach simplifies benchmarking of recommender systems and allows for simple multi-objective comparisons.
Best Practices in Recommender System Challenges, by Alan Said
Recommender System Challenges such as the Netflix Prize, KDD Cup, etc. have contributed vastly to the development and adoptability of recommender systems. Each year a number of challenges or contests are organized covering different aspects of recommendation. In this tutorial and panel, we present some of the factors involved in successfully organizing a challenge, whether for reasons purely related to research, industrial challenges, or to widen the scope of recommender systems applications.
Estimating the Magic Barrier of Recommender Systems: A User Study, by Alan Said
Recommender systems are commonly evaluated by trying to predict known, withheld, ratings for a set of users. Measures such as the Root-Mean-Square Error are used to estimate the quality of the recommender algorithms. This process, however, does not acknowledge the inherent rating inconsistencies of users. In this paper we present the first results from a noise measurement user study for estimating the magic barrier of recommender systems, conducted on a commercial movie recommendation community. The magic barrier is the expected squared error of the optimal recommendation algorithm, or, the lowest error we can expect from any recommendation algorithm. Our results show that the barrier can be estimated by collecting the opinions of users on already rated items.
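The estimation idea can be sketched as follows: given repeated opinions on the same user-item pairs, the barrier estimate is the mean squared deviation of each rating from that pair's mean (toy numbers, not the study's data):

```python
import statistics

def magic_barrier(re_ratings):
    """Estimate the magic barrier from a re-rating study.

    re_ratings maps (user, item) to the list of ratings that user gave the
    same item on different occasions. The barrier is the mean squared
    deviation from each pair's mean rating: the error even a perfect
    predictor cannot go below.
    """
    sq = []
    for ratings in re_ratings.values():
        mu = statistics.mean(ratings)
        sq.extend((r - mu) ** 2 for r in ratings)
    return statistics.mean(sq)

study = {
    ("alice", "matrix"): [5, 4, 5],   # slightly inconsistent
    ("alice", "shrek"):  [3, 3, 3],   # perfectly consistent
    ("bob",   "matrix"): [2, 4, 3],   # quite inconsistent
}
barrier = magic_barrier(study)
```

An algorithm whose squared prediction error approaches `barrier` has effectively reached the noise floor for these users.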
Users and Noise: The Magic Barrier of Recommender Systems, by Alan Said
Recommender systems are crucial components of most commercial websites to keep users satisfied and to increase revenue. Thus, a lot of effort is made to improve recommendation accuracy. But when is the best possible performance of the recommender reached? The magic barrier refers to some unknown level of prediction accuracy a recommender system can attain. The magic barrier reveals whether there is still room for improving prediction accuracy or indicates that further improvement is meaningless. In this work, we present a mathematical characterization of the magic barrier based on the assumption that user ratings are afflicted with inconsistencies - noise. In a case study with a commercial movie recommender, we investigate the inconsistencies of the user ratings and estimate the magic barrier in order to assess the actual quality of the recommender system.
Using Social- and Pseudo-Social Networks to Improve Recommendation Quality, by Alan Said
Short paper presentation at the Workshop on Intelligent Techniques for Web Personalization (ITWP 2011) at the International Joint Conference on Artificial Intelligence (IJCAI 2011).
GridMate – End to end testing is a critical piece to ensure quality and avoid..., by ThomasParaiso2
End to end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
Transcript: Selling digital books in 2024: Insights from industry leaders - T..., by BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Climate Impact of Software Testing at Nordic Testing Days, by Kari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The talk discusses the climate impact and sustainability of software testing. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be extended with sustainability and then measured continuously. Test environments can be used less, at a smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
PHP Frameworks: I want to break free (IPC Berlin 2024), by Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk encourages a more independent, flexible, and future-proof approach to using PHP frameworks.
Securing your Kubernetes cluster: a step-by-step guide to success!, by KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Generative AI Deep Dive: Advancing from Proof of Concept to Production, by Aggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Epistemic Interaction – tuning interfaces to provide information for AI support, by Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
A tale of scale & speed: How the US Navy is enabling software delivery from l..., by sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Observability Concepts EVERY Developer Should Know – DeveloperWeek Europe, by Paige Cruz
Monitoring and observability aren't traditionally found in software curriculums, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is part of our current company's observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
2. Abstract
• The amount of data in the digital universe is estimated to hit 1.2 Zettabytes (1 billion terabytes) during 2010.
• These data quantities make discovering relevant information a difficult task.
• Recommender Systems are an integral tool for assisting users in information discovery.
• By combining the wisdom of crowds, content, user profiles, etc., Recommender Systems find relevant data for us.
"We are leaving the age of information and entering the age of recommendation" – Chris Anderson, The Long Tail
3/18/2022 Talis 2
4. Introduction
• IMDb, one of the first online recommender systems, turned 20 on October 17th, 2010.
• Ever since, recommender systems have, through relatively simple techniques, produced adequately good results.
• Is adequately good good enough?
– How can recommender systems be improved?
– What do we need to improve them?
5. Recommender System Types
Introduction
• Semantic recommenders – explicit information
– Content
– Keywords
– Genre
– etc.
• Social recommenders – implicit information (collaborative filtering)
– Item-based user-user similarities, i.e. which users like similar things
– Content-ignorant
• Hybrid recommenders
– Combinations of content- and CF-based
• Context-aware recommenders
– Aware of the current situation
7. Social recommenders
The most common recommender systems approach uses Collaborative Filtering. How does collaborative filtering work?
• Calculates similarities between all users
• Finds users similar to you
• Fills in your "gaps" based on similar users, usually by a k-nearest neighbor algorithm
Recommend a book for user C
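The three steps above can be sketched as follows (a toy illustration, not Talis code; the ratings are invented):

```python
import numpy as np

def predict(R, user, item, k=2):
    """User-based CF: predict a missing rating from the k most similar
    users who did rate the item (cosine similarity; 0 means unrated)."""
    norms = np.linalg.norm(R, axis=1) + 1e-9
    sims = R @ R[user] / (norms * norms[user])    # 1. similarities to all users
    sims[user] = -1.0                             #    ignore the user themselves
    raters = np.where(R[:, item] > 0)[0]          #    users who rated the item
    top = raters[np.argsort(-sims[raters])][:k]   # 2. the k nearest of those
    if len(top) == 0:
        return 0.0
    w = sims[top]
    return float(w @ R[top, item] / np.abs(w).sum())  # 3. weighted fill of the gap

# books rated 1..5 by three users; user C (row 2) has not read book 3
R = np.array([[5, 3, 4, 4],
              [4, 3, 5, 5],
              [5, 3, 4, 0]], dtype=float)
pred = predict(R, user=2, item=3)
```

Both neighbours rated book 3 highly, so user C's gap is filled with a high predicted rating.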
10. Hybrid models
Hybrid recommender systems combine semantic recommenders with collaborative filtering ones.
Recommend a book for user C
13. What is context?
Context-awareness in RecSys
"Any information that can be used to characterise the situation of entities", Dey 2001
1. Item context
• Seasonal (Christmas, the Oscars)
• Relation (movie sequel, director, actor)
2. User context
• Surroundings (weather, location)
• Company (alone, with friends)
• Mood/emotions
• Any user-related factor
14. Why Context?
Context-awareness in RecSys
Pros:
• Filters relevant information
• Ad hoc recommendations
• Aware of changes
Cons:
• What is context?
• Where do we find it?
15. Applying Context-awareness
Current state-of-the-art research presents two types of context-awareness:
• Context-aware collaborative filtering
– Performs standard CF on virtual, contextual items or users
– Benefits: simple
– Drawbacks: statically defined context
• Tensor factorization for context-awareness
– Models the data as a tensor
– Applies higher-order factorization techniques (HOSVD, PARAFAC, HyPLSA, etc.) to model context in a latent space
– Benefits: no prior context identification necessary
– Drawbacks: adds complexity
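What "models the data as a tensor" looks like concretely: a toy user x item x context tensor and its mode-n unfoldings (matricizations), which let ordinary matrix code operate on tensor data (an illustrative sketch):

```python
import numpy as np

def matricize(T, mode):
    """Unfold a 3-way tensor along one mode into a matrix, so that ordinary
    matrix factorization code (and per-slice parallel jobs) can run on it."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

# toy user x item x context rating tensor: 2 users, 3 items, 2 contexts
T = np.arange(12, dtype=float).reshape(2, 3, 2)

M0 = matricize(T, 0)   # users    x (items * contexts)
M1 = matricize(T, 1)   # items    x (users * contexts)
M2 = matricize(T, 2)   # contexts x (users * items)
```

Each unfolding keeps every rating; only the layout changes, which is what makes the parallel factorization on slide 19 possible.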
17. My work
• Semantic recommenders
• Social recommenders
• Context-aware recommenders
18. Where does this fit at Talis?
• Library data
– Loan events – CF
– Book metadata – semantic recommenders
– Time of loan event – context-awareness
19. Distributed higher-order recommender system
• Use matrix factorization techniques to make a tensor factorization approximation in MapReduce
• By matricizing the tensor, standard matrix factorization approaches can be run in parallel
• What is matrix factorization?
– Decomposition of a matrix into its building blocks (SVD example)
• A = UΣVᵀ, where A is the matrix, Σ is a diagonal matrix, and U and V are unitary matrices.
• By taking only the first k diagonal values in Σ and multiplying the resulting matrix back with U and V, we obtain a rank-k approximation of the initial matrix A.
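The rank-k SVD approximation described on this slide, as a short sketch:

```python
import numpy as np

# small user x book rating matrix
A = np.array([[5, 4, 1, 1],
              [4, 5, 1, 2],
              [1, 1, 5, 4],
              [2, 1, 4, 5]], dtype=float)

# decompose A into its building blocks: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                                         # keep the k largest singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # rank-k approximation of A
```

By the Eckart-Young theorem, the Frobenius error of `A_k` equals exactly the energy in the discarded singular values, so `A_k` is the best possible rank-k reconstruction.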