Intent-oriented search diversification methods developed so far tend to build on generative views of the retrieval system to be diversified. Core algorithm components, in particular redundancy assessment, are expressed in terms of the probability of observing documents rather than the probability that the documents are relevant. This has sometimes been described as a view that considers the selection of a single document in the underlying task model. In this paper we propose an alternative formulation of aspect-based diversification algorithms that explicitly includes a formal relevance model. We develop means for the effective computation of the new formulation and test the resulting algorithm empirically. We report experiments on search and recommendation tasks showing performance competitive with or better than that of the original diversification algorithms. The relevance-based formulation has further interesting properties, such as unifying two well-known state-of-the-art algorithms into a single version. It also opens possibilities for further formal connections and developments as natural extensions of the framework. We illustrate this by modeling tolerance to redundancy as an explicit configurable parameter, which can be set to better suit the characteristics of the IR task or the evaluation metrics, as we show empirically.
SIGIR 2012 - Explicit Relevance Models in Intent-Oriented Information Retrieval Diversification
1. 35th Annual International ACM SIGIR Conference on Research
and Development in Information Retrieval (SIGIR 2012)
Explicit Relevance Models
in Intent-Aware IR Diversification
Saúl Vargas, Pablo Castells and David Vallet
Universidad Autónoma de Madrid
http://ir.ii.uam.es
Portland, OR, 13 August 2012
IRG
Explicit Relevance Models in Intent-Aware IR Diversification
35th ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2012)
IR Group @ UAM Portland, OR, 13 August 2012
2. Outline
Context: IR diversification formulation and algorithms
Proposed approach: relevance-based reformulation of diversification algorithms
Experiments
Adjustable tolerance to redundancy
Conclusion
3. IR diversity – Brief recap
[Image labels: Nutrition / Health, Appliance, Chemical element, Golf, Mining / Metallurgy]
4. IR diversity – Brief recap
Diversity as a means to address uncertainty in user queries
– The same query may have different intents or aspects in the information need underneath
Revision of document relevance independence
– Marginal utility of additional relevant documents decreases fast
Trade diminishing marginal utility for increased intent coverage
– Thus maximize the number of users who obtain at least some useful document
[Image labels: Nutrition / Health, Appliance, Chemical element, Golf, Mining / Metallurgy]
5. IR diversification – Problem statement
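Given a query $q$ on a collection
Find a subset $S$ of the collection, of given size, maximizing $p(\text{some } d \in S \text{ relevant} \mid q)$: NP-hard
Agrawal 2009, Santos 2010, Chen 2006, …
Greedy approximation: starting from the baseline ranking $R$ (scored by $p(d \mid q)$), build the diversified ranking $S$ by repeatedly selecting $\arg\max_{d \in R - S} \varphi(d, S \mid q)$
$\varphi(d, S \mid q) \propto p(d \text{ is relevant} \wedge \text{no } d' \in S \text{ is relevant} \mid q)$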
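The greedy approximation in the problem statement can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `greedy_diversify`, the toy aspect names, and all probability values are assumptions made up for the example, and `toy_phi` is just one possible objective of the "relevant and not yet covered" kind.

```python
from typing import Callable, Hashable, List, Sequence


def greedy_diversify(
    baseline: Sequence[Hashable],
    phi: Callable[[Hashable, List[Hashable]], float],
    n: int,
) -> List[Hashable]:
    """Build S by repeatedly moving the arg-max of phi from R - S into S."""
    remaining = list(baseline)      # R - S, kept in baseline order
    selected: List[Hashable] = []   # S, the diversified ranking
    while remaining and len(selected) < n:
        # max() returns the first maximal element, so ties keep baseline order.
        best = max(remaining, key=lambda d: phi(d, selected))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: per-aspect relevance probabilities for three documents.
rel = {
    "d1": {"a": 0.9, "b": 0.0},
    "d2": {"a": 0.8, "b": 0.0},
    "d3": {"a": 0.0, "b": 0.5},
}
p_aspect = {"a": 0.5, "b": 0.5}  # assumed uniform aspect distribution


def toy_phi(d, S):
    """Probability that d is relevant and no document in S already was."""
    score = 0.0
    for z, pz in p_aspect.items():
        not_covered = 1.0
        for d2 in S:
            not_covered *= 1.0 - rel[d2][z]
        score += pz * rel[d][z] * not_covered
    return score


ranking = greedy_diversify(["d1", "d2", "d3"], toy_phi, n=3)
print(ranking)
```

With these toy values the second pick is `d3` rather than `d2`, because aspect `a` is already largely covered by `d1`.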
6. IR diversity – Instantiations of objective function
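State of the art aspect-based approaches
IA-Select scheme (Agrawal 2009)
$\varphi(d, S \mid q) = \sum_z p(z \mid q)\, p(z \mid d)\, p(d \mid q) \prod_{d' \in S} \bigl(1 - p(z \mid d')\, p(d' \mid q)\bigr)$
xQuAD scheme (Santos 2010)
$\varphi(d, S \mid q) = (1 - \lambda)\, p(d \mid q) + \lambda\, p(d, \neg S \mid q)$
$= (1 - \lambda)\, p(d \mid q) + \lambda \sum_z p(z \mid q)\, p(d \mid q, z) \prod_{d' \in S} \bigl(1 - p(d' \mid q, z)\bigr)$
Callout on $z$ in both schemes: explicit query aspects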
7. IR diversity – Instantiations of objective function
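State of the art aspect-based approaches
IA-Select scheme (Agrawal 2009)
$\varphi(d, S \mid q) = \sum_z p(z \mid q)\, p(z \mid d)\, p(d \mid q) \prod_{d' \in S} \bigl(1 - p(z \mid d')\, p(d' \mid q)\bigr)$
xQuAD scheme (Santos 2010)
$\varphi(d, S \mid q) = (1 - \lambda)\, p(d \mid q) + \lambda\, p(d, \neg S \mid q)$
$= (1 - \lambda)\, p(d \mid q) + \lambda \sum_z p(z \mid q)\, p(d \mid q, z) \prod_{d' \in S} \bigl(1 - p(d' \mid q, z)\bigr)$
Callout on $p(z \mid q)$: query aspect coverage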
8. IR diversity – Instantiations of objective function
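State of the art aspect-based approaches
IA-Select scheme (Agrawal 2009)
$\varphi(d, S \mid q) = \sum_z p(z \mid q)\, p(z \mid d)\, p(d \mid q) \prod_{d' \in S} \bigl(1 - p(z \mid d')\, p(d' \mid q)\bigr)$
xQuAD scheme (Santos 2010)
$\varphi(d, S \mid q) = (1 - \lambda)\, p(d \mid q) + \lambda\, p(d, \neg S \mid q)$
$= (1 - \lambda)\, p(d \mid q) + \lambda \sum_z p(z \mid q)\, p(d \mid q, z) \prod_{d' \in S} \bigl(1 - p(d' \mid q, z)\bigr)$
Callout on $p(z \mid d)\, p(d \mid q)$ and $p(d \mid q, z)$: document “relevance” for query aspect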
9. IR diversity – Instantiations of objective function
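State of the art aspect-based approaches
IA-Select scheme (Agrawal 2009)
$\varphi(d, S \mid q) = \sum_z p(z \mid q)\, p(z \mid d)\, p(d \mid q) \prod_{d' \in S} \bigl(1 - p(z \mid d')\, p(d' \mid q)\bigr)$
xQuAD scheme (Santos 2010)
$\varphi(d, S \mid q) = (1 - \lambda)\, p(d \mid q) + \lambda\, p(d, \neg S \mid q)$
$= (1 - \lambda)\, p(d \mid q) + \lambda \sum_z p(z \mid q)\, p(d \mid q, z) \prod_{d' \in S} \bigl(1 - p(d' \mid q, z)\bigr)$
Callout on the products over $d' \in S$: redundancy penalization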
10. IR diversity – Instantiations of objective function
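State of the art aspect-based approaches
IA-Select scheme (Agrawal 2009)
$\varphi(d, S \mid q) = \sum_z p(z \mid q)\, p(z \mid d)\, p(d \mid q) \prod_{d' \in S} \bigl(1 - p(z \mid d')\, p(d' \mid q)\bigr)$
xQuAD scheme (Santos 2010)
$\varphi(d, S \mid q) = (1 - \lambda)\, p(d \mid q) + \lambda\, p(d, \neg S \mid q)$
$= (1 - \lambda)\, p(d \mid q) + \lambda \sum_z p(z \mid q)\, p(d \mid q, z) \prod_{d' \in S} \bigl(1 - p(d' \mid q, z)\bigr)$
Callout on the xQuAD scheme: mixture with baseline; $\lambda$ is the degree of diversification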
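11. IR diversity – Instantiations of objective function
$\varphi(d, S \mid q) \propto p(d \text{ is relevant} \wedge \text{no } d' \in S \text{ is relevant} \mid q)$
IA-Select scheme (Agrawal 2009)
$\varphi(d, S \mid q) = \sum_z p(z \mid q)\, p(z \mid d)\, p(d \mid q) \prod_{d' \in S} \bigl(1 - p(z \mid d')\, p(d' \mid q)\bigr)$
xQuAD scheme (Santos 2010)
$\varphi(d, S \mid q) = (1 - \lambda)\, p(d \mid q) + \lambda\, p(d, \neg S \mid q)$
$= (1 - \lambda)\, p(d \mid q) + \lambda \sum_z p(z \mid q)\, p(d \mid q, z) \prod_{d' \in S} \bigl(1 - p(d' \mid q, z)\bigr)$
Callout on $p(d \mid q)$ and $p(d \mid q, z)$: probability to observe documents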
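To make the two original objective functions concrete, here is a minimal Python sketch of both, written directly from the formulas on the slides. The function names, the dictionary-based representation of the distributions, and all numeric values are assumptions for illustration only; in particular, the redundancy factor of IA-Select is taken to depend on the already selected document $d'$.

```python
from math import prod


def ia_select_phi(d, S, p_z_q, p_z_d, p_d_q):
    """IA-Select: sum_z p(z|q) p(z|d) p(d|q) prod_{d' in S} (1 - p(z|d') p(d'|q))."""
    return sum(
        pz * p_z_d[d].get(z, 0.0) * p_d_q[d]
        * prod(1.0 - p_z_d[d2].get(z, 0.0) * p_d_q[d2] for d2 in S)
        for z, pz in p_z_q.items()
    )


def xquad_phi(d, S, lam, p_z_q, p_d_q, p_d_qz):
    """xQuAD: (1 - lam) p(d|q) + lam sum_z p(z|q) p(d|q,z) prod_{d' in S} (1 - p(d'|q,z))."""
    diversity = sum(
        pz * p_d_qz[d].get(z, 0.0)
        * prod(1.0 - p_d_qz[d2].get(z, 0.0) for d2 in S)
        for z, pz in p_z_q.items()
    )
    return (1.0 - lam) * p_d_q[d] + lam * diversity


# Hypothetical toy distributions (not taken from the experiments).
p_z_q = {"a": 0.5, "b": 0.5}
p_d_q = {"d1": 0.5, "d2": 0.3, "d3": 0.2}
p_z_d = {"d1": {"a": 1.0}, "d2": {"a": 1.0}, "d3": {"b": 1.0}}
p_d_qz = {"d1": {"a": 0.6}, "d2": {"a": 0.4}, "d3": {"b": 1.0}}

# Scores of the two remaining documents once d1 has been selected.
ia_d2 = ia_select_phi("d2", ["d1"], p_z_q, p_z_d, p_d_q)
ia_d3 = ia_select_phi("d3", ["d1"], p_z_q, p_z_d, p_d_q)
xq_d2 = xquad_phi("d2", ["d1"], 0.5, p_z_q, p_d_q, p_d_qz)
xq_d3 = xquad_phi("d3", ["d1"], 0.5, p_z_q, p_d_q, p_d_qz)
print(ia_d2, ia_d3, xq_d2, xq_d3)
```

In both schemes the document covering the untouched aspect `b` overtakes `d2`, even though `d2` has the higher baseline score.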
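12. IR diversity – Relevance-based instantiation of objective function
$\varphi(d, S \mid q) \propto p(d \text{ is relevant} \wedge \text{no } d' \in S \text{ is relevant} \mid q)$
Our proposal
IA-Select scheme – relevance-based
$\varphi(d, S \mid q) = \sum_z p(z \mid q)\, p(r \mid d, q, z) \prod_{d' \in S} \bigl(1 - p(r \mid d', q, z)\bigr)$
xQuAD scheme – relevance-based
$\varphi(d, S \mid q) = (1 - \lambda)\, p(r_d \mid q) + \lambda\, p(r_d, \neg r_S \mid q)$
$= (1 - \lambda)\, p(r \mid d, q) + \lambda \sum_z p(z \mid q)\, p(r \mid d, q, z) \prod_{d' \in S} \bigl(1 - p(r \mid d', q, z)\bigr)$
Callout on $r$: probability of relevance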
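13. IR diversity – Relevance-based instantiation of objective function
$\varphi(d, S \mid q) \propto p(d \text{ is relevant} \wedge \text{no } d' \in S \text{ is relevant} \mid q)$
IA-Select scheme – relevance-based
$\varphi(d, S \mid q) = \sum_z p(z \mid q)\, p(r \mid d, q, z) \prod_{d' \in S} \bigl(1 - p(r \mid d', q, z)\bigr)$
xQuAD scheme – relevance-based
$\varphi(d, S \mid q) = (1 - \lambda)\, p(r_d \mid q) + \lambda\, p(r_d, \neg r_S \mid q)$
$= (1 - \lambda)\, p(r \mid d, q) + \lambda \sum_z p(z \mid q)\, p(r \mid d, q, z) \prod_{d' \in S} \bigl(1 - p(r \mid d', q, z)\bigr)$
Callout: more literal interpretation of initial problem statement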
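14. IR diversity – Relevance-based instantiation of objective function
$\varphi(d, S \mid q) \propto p(d \text{ is relevant} \wedge \text{no } d' \in S \text{ is relevant} \mid q)$
IA-Select scheme – relevance-based
$\varphi(d, S \mid q) = \sum_z p(z \mid q)\, p(r \mid d, q, z) \prod_{d' \in S} \bigl(1 - p(r \mid d', q, z)\bigr)$
xQuAD scheme – relevance-based
$\varphi(d, S \mid q) = (1 - \lambda)\, p(r_d \mid q) + \lambda\, p(r_d, \neg r_S \mid q)$
$= (1 - \lambda)\, p(r \mid d, q) + \lambda \sum_z p(z \mid q)\, p(r \mid d, q, z) \prod_{d' \in S} \bigl(1 - p(r \mid d', q, z)\bigr)$
Callout: the two schemes are equivalent for $\lambda = 1$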
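The equivalence of the two relevance-based schemes at $\lambda = 1$ can be checked numerically. The sketch below is an illustration under assumed names and toy values (none of them come from the slides); it implements the relevance-based xQuAD objective on top of the relevance-based IA-Select objective, so the $\lambda = 1$ case reduces to the latter by construction.

```python
from math import prod


def rel_ia_select_phi(d, S, p_z_q, p_r_dqz):
    """sum_z p(z|q) p(r|d,q,z) prod_{d' in S} (1 - p(r|d',q,z))."""
    return sum(
        pz * p_r_dqz[d].get(z, 0.0)
        * prod(1.0 - p_r_dqz[d2].get(z, 0.0) for d2 in S)
        for z, pz in p_z_q.items()
    )


def rel_xquad_phi(d, S, lam, p_z_q, p_r_dq, p_r_dqz):
    """(1 - lam) p(r|d,q) + lam * (relevance-based IA-Select objective)."""
    return (1.0 - lam) * p_r_dq[d] + lam * rel_ia_select_phi(d, S, p_z_q, p_r_dqz)


# Hypothetical relevance probabilities for three documents and two aspects.
p_z_q = {"a": 0.5, "b": 0.5}
p_r_dqz = {"d1": {"a": 0.9}, "d2": {"a": 0.8}, "d3": {"b": 0.5}}
p_r_dq = {"d1": 0.45, "d2": 0.40, "d3": 0.25}

S = ["d1"]
gaps = [
    abs(rel_xquad_phi(d, S, 1.0, p_z_q, p_r_dq, p_r_dqz)
        - rel_ia_select_phi(d, S, p_z_q, p_r_dqz))
    for d in ("d2", "d3")
]
print(gaps)  # both differences are zero at lam = 1
```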
15. Relevance distribution vs. document distribution
$p(r \mid d, \cdot)$ vs. $p(d \mid \cdot)$ – The difference does matter (in this context)
$\sum_d p(d \mid q, z) = 1$
$\sum_d p(r \mid d, q, z) = E[\text{nr relevant docs}] \geq 1$
Different potential behavior
E.g. stronger redundancy penalization
Potential rank equivalences do not apply here
$(1 - \lambda)\, p(r \mid d, q) + \lambda \sum_z p(z \mid q)\, p(r \mid d, q, z) \prod_{d' \in S} \bigl(1 - p(r \mid d', q, z)\bigr)$
[Figure: the two distributions plotted over documents $d$, vertical axis from 0 to 1]
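A small numeric illustration of why the distinction matters for redundancy penalization, using made-up values: a document distribution is normalized over the documents, whereas relevance probabilities are one Bernoulli parameter per document and their sum is the expected number of relevant documents. All numbers and names below are hypothetical.

```python
# Hypothetical retrieval scores for four documents under one query aspect.
scores = [4.0, 3.0, 2.0, 1.0]

# Document distribution p(d|q,z): normalized so that it sums to 1.
p_d = [s / sum(scores) for s in scores]

# Relevance probabilities p(r|d,q,z): each in [0, 1], no normalization over d.
p_r = [0.9, 0.8, 0.6, 0.1]


def not_covered(probs):
    """Product of (1 - p) over already selected documents."""
    out = 1.0
    for p in probs:
        out *= 1.0 - p
    return out


# Redundancy factor applied to a third document after selecting the top two.
factor_doc = not_covered(p_d[:2])   # (1 - 0.4) * (1 - 0.3)
factor_rel = not_covered(p_r[:2])   # (1 - 0.9) * (1 - 0.8)
print(sum(p_d), sum(p_r), factor_doc, factor_rel)
```

With these values the relevance-based factor is much smaller than the document-based one, which is the "stronger redundancy penalization" behavior mentioned on the slide.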
16. Relevance-based greedy diversification
Relevance-based reformulation of diversification algorithm
1. Need to estimate $p(r \mid d, q, z)$
2. Does it work? Test empirically
3. Further development: parameterized tolerance to redundancy
17. Aspect-based relevance model
Estimate $p(r \mid d, q, z)$
Cannot use odds, logs, constant removal… or any other rank-preserving step (we need the specific values)
Components used to build $p(r \mid d, q, z)$:
• $p(r \mid d, q)$: positional relevance, $p(r \mid \text{rank}(d, q))$
• $p(z \mid d)$, $p(z \mid q)$, $p(z)$: estimate $p(z \mid d)$ or $p(z \mid q)$ depending on available observations, then derive the other two parameters
  – $z$ as document classes (e.g. ODP)
  – $z$ as subqueries (e.g. reformulations)
• $p(d \mid q)$: normalized baseline IR system score (as in e.g. Bache 2009)
18. Positional relevance distribution estimate
$p(r \mid d, q) \sim p(r \mid \text{rank}(d, q)) = p(r \mid k)$
[Figure: $p(r \mid k)$ on a logarithmic axis (1E-05 to 1E+00) against rank $k$ (0 to 200); series labelled pLSA and Lemur (precision estimates) and AOL (click log statistics)]
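One simple way to obtain a positional relevance estimate from relevance judgments is to take, for each rank $k$, the fraction of training queries whose document at that rank was judged relevant. The sketch below shows that idea only; the slides do not spell out the exact estimator, so the function name, the input layout, and the toy judgments are all assumptions.

```python
from typing import List, Sequence


def estimate_p_r_given_k(
    judged_rankings: Sequence[Sequence[bool]], depth: int
) -> List[float]:
    """Per-rank relevance frequency over a set of judged training rankings.

    judged_rankings[i][k] is True when the document at 0-based rank k of the
    i-th training query was judged relevant.
    """
    estimates = []
    for k in range(depth):
        # Only rankings that actually reach rank k contribute to the estimate.
        judged = [ranking[k] for ranking in judged_rankings if len(ranking) > k]
        estimates.append(sum(judged) / len(judged) if judged else 0.0)
    return estimates


# Hypothetical judgments for three training queries.
rankings = [
    [True, False, True],
    [True, True],
    [False, False, False],
]
p_r_k = estimate_p_r_given_k(rankings, depth=3)
print(p_r_k)
```

The same function could be fed click indicators instead of judgments, which would correspond loosely to the click-log variant; smoothing over neighboring ranks is left out for brevity.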
19. Relevance-based greedy diversification
Relevance-based reformulation of diversification algorithm
1. Need to estimate $p(r \mid d, q, z)$
2. Does it work? Test empirically
3. Further development: parameterized tolerance to redundancy
20. Experiments
Search diversity
Collection: ClueWeb09 category B (50M documents)
Query/subtopic set: TREC 2009/10 diversity task (100 queries)
Baseline ranking: Lemur Indri search engine (Web service)
Diversified top n: 100
Query aspect space:
a) ODP categories level 4 (~7K categories)
b) TREC subtopics (oracle for reference)
Specific parameter estimates:
• $p(z \mid q)$: uniform
• $p(z \mid d)$:
  – ODP categories: semi-supervised text classification by Textwise
  – TREC subtopics: Indri search system run on $z$ as if it were a query
• $p(r \mid k)$:
  i. P@k estimates with TREC relevance judgments (2-fold 2009/10 cross validation)
  ii. Click statistics from AOL log (thus different IR system)
21. Experiments – Search diversity on TREC
xQuAD scheme, with $p(r \mid k)$ from qrels
[Figure: two panels, ODP categories and TREC subtopics, plotting ERR-IA against $\lambda$; curves: based on $p(r \mid d, q, z)$ vs. based on $p(d \mid q, z)$]
22. Experiments – Search diversity on TREC
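Aspect space | Diversifier | λ | α-nDCG@20 | ERR-IA@20 | nDCG-IA@20 | S-recall@20
(baseline) | Lemur | - | 0.2587 | 0.1630 | 0.2396 | 0.4636
a) ODP categories | IA-Select | - | 0.2651 | 0.1681 | 0.2423 | 0.4483
a) ODP categories | xQuAD | 0.9 | 0.2675 | 0.1656 | 0.2451 | 0.4864
a) ODP categories | Rel-based xQuAD, i. Qrels | 0.1 | 0.2858△▲ | 0.1828△▲ | 0.2655△▲ | 0.4898▲△
a) ODP categories | Rel-based xQuAD, ii. Clicks | 0.4 | 0.2841▲△ | 0.1831△△ | 0.2605△▲ | 0.4830▲▽
b) TREC subtopics | IA-Select | - | 0.3541 | 0.2346 | 0.3213 | 0.5787
b) TREC subtopics | xQuAD | 1.0 | 0.3445 | 0.2241 | 0.3127 | 0.5704
b) TREC subtopics | Rel-based xQuAD, i. Qrels | 1.0 | 0.3543△△ | 0.2349△△ | 0.3192▽△ | 0.5782▽△
b) TREC subtopics | Rel-based xQuAD, ii. Clicks | 1.0 | 0.3512▽△ | 0.2320▽△ | 0.3166▽△ | 0.5748▽△
λ: chosen by “informally” maximizing ERR-IA in 0.1 steps for each diversifier
Best value in bold green
▲▼: $p < 0.05$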
23. Experiments
Recommendation diversity
Dataset 1: MovieLens 1M
– Collection: 6K users, 4K movies, 1M ratings
– Subtopic set: 10 movie genres
Dataset 2: Last.fm crawl
– Collection: 1K users, 175K artists, 20M playcounts
– Subtopic set: 120K social tags on artists by Last.fm users
Adaptation of IR diversity paradigm (Vargas, Castells & Vallet SIGIR 2011):
– Queries → users
– Documents → items (movies, music artists)
– Subtopics → item features (genres, tags)
– Relevance judgments → test ratings from data split
Baseline rankings:
a) pLSA
b) Popularity-based recommendation
Diversified top n: 100
Specific parameter estimates:
• $p(z \mid q)$: uniform
• $p(z \mid d)$: uniform on $d$ (based on binary aspect/item association)
• $p(r \mid k)$: P@k estimates with 2-fold cross-validation on test users
24. Experiments – Recommendation diversity on MovieLens and Last.fm
[Figure: grid of panels plotting ERR-IA against $\lambda$; columns: MovieLens 1M and Last.fm; rows: pLSA recommender and recommendation by item popularity; curves: based on $p(r \mid d, q, z)$ vs. based on $p(d \mid q, z)$]
25. Relevance-based greedy diversification
Relevance-based reformulation of diversification algorithm
1. Need to estimate $p(r \mid d, q, z)$
2. Does it work? Test empirically
3. Further development: parameterized tolerance to redundancy
26. Adjustable tolerance to redundancy
Generalization of relevance-based diversification scheme
Formally support adjustable redundancy penalization
Approach: generalize relevance to browsing model
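$\varphi(d, S \mid q) = (1 - \lambda)\, p(r \mid d, q) + \lambda\, p(r_d, \neg \text{stop}_S \mid q) = \cdots$
$= (1 - \lambda)\, p(r \mid d, q) + \lambda \sum_z p(z \mid q)\, p(r \mid d, z, q) \prod_{d' \in S} \bigl(1 - p(r \mid d', z, q)\, p(\text{stop} \mid r)\bigr)$
Callout on $p(\text{stop} \mid r)$: tolerance to redundancy
Adjustable redundancy tolerance parameter $p(\text{stop} \mid r) \in [0, 1]$
– High $p(\text{stop} \mid r)$ for aggressive penalization, low for e.g. high-recall searches
– In this view, original formulations would implicitly assume $p(\text{stop} \mid r) = 1$, i.e. a single relevant document is sought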
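The effect of the tolerance parameter can be seen in a short Python sketch of the generalized objective. As with the earlier formulas, the function name, the data layout, and the toy probabilities are assumptions for illustration; the only change with respect to the relevance-based xQuAD objective is the extra factor $p(\text{stop} \mid r)$ inside the product.

```python
from math import prod


def tolerant_xquad_phi(d, S, lam, p_stop, p_z_q, p_r_dq, p_r_dqz):
    """(1 - lam) p(r|d,q) + lam sum_z p(z|q) p(r|d,z,q)
    prod_{d' in S} (1 - p(r|d',z,q) * p(stop|r))."""
    novelty = sum(
        pz * p_r_dqz[d].get(z, 0.0)
        * prod(1.0 - p_r_dqz[d2].get(z, 0.0) * p_stop for d2 in S)
        for z, pz in p_z_q.items()
    )
    return (1.0 - lam) * p_r_dq[d] + lam * novelty


# Hypothetical single-aspect example: d1 is already selected, d2 is redundant.
p_z_q = {"a": 1.0}
p_r_dqz = {"d1": {"a": 0.8}, "d2": {"a": 0.5}}
p_r_dq = {"d1": 0.8, "d2": 0.5}

score_by_stop = {
    p_stop: tolerant_xquad_phi("d2", ["d1"], 1.0, p_stop, p_z_q, p_r_dq, p_r_dqz)
    for p_stop in (0.0, 0.5, 1.0)
}
print(score_by_stop)
```

At `p_stop = 1.0` the redundant document is penalized as in the relevance-based xQuAD objective; at `p_stop = 0.0` it is not penalized at all, which is the behavior one might want for high-recall searches.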
27. Adjustable tolerance to redundancy
Empirical observation: $p(\text{stop} \mid r)$ vs. α in α-nDCG
[Figure: two panels, "Search task: Lemur on TREC / Subtopics" and "Recommendation task: pLSA on MovieLens / Genres"; vertical axis $p(\text{stop} \mid r)$ up to 1, horizontal axis from 0 to 1; for each column, the best and the worst α-nDCG values are marked]
28. Conclusion
Alternative, relevance-based formulation of greedy aspect-based diversification
– Unifies two previous aspect-based algorithms
– More literal expression of formal problem statement (and metrics?)
$p(r \mid d, q, z)$ vs. $p(d \mid q, z)$
– Literal value estimates needed (rather than rank-equivalent approximations)
– Estimate based on positional relevance (relevance or click data needed)
Seems to perform well empirically
– Light requirements on relevance or click data for training positional relevance
– Improvement trend, but needs to be tested under further optimizations
Formal support for redundancy tolerance adjustment