The document proposes a quantitative study to measure the correlation between individual performance of knowledge workers and their email communications. It aims to analyze email content and metadata to identify independent variables like task completion, cohesion, trust, and conflict resolution. These would be correlated with dependent variables including task value and acknowledged contributions. The study population would be knowledge workers whose email communications facilitate problem solving. It would use the Enron email corpus for data collection and apply content analysis and correlation analysis to test hypotheses about relationships between individual and team performance metrics identified in email data.
Query Recommendation by using Collaborative Filtering Approach | IRJET Journal
This document proposes a system called QDMiner to mine query facets from the top search results for a query. It uses collaborative filtering techniques to recommend the top-k results that are most relevant to a user's interests.
QDMiner first retrieves the top search results from a search engine. It then mines frequent lists from the HTML tags and free text within the results to identify query facets. It groups common lists and ranks the facets and items based on their appearances. QDMiner represents the search results in two models: the Unique Website Model and Context Similarity Model, to order the query facets.
To recommend results, QDMiner uses collaborative filtering techniques including item-based and user-based
A Hybrid Approach for Personalized Recommender System Using Weighted TFIDF on... | Editor IJCATR
Recommender systems are gaining great popularity with the emergence of e-commerce and social media on the internet. These recommender systems enable users to access products or services that they would otherwise not be aware of due to the wealth of information on the internet. Two traditional methods used to develop recommender systems are content-based and collaborative filtering. While both methods have their strengths, they also have weaknesses, such as sparsity and the new-item and new-user problems, that lead to poor recommendation quality. Some of these weaknesses can be overcome by combining two or more methods to form a hybrid recommender system. This paper deals with issues related to the design and evaluation of a personalized hybrid recommender system that combines content-based and collaborative filtering methods to improve the precision of recommendation. Experiments using the MovieLens dataset show that the personalized hybrid recommender system outperforms the two traditional methods implemented separately.
Competitiveness of Top 100 U.S. Universities: A Benchmark Study Using Data En... | ertekg
Download Link > https://ertekprojects.com/gurdal-ertek-publications/blog/benchmark-study-using-data-envelopment-analysis/
This study presents a comprehensive benchmarking study of the top 100 U.S. Universities. The methodologies used to come up with insights into the domain are Data Envelopment Analysis (DEA) and information visualization. Various approaches to evaluating academic institutions have appeared in the literature, including a DEA literature dealing with the ranking of universities. Our study contributes to this literature by the extensive incorporation of information visualization and subsequently the discovery of new insights.
Rule-based expert systems for supporting university students | ertekg
Download Link > https://ertekprojects.com/gurdal-ertek-publications/blog/rule-based-expert-systems-for-supporting-university-students/
There are more than 15 million college students in the US alone. Academic advising for courses and scholarships is typically performed by human advisors, bringing an immense managerial workload to faculty members, as well as other staff at universities. This paper reports and discusses the development of two educational expert systems at a private international university. The first expert system is a course advising system which recommends courses to undergraduate students. The second system suggests scholarships to undergraduate students based on their eligibility. While there have been reported systems for course advising, the literature does not seem to contain any references to expert systems for scholarship recommendation and eligibility checking. Therefore, the scholarship recommender that we developed is the first of its kind. Both systems have been implemented and tested using Oracle Policy Automation (OPA) software.
Industrial Benchmarking through Information Visualization and Data Envelopmen... | ertekg
Download Link > https://ertekprojects.com/gurdal-ertek-publications/blog/industrial-benchmarking-through-information-visualization-and-data-envelopment-analysis-a-new-framework/
We present a benchmarking study on the companies in the Turkish food industry based on their financial data. Our aim is to develop a comprehensive benchmarking framework using Data Envelopment Analysis (DEA) and information visualization. Besides DEA, a traditional tool for financial benchmarking based on financial ratios is also incorporated. The consistency/inconsistency between the two methodologies is investigated using information visualization tools. In addition, k-means clustering, a fundamental method from machine learning, is applied to understand the relationship between k-means clustering and DEA.
Download Link > https://ertekprojects.com/gurdal-ertek-publications/blog/modelling-the-supply-chain-perception-gaps/
This study applies the research of perception gap analysis to supply chain integration and develops a generic model, the 3-Level Gaps Model, with the goal of contributing to harmonization and integration in the supply chain. The model suggests that significant perception gaps may exist among supply chain members with regard to the importance of different performance criteria. The concept of the model is conceived through an empirical and inductive approach, combining the research disciplines of supply chain relationships and perception gap analysis. First-hand data has been collected through a survey across a key buyer in the motor insurance industry and its eight suppliers. Rigorous statistical analysis tested the research hypotheses, which in turn verified the validity and relevance of the developed 3-Level Gaps Model. The research reveals the significant existence of supply chain perception gaps at all three levels as defined, which could be the root causes of an underperforming supply chain.
IRJET- Analysis of Rating Difference and User Interest | IRJET Journal
This document summarizes a research paper that proposes a collaborative filtering recommendation algorithm that incorporates rating differences and user interests. It first adds a rating difference factor to the traditional collaborative filtering algorithm. It then calculates user interests based on item attributes and the similarity between user interests. Recommendations are made by weighting user rating differences and interest similarities. The proposed algorithm is shown to reduce error rates and improve accuracy compared to traditional collaborative filtering.
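The weighting idea in this summary can be illustrated with a small sketch. The summary does not give the paper's formulas, so everything below is an assumption made for illustration: the function names, the linear blend controlled by `weight`, the use of cosine similarity for interests, and the 1-to-5 rating scale implied by `max_gap`.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    common = set(a) & set(b)
    num = sum(a[k] * b[k] for k in common)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def rating_agreement(ra, rb, max_gap=4.0):
    """1.0 when co-rated items get identical ratings, 0.0 at the largest gap.

    max_gap=4.0 assumes ratings on a 1-to-5 scale (an assumption).
    """
    common = set(ra) & set(rb)
    if not common:
        return 0.0
    mean_gap = sum(abs(ra[i] - rb[i]) for i in common) / len(common)
    return 1.0 - mean_gap / max_gap

def predict(user, item, ratings, interests, weight=0.5):
    """Predict a rating as a similarity-weighted average of other users' ratings.

    The similarity blends rating agreement and interest similarity;
    the linear blend is hypothetical, not taken from the paper.
    """
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = (weight * rating_agreement(ratings[user], other_ratings)
               + (1 - weight) * cosine(interests[user], interests[other]))
        num += sim * other_ratings[item]
        den += sim
    return num / den if den else None

# Toy data: "v" rates like "u" and shares an interest; "w" does neither.
ratings = {
    "u": {"a": 5, "b": 3},
    "v": {"a": 5, "b": 3, "c": 4},
    "w": {"a": 1, "b": 5, "c": 2},
}
interests = {"u": {"drama": 1.0}, "v": {"drama": 1.0}, "w": {"action": 1.0}}
print(predict("u", "c", ratings, interests))  # close to v's rating of 4
```

In this toy data the prediction for item "c" lands near the rating given by the like-minded user "v", because the dissimilar user "w" receives a much smaller weight.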
This document presents a tag recommendation model for collaborative bookmarking systems. The team proposes using Lucene indexing and clustering approaches to suggest the most relevant tags for a given URL and its description. They describe extracting tags from the URL, user's previous tags, description text, and related words to then rank and recommend tags using a weighted clustering approach. The proposed architecture crawls URLs to extract content, indexes it with Lucene, and identifies candidate tags from multiple sources before applying clustering and weighting to select the most relevant tags.
A comprehensive survey of link mining and anomalies detection | csandit
This document provides an overview of link mining and its application to anomalies detection. It discusses the emergence of link mining, key link mining tasks including object-related, graph-related and link-related tasks. Challenges of link mining are described along with applications. Different types of anomalies are defined and three main approaches to anomalies detection - supervised, semi-supervised and unsupervised - are outlined along with common methods like nearest neighbor, clustering, statistical and information-based approaches.
Behavioural Modelling Outcomes prediction using Casual Factors | IJMER
This document describes a new approach to machine science that uses crowdsourcing to generate predictive models of human behavioral outcomes. Users collectively formulate predictive survey questions and provide responses, which are then used to build predictive models without requiring domain expertise. Two experiments implementing this approach successfully generated models to predict monthly household electricity usage and user body mass index based on the crowdsourced survey data. The approach harnesses the "wisdom of crowds" to both discover predictive factors and collect data, representing an innovative way to generate predictive models without expert involvement at each step of the process.
1. The document discusses various IEEE 2012-2013 software projects in domains like Java, J2ME, J2EE, .NET, MATLAB and NS2.
2. SBGC provides technical guidance and support for students' IEEE projects, including project reports, materials, certificates etc.
3. A variety of IEEE projects are offered for students from different engineering departments like ECE, EEE, CSE etc. and levels like B.E, M.Tech, MBA.
IRJET- Review on Different Recommendation Techniques for GRS in Online Social... | IRJET Journal
This document reviews different recommendation techniques for group recommender systems (GRS) in online social networks. It discusses traditional recommender approaches like content-based filtering and collaborative filtering. It also reviews related work applying opinion dynamics models and weight matrices to GRS. The document concludes that using a smart weights matrix to consider relationships between group members' preferences in a recommendation process improves aggregation and ensures consensus, providing the best way to recommend items to a complete group.
Not Good Enough but Try Again! Mitigating the Impact of Rejections on New Con... | Aleksi Aaltonen
Presentation at the University of Miami on 3 December 2021 on how Stack Overflow improved the retention of new contributors whose initial question is rejected (closed) as substandard. The presentation is based on a paper coauthored with Sunil Wattal.
Ontological and clustering approach for content based recommendation systems | vikramadityajakkula
This document proposes a novel content-based recommendation system that uses ontological graphs and dynamic weighted ranking. It builds an adaptive ranking mechanism based on user selections and preferences to improve recommendation accuracy over time. The system segments data into ontological groups and identifies relationships between entities. It then calculates similarity between entities using feature vectors and ranks entities based on weights assigned to their connections in the ontological graph. These weights are updated dynamically based on user feedback to personalize recommendations for each user. The paper describes testing this approach in a recipe recommendation tool called RecipeMiner, which produced coherent recommendations that adapted to user preferences.
The document proposes a novel ranking approach called Manifold Ranking with Sink Points (MRSP) that addresses relevance, importance, and diversity simultaneously. MRSP uses manifold ranking over data objects to find the most relevant and important objects. It then designates ranked objects as "sink points" to prevent redundant objects from receiving high ranks. The approach is applied to update summarization and query recommendation tasks, demonstrating strong performance compared to existing methods.
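A minimal sketch of the sink-point idea follows, assuming the usual manifold-ranking update f = alpha * S * f + (1 - alpha) * y over a symmetrically normalised affinity matrix, with the scores of sink points held at zero. The function names, the greedy one-at-a-time selection, the value of `alpha`, and the toy graph are illustrative assumptions, not details taken from the summary.

```python
from math import sqrt

def normalize(W):
    """Symmetric normalisation S = D^-1/2 W D^-1/2 of an affinity matrix."""
    deg = [sum(row) for row in W]
    n = len(W)
    return [[W[i][j] / sqrt(deg[i] * deg[j]) if deg[i] and deg[j] else 0.0
             for j in range(n)] for i in range(n)]

def rank_scores(S, y, sinks, alpha=0.85, iters=200):
    """Iterate f = alpha*S*f + (1-alpha)*y, pinning sink points to zero."""
    n = len(S)
    f = list(y)
    for _ in range(iters):
        f = [0.0 if i in sinks else
             alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f

def mrsp(W, y, k):
    """Greedily pick k objects; each pick becomes a sink for the next round."""
    S = normalize(W)
    sinks, picked = set(), []
    for _ in range(k):
        f = rank_scores(S, y, sinks)
        best = max((i for i in range(len(W)) if i not in sinks),
                   key=lambda i: f[i])
        picked.append(best)
        sinks.add(best)
    return picked

# Objects 0 and 1 are near-duplicates (strongly linked); object 2 is unrelated.
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]
y = [1.0, 0.7, 0.8]  # hypothetical prior relevance scores
print(mrsp(W, y, 2))
```

Without sink points, object 1 would rank second because it borrows score from its neighbour 0. Once 0 becomes a sink, that support disappears and the unrelated object 2 is selected instead, which is the diversity effect the summary describes.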
Recommender System in light of Big Data | Khadija Atiya
This document summarizes a research paper that investigates using singular value decomposition (SVD) to address challenges faced by recommender systems in light of big data. It discusses how collaborative filtering recommender systems are impacted by issues like scalability, sparsity, and cold starts with large datasets. The document then provides background on SVD and how it can be applied to collaborative filtering as a model-based approach. It describes an implementation of an SVD-based recommender system using Apache Hadoop and Spark on a large dataset to validate its applicability for big data and evaluate the tools. The results showed that the SVD approach provides performance comparable to previous experiments on smaller datasets.
A Study of Neural Network Learning-Based Recommender System | theijes
This document summarizes a study that proposes a neural network learning model for recommender systems. The study aims to improve collaborative filtering methods by estimating user preferences based on learned correlations between users through a neural network. The proposed method was tested on MovieLens data and showed improved precision of 6.7% compared to other techniques. Additionally, the study found that precision and recall improved further, by 3.5% and 2.4% respectively, when including film genre information in the neural network learning. The document concludes the proposed technique can utilize diverse data sources and perform well regardless of data complexity compared to other recommender system methods.
FHCC: A SOFT HIERARCHICAL CLUSTERING APPROACH FOR COLLABORATIVE FILTERING REC... | IJDKP
This summary provides the key details about a new soft hierarchical clustering algorithm called Fuzzy Hierarchical Co-clustering (FHCC) that is proposed for collaborative filtering recommendation. FHCC simultaneously generates hierarchical clusters of users and products based on a user-product rating matrix to detect potential user-product joint groups. It uses a fuzzy set approach to allow each user and product to belong to multiple clusters rather than a single cluster. The algorithm works by initially forming singleton co-clusters of individual users and products, then repeatedly merging the most similar pair of co-clusters until a single cluster remains. A hybrid similarity measure is used to calculate similarity between co-clusters based on their user, product, and rating components. The algorithm is intended to provide
A SURVEY OF LINK MINING AND ANOMALIES DETECTION | IJDKP
This document discusses link mining and its application in detecting anomalies. It begins by defining link mining as focusing on discovering explicit links between objects, as opposed to data mining which aims to find patterns within datasets. The document then surveys different types of anomalies that can be detected through link mining, including contextual, point, collective, online, and distributed anomalies. It also discusses challenges in link mining like logical vs statistical dependencies and the skewed class distribution problem in link prediction. Applications of link mining mentioned include social networks, epidemiology, and bibliographic analysis. Overall, the document provides an overview of the emerging field of link mining and its relevance for detecting unusual or anomalous links within linked datasets.
1. The document discusses the prospects for using learning analytics to achieve adaptive learning models. It describes adaptive learning and different levels of adaptive technologies, including platforms that react to individual user data and those that leverage aggregated data across users.
2. It outlines the pathway to achieving adaptive learning analytics, including using LMS analytics dashboards, predictive analytics, and adaptive learning analytics. Case studies and examples of existing applications are provided.
3. A proof of concept reference model for learning analytics is proposed, including a basic analytics process and an advanced process using predictive and adaptive algorithms. Linked open data for connecting curriculum standards and digital resources is also discussed.
Enabling reuse of arguments and opinions in open collaboration systems PhD vi... | jodischneider
This document summarizes a PhD thesis on enabling the reuse of arguments and opinions in open collaboration systems. It discusses three research questions: 1) opportunities and requirements for argumentation support, 2) common arguments used in these systems, and 3) structuring arguments to support reuse. The methodology involved analyzing discussions from Wikipedia and open collaboration projects using argumentation theories like Walton's schemes and factors analysis. The goal is to develop semantic structures and visualizations to help people understand diverse opinions and make collaborative decisions. A prototype system tested with users found structuring discussions by key factors helped people evaluate arguments more effectively.
This document summarizes a research paper that proposes a hybrid recommendation approach for tourism systems using classification based on association rules and fuzzy logic. The paper describes common problems with recommender systems like sparsity and performance issues. It then presents a hybrid method combining clustering, associative classification, and fuzzy logic to address these problems. The method was implemented and evaluated in a tourism recommender system, with results showing it can improve recommendation quality by reducing limitations of other approaches.
This document summarizes a research paper that proposes a novel approach for dynamic personalized recommendation. It utilizes information from user ratings and profiles to develop dynamic features that describe user preferences over multiple phases of interest. An adaptive weighting algorithm then makes recommendations by weighting these dynamic features based on the amount of rating data available. The proposed approach was tested on public datasets and performed well for dynamic recommendation compared to existing algorithms.
This document presents a case study on applying a data analytics approach to conducting a systematic literature review on master data management. It outlines the steps taken, including defining review questions, searching multiple databases and sources, combining and preprocessing the data, and performing descriptive and text analyses. The analyses addressed questions about trends in publications over time, primary databases, publication types, and frequent keywords. This provided insights into the progress and topics within the master data management research domain. The presented structured approach aims to improve the replicability of systematic literature reviews.
Integrated expert recommendation model for online communities | st02IJwest
Online communities have become vital places for Web 2.0 users to share knowledge and experiences. Recently, finding expert users in a community has become an important research issue. This paper proposes a novel cascaded model for expert recommendation using aggregated knowledge extracted from enormous contents and social network features. A vector space model is used to compute the relevance of published content with respect to a specific query, while the PageRank algorithm is applied to rank candidate experts. The experimental results show that the proposed model is an effective recommendation model which can guarantee that most candidate experts are both highly relevant to the specific queries and highly influential in the corresponding areas.
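The PageRank step mentioned in this abstract can be sketched with plain power iteration. This is a generic PageRank under stated assumptions, not the paper's cascaded model: the damping factor of 0.85, the uniform handling of users without out-links, and the toy reply graph (an edge means "this user links to that user") are all hypothetical.

```python
def pagerank(links, damping=0.85, iters=100):
    """Power-iteration PageRank over a dict {node: [nodes it links to]}."""
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Score held by nodes without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            outs = links.get(v, [])
            for t in outs:
                new[t] += damping * rank[v] / len(outs)
        rank = new
    return rank

# Hypothetical community graph: ann and cy both point to bob; bob points to ann.
links = {"ann": ["bob"], "cy": ["bob"], "bob": ["ann"]}
scores = pagerank(links)
for user in sorted(scores, key=scores.get, reverse=True):
    print(user, round(scores[user], 3))
```

In a cascade of the kind the abstract describes, scores like these would be computed only for candidates that already passed a content-relevance filter, so that the final ordering reflects influence among relevant users.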
Product Recommendation Systems based on Hybrid Approach Technology | IRJET Journal
This document discusses hybrid recommendation systems for e-commerce. It begins with an introduction to recommendation systems and their use by e-commerce companies. It then discusses different types of recommendation techniques, including content-based filtering, collaborative filtering, and hybrid approaches. Specifically, it describes using a hybrid approach that combines content-based filtering and time sequence collaborative filtering algorithms. The document concludes that this hybrid method can provide more accurate product recommendations by combining time sequence information and content features.
A Survey on Recommendation System based on Knowledge Graph and Machine Learning | IRJET Journal
This document provides an overview of recommendation systems based on knowledge graphs and machine learning. It first defines key concepts like recommendation systems, knowledge graphs, meta paths, and knowledge graph embedding. It then discusses standard recommendation approaches like content-based filtering, collaborative filtering, and hybrid filtering. The document focuses on knowledge graph-based recommendation systems, how they address issues with traditional approaches, and how machine learning can be used alongside knowledge graphs. It reviews several papers on using knowledge graphs for recommendations and proposes a comparative study. The document also outlines a proposed recommendation system and potential future research directions in the domain.
IRJET- Predicting Review Ratings for Product Marketing | IRJET Journal
This document discusses predicting review ratings for product marketing using big data analysis. It proposes using Hadoop tools like HDFS, MapReduce, Hive and Pig to analyze large amounts of product review data from sources like blogs in order to provide more accurate predictions of review ratings. The system would gather reviews, convert unstructured data to structured data, and analyze the data using Hadoop queries to determine popular products and trends. It would then display the results as bar charts and pie charts comparing review ratings for products. Experiments show the Hadoop-based system provides results faster than traditional databases for datasets larger than 100 MB.
A Study of Neural Network Learning-Based Recommender Systemtheijes
This document summarizes a study that proposes a neural network learning model for recommender systems. The study aims to improve collaborative filtering methods by estimating user preferences based on learned correlations between users through a neural network. The proposed method was tested on MovieLens data and showed improved precision of 6.7% compared to other techniques. Additionally, the study found that precision and recall improved further, by 3.5% and 2.4% respectively, when including film genre information in the neural network learning. The document concludes the proposed technique can utilize diverse data sources and perform well regardless of data complexity compared to other recommender system methods.
FHCC: A SOFT HIERARCHICAL CLUSTERING APPROACH FOR COLLABORATIVE FILTERING REC...IJDKP
This summary provides the key details about a new soft hierarchical clustering algorithm called Fuzzy Hierarchical Co-clustering (FHCC) that is proposed for collaborative filtering recommendation. FHCC simultaneously generates hierarchical clusters of users and products based on a user-product rating matrix to detect potential user-product joint groups. It uses a fuzzy set approach to allow each user and product to belong to multiple clusters rather than a single cluster. The algorithm works by initially forming singleton co-clusters of individual users and products, then repeatedly merging the most similar pair of co-clusters until a single cluster remains. A hybrid similarity measure is used to calculate similarity between co-clusters based on their user, product, and rating components. The algorithm is intended to provide
A SURVEY OF LINK MINING AND ANOMALIES DETECTIONIJDKP
This document discusses link mining and its application in detecting anomalies. It begins by defining link mining as focusing on discovering explicit links between objects, as opposed to data mining which aims to find patterns within datasets. The document then surveys different types of anomalies that can be detected through link mining, including contextual, point, collective, online, and distributed anomalies. It also discusses challenges in link mining like logical vs statistical dependencies and the skewed class distribution problem in link prediction. Applications of link mining mentioned include social networks, epidemiology, and bibliographic analysis. Overall, the document provides an overview of the emerging field of link mining and its relevance for detecting unusual or anomalous links within linked datasets.
1. The document discusses the prospects for using learning analytics to achieve adaptive learning models. It describes adaptive learning and different levels of adaptive technologies, including platforms that react to individual user data and those that leverage aggregated data across users.
2. It outlines the pathway to achieving adaptive learning analytics, including using LMS analytics dashboards, predictive analytics, and adaptive learning analytics. Case studies and examples of existing applications are provided.
3. A proof of concept reference model for learning analytics is proposed, including a basic analytics process and an advanced process using predictive and adaptive algorithms. Linked open data for connecting curriculum standards and digital resources is also discussed.
Enabling reuse of arguments and opinions in open collaboration systems PhD vi...jodischneider
This document summarizes a PhD thesis on enabling the reuse of arguments and opinions in open collaboration systems. It discusses three research questions: 1) opportunities and requirements for argumentation support, 2) common arguments used in these systems, and 3) structuring arguments to support reuse. The methodology involved analyzing discussions from Wikipedia and open collaboration projects using argumentation theories like Walton's schemes and factors analysis. The goal is to develop semantic structures and visualizations to help people understand diverse opinions and make collaborative decisions. A prototype system tested with users found structuring discussions by key factors helped people evaluate arguments more effectively.
This document summarizes a research paper that proposes a hybrid recommendation approach for tourism systems using classification based on association rules and fuzzy logic. The paper describes common problems with recommender systems like sparsity and performance issues. It then presents a hybrid method combining clustering, associative classification, and fuzzy logic to address these problems. The method was implemented and evaluated in a tourism recommender system, with results showing it can improve recommendation quality by reducing limitations of other approaches.
This document summarizes a research paper that proposes a novel approach for dynamic personalized recommendation. It utilizes information from user ratings and profiles to develop dynamic features that describe user preferences over multiple phases of interest. An adaptive weighting algorithm then makes recommendations by weighting these dynamic features based on the amount of rating data available. The proposed approach was tested on public datasets and performed well for dynamic recommendation compared to existing algorithms.
This document presents a case study on applying a data analytics approach to conducting a systematic literature review on master data management. It outlines the steps taken, including defining review questions, searching multiple databases and sources, combining and preprocessing the data, and performing descriptive and text analyses. The analyses addressed questions about trends in publications over time, primary databases, publication types, and frequent keywords. This provided insights into the progress and topics within the master data management research domain. The presented structured approach aims to improve the replicability of systematic literature reviews.
Integrated expert recommendation model for online communitiesst02IJwest
Online communities have become vital places for Web 2.0 users to share knowledge and experiences. Recently, finding expert users in a community has become an important research issue. This paper proposes a novel cascaded model for expert recommendation using aggregated knowledge extracted from enormous contents and social network features. A vector space model is used to compute the relevance of published content with respect to a specific query, while the PageRank algorithm is applied to rank candidate experts. The experimental results show that the proposed model is an effective recommendation approach which can guarantee that most candidate experts are both highly relevant to the specific queries and highly influential in the corresponding areas.
Product Recommendation Systems based on Hybrid Approach TechnologyIRJET Journal
This document discusses hybrid recommendation systems for e-commerce. It begins with an introduction to recommendation systems and their use by e-commerce companies. It then discusses different types of recommendation techniques, including content-based filtering, collaborative filtering, and hybrid approaches. Specifically, it describes using a hybrid approach that combines content-based filtering and time sequence collaborative filtering algorithms. The document concludes that this hybrid method can provide more accurate product recommendations by combining time sequence information and content features.
A Survey on Recommendation System based on Knowledge Graph and Machine LearningIRJET Journal
This document provides an overview of recommendation systems based on knowledge graphs and machine learning. It first defines key concepts like recommendation systems, knowledge graphs, meta paths, and knowledge graph embedding. It then discusses standard recommendation approaches like content-based filtering, collaborative filtering, and hybrid filtering. The document focuses on knowledge graph-based recommendation systems, how they address issues with traditional approaches, and how machine learning can be used alongside knowledge graphs. It reviews several papers on using knowledge graphs for recommendations and proposes a comparative study. The document also outlines a proposed recommendation system and potential future research directions in the domain.
IRJET- Predicting Review Ratings for Product MarketingIRJET Journal
This document discusses predicting review ratings for product marketing using big data analysis. It proposes using Hadoop tools like HDFS, MapReduce, Hive and Pig to analyze large amounts of product review data from sources like blogs in order to provide more accurate predictions of review ratings. The system would gather reviews, convert unstructured data to structured data, analyze the data using Hadoop queries to determine popular products and trends. It would then display the results as bar charts and pie charts comparing review ratings for products. Experiments show the Hadoop-based system provides results faster than traditional databases for large datasets over 100MB in size.
Over the past 10 years, research systems have evolved from systems that focused on how to structure and record information on research, to systems capable of allowing significant insights to be derived based upon years of high quality information. In 2015, the maturity of the information now collected within many Current Research Information Systems, and the insights that this can provide, are of equal or greater value than the insights that could be gleaned from established externally provided research metrics platforms alone. The ability to intersect these external and internal worlds provides new levels of strategic insight not previously available. With the addition of platforms that track altmetrics, and their ability to connect university publications data with a constant flow of real time attention level metrics, an image of a dynamic network of systems emerges, connected together by ever turning ‘cogs’ pushing and translating information. Add to this the success of ORCID as a pervasive researcher identifier infrastructure, and CASRAI as the emerging social contract for information exchange, and it becomes possible to extend this network back from the systems that track and record research information, through to the platforms through which research knowledge is created. The ‘Mechanics’ of this network of systems is more than just getting the ‘plumbing’ right. As research information moves through the network, its audience and purpose change, and the requirements for contextual metadata can also change. This presentation will explore the lived experience of Research Data Mechanics at Digital Science through illustrating how connections between Figshare, Altmetric, Symplectic Elements, and Dimensions can both enhance research system capability and reduce the burden on researchers, and research administration.
The document describes an evaluation of existing relational keyword search systems. It notes discrepancies in how prior studies evaluated systems using different datasets, query workloads, and experimental designs. The evaluation aims to conduct an independent assessment that uses larger, more representative datasets and queries to better understand systems' real-world performance and tradeoffs between effectiveness and efficiency. It outlines schema-based and graph-based search approaches included in the new evaluation.
The document discusses research information management systems (RIMs) and the role of libraries in supporting them. It describes RIMs as systems that collate fragmented institutional research data to reduce administrative burdens. Key functions of RIMs include automated data capture, integration with internal and external data sources, and providing analytics. The document argues that RIMs benefit institutions by centralizing research information for assessment, funding applications, and increasing visibility. Libraries are well-positioned to advise on RIMs and play a lead role in planning due to expertise in data management, bibliographic standards, and understanding researcher needs.
The document discusses research information management systems (RIMs) and the role of libraries in supporting them. It describes RIMs as systems that collate fragmented institutional research data to reduce administrative burdens. Key functions of RIMs include automated data capture, integration with internal and external data sources, and providing analytics. The document argues that RIMs benefit institutions by centralizing research information for assessment, funding applications, and increasing visibility. Libraries are well-positioned to advise on RIMs and play a lead role in planning institutional research data collection and management.
An Improvised Fuzzy Preference Tree Of CRS For E-Services Using Incremental A...IJTET Journal
This document describes a proposed algorithm for improving recommendation systems for e-services. It involves the following key steps:
1. Clustering customer transaction histories to group similar purchase patterns and derive customer-based recommendations.
2. Using incremental association rule mining on the transaction data to detect frequently purchased item sets and relationships between items.
3. Developing a fuzzy model to classify customers and provide dynamic recommendations tailored to different customer types. The recommendations will be based on matching customer preferences and purchase histories to specific product sets.
4. The algorithm clusters transactions, mines association rules incrementally as new data is added, and generates recommendations by classifying customers and matching them to relevant product clusters. This provides a personalized and
On the benefit of logic-based machine learning to learn pairwise comparisonsjournalBEEI
This document describes a study that used a logic-based machine learning approach called APARELL to learn user preferences from pairwise comparisons in a recommender system. APARELL learns description logic rules from examples of users' pairwise item preferences and background knowledge about item properties. It was implemented in a used car recommender system. A user study evaluated the pairwise preference elicitation method compared to a standard list interface. Results showed the pairwise interface was significantly better in recommending items that matched users' preferences.
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
This document summarizes a research paper on developing user profiles from search engine queries to enable personalized search results. It discusses how current search engines generally return the same results regardless of individual user interests. The paper proposes methods to construct user profiles capturing both positive and negative preferences from search histories and click-through data. Experimental results showed profiles including both preferences performed best by improving query clustering and separating similar vs. dissimilar queries. Future work aims to use profiles for collaborative filtering and predicting new query intents.
Perception Determined Constructing Algorithm for Document ClusteringIRJET Journal
This document discusses an approach to document clustering called "Semantic Lingo" which aims to identify key concepts in documents and automatically generate an ontology based on these concepts to better conceptualize the documents. It provides background on challenges with traditional document clustering techniques and search engines. The proposed approach uses semantic information from domain ontologies to improve web search clustering quality by addressing issues like synonyms, polysemy and high dimensionality. It also discusses using text segments within documents that focus on one or more topics to aid multi-topic document clustering.
An empirical performance evaluation of relational keyword search systemsBrowse Jobs
The document presents an empirical performance evaluation of relational keyword search systems. It evaluates 7 relational keyword search techniques and finds that many existing techniques do not provide acceptable performance for realistic retrieval tasks. In particular, memory consumption prevents many search techniques from scaling to datasets with more than tens of thousands of vertices. The evaluation also explores how factors varied in previous studies have relatively little impact on performance. The work confirms that existing systems have unacceptable performance and underscores the need for standardization in evaluating these retrieval systems.
Query- And User-Dependent Approach for Ranking Query Results in Web DatabasesIOSR Journals
This document presents a new approach for ranking query results from web databases called Query and User Dependent Ranking. The approach considers two aspects: query similarity, which accounts for different users having similar queries, and user similarity, which accounts for different queries having similar user preferences. A prototype application was built to test the model. Empirical results found the approach to be efficient and suitable for real-world applications. The approach uses a workload file that is updated with ranking functions as new queries are made to track user and query similarities over time.
A Social Network-Empowered Research Analytics Framework For Project SelectionNat Rice
This document proposes a social network-empowered research analytics framework to assist government funding agencies in selecting research projects. It builds researcher profiles using data from research proposals, publications, citations, and a research social network to capture relevance, productivity, and connectivity. An algorithm then matches proposals and reviewers based on these profiles to optimize reviewer assignments. The framework was implemented and tested by China's largest funding agency, generating cost savings and improved proposal evaluation.
Improving Ranking Web Documents using User’s Feedbacks
Fatemeh Ehsanifar and Hasan Naderi
A Survey on Sparse Representation based Image Restoration
Dr. S. Sakthivel and M. Parameswari
Simultaneous Use of CPU and GPU to Real Time Inverted Index Updating in Microblogs
Sajad Bolhasani and Hasan Naderi
A Survey on Prioritization Methodologies to Prioritize Non-Functional Requirements
Saranya. B., Subha. R and Dr. Palaniswami. S.
A Review on Various Visual Cryptography Schemes
Nagesh Soradge and Prof. K. S. Thakare
Web Page Access Prediction based on an Integrated Approach
Phyu Thwe
A Survey on Bi-Clustering and its Applications
K. Sathish Kumar, M. Ramalingam and Dr. V. Thiagarasu
Pixel Level Image Fusion: A Neuro-Fuzzy Approach
Swathy Nair, Bindu Elias and VPS Naidu
A Comparative Analysis on Visualization of Microarray Gene Expression Data
Poornima. S and Dr. J. Jeba Emilyn
Revisiting Offline Evaluation for Implicit-Feedback Recommender Systems (Doct...Olivier Jeunen
Slides for my Doctoral Symposium presentation at RecSys '19 in Copenhagen, titled "Revisiting Offline Evaluation for Implicit-Feedback Recommender Systems".
An efficient information retrieval ontology system based indexing for contexteSAT Journals
Abstract: Many research and development projects produce a wide variety of artifacts, such as articles, patents, research reports, conference papers, journal papers, and experimental data. Searching a repository for a particular context by keyword is not an easy task, because earlier systems suffer from high recall with low precision. This paper attempts to construct an ontology-based search algorithm to retrieve the relevant contexts. Ontologies are a rich source of knowledge for retrieving context. In this paper, we utilize the WordNet ontology to retrieve the relevant contexts from the document repository. It is very difficult to retrieve the relevant context in its original format, so we use a pre-processing step, which helps to retrieve context. The pre-processing step includes two major steps: the first is stop word removal and the second is stemming. The outcome of the pre-processing step is an index consisting of important keywords and their corresponding keywords. When the user enters a keyword into the system, the ontology takes several steps to refine the keywords. Finally, the refined keywords are matched against the index and the relevant contexts are retrieved. The experiments are carried out with different sets of contexts, and the performance of the proposed approach is estimated with evaluation metrics such as precision, recall and F-measure. Keywords: Ontologies; WordNet; contexts; stemming; indexing.
This presentation was provided by Tim McGeary of Duke University during the NISO virtual conference, Open Data Projects, held on Wednesday, June 13, 2018.
This document discusses online learning to rank, which uses machine learning techniques to create ranking models from training data consisting of queries and documents matched with relevance judgements. It describes applications of learning to rank such as search engines and discusses online versus offline learning. Key algorithms for learning to rank problems include pointwise, pairwise and listwise approaches. The document also covers evaluating ranking models, handling biases in click data, and directions for future research.
Slides presenting preliminary overview of thesis work presented at the International Conference on Electronic Learning in the Workplace at Columbia University on June 11, 2010.
Similar to Invited Lecture on Interactive Information Retrieval (20)
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills MN
Travis Hills of Minnesota developed a method to convert waste into high-value dry fertilizer, significantly enriching soil quality. By providing farmers with a valuable resource derived from waste, Travis Hills helps enhance farm profitability while promoting environmental stewardship. Travis Hills' sustainable practices lead to cost savings and increased revenue for farmers by improving resource efficiency and reducing waste.
Phenomics assisted breeding in crop improvementIshaGoswami9
The global population is increasing and is expected to reach about 9 billion by 2050. Combined with climate change, this makes it difficult to meet the food requirements of such a large population. Facing the challenges presented by resource shortages, climate change, and increasing global population, crop yield and quality need to be improved in a sustainable way over the coming decades. Genetic improvement by breeding is the best way to increase crop productivity. With the rapid progression of functional genomics, an increasing number of crop genomes have been sequenced and dozens of genes influencing key agronomic traits have been identified. However, current genome sequence information has not been adequately exploited for understanding the complex characteristics of multiple genes, owing to a lack of crop phenotypic data. Efficient, automatic, and accurate technologies and platforms that can capture phenotypic data that can be linked to genomics information for crop improvement at all growth stages have become as important as genotyping. Thus, high-throughput phenotyping has become the major bottleneck restricting crop breeding. Plant phenomics has been defined as the high-throughput, accurate acquisition and analysis of multi-dimensional phenotypes during crop growing stages at the organism level, including the cell, tissue, organ, individual plant, plot, and field levels. With the rapid development of novel sensors, imaging technology, and analysis methods, numerous infrastructure platforms have been developed for phenotyping.
This MS Word-generated PowerPoint presentation covers the major details of the micronuclei test, its significance, and the assays used to conduct it. The test is used to detect micronuclei formation inside the cells of nearly every multicellular organism. Micronuclei formation takes place during chromosomal separation at metaphase.
The thematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. This test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills.
Or: Beyond linear.
Abstract: Equivariant neural networks are neural networks that incorporate symmetries. The nonlinear activation functions in these networks result in interesting nonlinear equivariant maps between simple representations, and motivate the key player of this talk: piecewise linear representation theory.
Disclaimer: No one is perfect, so please mind that there might be mistakes and typos.
dtubbenhauer@gmail.com
Corrected slides: dtubbenhauer.com/talks.html
ESPP presentation to EU Waste Water Network, 4th June 2024 “EU policies driving nutrient removal and recycling
and the revised UWWTD (Urban Waste Water Treatment Directive)”
2. Who am I?
Glasgow
• Postdoctoral Researcher, Delft University of Technology
• PhD in Interactive IR, University of Glasgow (2019)
Modelling Search and Stopping in
Interactive Information Retrieval
David Martin Maxwell
School of Computing Science
College of Science and Engineering
University of Glasgow
4. Who do we build Search Engines for?
Information Seekers/Users
Individual(s) searching for information
5. Considering the User
• Literature in IR has often focused on the system-side
• Retrieval models, improving ranking, efficiency, etc.
• But we develop search engines for users!
• How do users behave?
• How does an interface change behaviour?
• How can we better support users?
• We need to better understand complex interactions between a user and system
6. What is Interactive IR?
“The area of interactive information retrieval covers research related to studying and assisting these diverse end users of information access and retrieval systems.”
Ian Ruthven, University of Strathclyde
“... the interactive approach to IR has led to a focus on the user-oriented activities of query formulation and reformulation, and inspection and judgement of retrieved items ...”
Nick Belkin, Rutgers University
“In interactive information retrieval, users are typically studied along with their interactions with systems and information.”
Diane Kelly, University of Tennessee
7. Essential Reading (Too many to list here)
A probability ranking principle for interactive information retrieval
Norbert Fuhr
Received: 14 September 2007 / Accepted: 15 January 2008 / Published online: 7 February 2008
© Springer Science+Business Media, LLC 2008
Abstract: The classical Probability Ranking Principle (PRP) forms the theoretical basis for probabilistic Information Retrieval (IR) models, which are dominating IR theory since about 20 years. However, the assumptions underlying the PRP often do not hold, and its view is too narrow for interactive information retrieval (IIR). In this article, a new theoretical framework for interactive retrieval is proposed: The basic idea is that during IIR, a user moves between situations. In each situation, the system presents to the user a list of choices, about which s/he has to decide, and the first positive decision moves the user to a new situation. Each choice is associated with a number of cost and probability parameters. Based on these parameters, an optimum ordering of the choices can be derived: the PRP for IIR. The relationship of this rule to the classical PRP is described, and issues of further research are pointed out.
Keywords: Probabilistic retrieval · Interactive retrieval · Optimum retrieval rule
Inf Retrieval (2008) 11:251–265
DOI 10.1007/s10791-008-9045-0
Information Foraging
Peter Pirolli and Stuart K. Card
UIR Technical Report
Funded in part by the Office of Naval Research
January 1999
Methods for Evaluating Interactive Information Retrieval Systems with Users
Diane Kelly, 2009 (FTIR)
A Probability Ranking Principle for Interactive Information Retrieval
Norbert Fuhr, 2008 (IRJ)
The Economics in Interactive Information Retrieval
Leif Azzopardi, 2011 (SIGIR)
Information Foraging
Peter Pirolli (pictured) and Stuart Card, 1999 (Psy. Review)
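The Fuhr abstract on this slide describes a user working down a list of choices, each with cost and probability parameters, and stopping at the first positive decision. A minimal sketch of that idea follows, under loud assumptions: each choice is reduced to a hypothetical pair (acceptance probability, inspection effort), decisions are treated as independent, and the only goal is to minimise expected effort before the first acceptance. This is not the rule from the paper, which uses further parameters not given in the abstract. The names `expected_effort` and `order_choices` are made up for illustration.

```python
from itertools import permutations


def expected_effort(choices):
    """Expected effort until the first accepted choice (or the end of the list).

    Each choice is a hypothetical (p_accept, effort) pair: the probability that
    the user accepts it, and the effort needed to inspect it.
    """
    total = 0.0
    p_reach = 1.0  # probability that the user gets as far as this choice
    for p_accept, effort in choices:
        total += p_reach * effort
        p_reach *= 1.0 - p_accept  # the user moves on only after a rejection
    return total


def order_choices(choices):
    """Order choices by decreasing acceptance probability per unit of effort."""
    return sorted(choices, key=lambda c: c[0] / c[1], reverse=True)


# Illustrative numbers only; they are not taken from any study.
choices = [(0.2, 1.0), (0.9, 5.0), (0.5, 2.0)]
ranked = order_choices(choices)

# Brute-force check over all orderings of this small example.
best_by_search = min(permutations(choices), key=expected_effort)

print(ranked, expected_effort(ranked))
print(list(best_by_search) == ranked)
```

Under these assumptions, sorting by probability per unit of effort matches the brute-force optimum for the example. When every choice costs the same effort, the ordering reduces to ranking by decreasing probability, as in the classical PRP.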
8. Lecture Outline
• We’ll be considering IIR from a modelling perspective
Session 1 (13:30-14:15)
Session 2 (14:30-15:15)
Part I: System- to User-Sided Research
Part II: The Interactive IR Process
Part III: Conceptual Modelling of Interactive IR
Part IV: Theoretical Models of Search
Part V: Evaluation and the Simulation of Interaction
Walkthrough: conducting a simulated analysis of searcher behaviour
9. Just a Note
§ Interactive Information Retrieval is a huge area of active
research – with many different facets and areas
§ We’re only going to be looking at a small subset of research
§ We won’t be looking at user studies, for example
§ When I say “model” while I have the floor, I am referring to a
model of the interactions a searcher undertakes – not
some kind of retrieval model
24. System vs. User-Sided Search
System-Sided Evaluation (ranking, efficiency, etc.): concerned with the development of retrieval models, efficiency improvements, etc.
User-Sided Evaluation (interaction, presentation, etc.): concerned with the examination of the interactions between a system and searcher, the presentation of results, etc.
Components shown in the diagram:
§ Document Corpus: collection of documents
§ Indexing Process: converting to an index
§ Index: various data structures
§ Retrieval Model: scores documents
§ Retrieval Engine: returns a (ranked) list of documents, given an index, retrieval model and query
§ Batch Queries: for system evaluation
§ Judgements: created by assessors
§ Interface/SERP: generation of a Search Engine Results Page (SERP) to display matching results
§ Searchers: with an information need, seeking to satisfy said need
§ Information Need: what to search for
§ Query/Queries: information need in term(s)
§ Interaction: clicking links, examining...
Diagram regions: Interaction Cycle; “Classical IR” Research
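The system-sided half of this slide can be sketched in a few lines of Python. This is only an illustrative toy, not how any particular engine works: the whitespace tokeniser, the summed term-frequency scoring, the three-document corpus and every function name are assumptions made for the example.

```python
from collections import Counter, defaultdict

def build_index(corpus):
    """Indexing process: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in corpus.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index

def tf_score(index, query):
    """Retrieval model (toy): score = summed term frequency of the query terms."""
    scores = Counter()
    for term in query.lower().split():
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf
    return scores

def retrieve(index, query, model=tf_score, k=10):
    """Retrieval engine: ranked list of (doc_id, score), given index, model and query."""
    scores = model(index, query)
    # Sort by descending score, then by doc_id so that ties have a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]

# Hypothetical document corpus.
corpus = {
    "d1": "canberra is the capital city of australia",
    "d2": "canberra airport arrivals and departures",
    "d3": "sydney is the largest city in australia",
}
index = build_index(corpus)
ranking = retrieve(index, "canberra australia")
print(ranking)  # [('d1', 2), ('d2', 1), ('d3', 1)]
```

Swapping `tf_score` for another scoring function changes the retrieval model without touching the index or the engine, which is the separation the diagram draws.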
25. The (Interactive) IR Spectrum
[Figure: the IIR spectrum, numbered 1 to 8, running from System Focused to User/Searcher Focused, with the “Archetypical IIR Study” marked]
Figure adapted from Diane Kelly’s IIR evaluation book. With permission.
§ Users/searchers are involved in different studies to
varying degrees (or not at all!) – this spectrum is a handy
way to categorise them
26. The (Interactive) IR Spectrum
[Figure: the IIR spectrum, numbered 1 to 8, running from System Focused to User/Searcher Focused, with the “Archetypical IIR Study” marked. Labelled so far: TREC-style studies]
Figure adapted from Diane Kelly’s IIR evaluation book. With permission.
§ “TREC-style” studies were for system-sided research
§ Assessors create the relevance judgements, but no real
interactions are observed per se
27. The (Interactive) IR Spectrum
[Figure: the IIR spectrum, numbered 1 to 8, running from System Focused to User/Searcher Focused, with the “Archetypical IIR Study” marked. Labelled so far: TREC-style studies; “User” makes relevance assessments]
Figure adapted from Diane Kelly’s IIR evaluation book. With permission.
§ What if you use a collection (i.e., web-based experiments)
where no relevance judgements are available?
§ Typically used for the creation of document collections (i.e., is
this document relevant to the information need?)
28. The (Interactive) IR Spectrum
[Figure: the IIR spectrum, numbered 1 to 8, running from System Focused to User/Searcher Focused, with the “Archetypical IIR Study” marked. Labelled so far: TREC-style studies; “User” makes relevance assessments; Log analysis; Filtering and SDI]
Figure adapted from Diane Kelly’s IIR evaluation book. With permission.
§ The dissemination of “transaction logs” from search engines to
improve ranking models, etc.
§ Assumptions made on user intention – huge volumes of data
allow researchers to identify important regularities
29. The (Interactive) IR Spectrum
[Figure: the IIR spectrum, numbered 1 to 8, running from System Focused to User/Searcher Focused, with the “Archetypical IIR Study” marked. Labelled so far: TREC-style studies; “User” makes relevance assessments; Log analysis; Filtering and SDI; TREC interactive studies]
Figure adapted from Diane Kelly’s IIR evaluation book. With permission.
§ The typical IIR study – some system/interface is
evaluated, and we observe the searcher’s behaviour
§ Can be behavioural (interactions) or experience (surveys, etc.)
§ Typically report both system and human-based measures
30. The (Interactive) IR Spectrum
[Figure: the IIR spectrum, numbered 1 to 8, running from System Focused to User/Searcher Focused, with the “Archetypical IIR Study” marked. Labels: TREC-style studies; “User” makes relevance assessments; Log analysis; Filtering and SDI; TREC interactive studies; Experimental information behaviour; Information seeking behaviour with IR systems; Information seeking behaviour in context]
Figure adapted from Diane Kelly’s IIR evaluation book. With permission.
§ 6 Isolation of specific components (i.e., controlling what
results are returned) – typically used in psychology studies
§ 8 Human-centric studies, where qualitative surveys and
interviews are typically conducted
31. Classical IR: The Cranfield Experiments
§ Much of the classical IR research follows
the Cranfield experiments devised by
Cyril Cleverdon at Cranfield University
§ Concept of a corpus, a set of
information needs, and judgements
Did you know that Cranfield is the only university in the world with a functional airport?
“…a laboratory type situation where, freed as far as
possible from the contamination of operational variables,
the performance of index languages could be considered
in isolation.” Cleverdon (1991)
Image credit: https://www.cranfield.ac.uk/press/news-2016/cranfield-university-announces-plans-for-festival-of-flight
32. Assumptions of Cranfield
§ Assumes a static information need
§ Once a searcher starts, their need does not evolve over time
§ Representative of an entire population
§ Searchers assume the same documents are relevant
§ The list of documents is total and complete
§ All relevant documents have been identified beforehand
While good for (simplifying) experimentation, are these assumptions realistic?
33. Cranfield “User Models”
§ Assumptions are good for reproducible, system-sided
research; lacking for user-sided research
§ Carried across to Information Retrieval evaluation fora
(e.g., TREC, NTCIR, CLEF…)
§ Over the years, these fora have provided numerous test collections and
topics to help promote reproducible research
Researchers often state that Cranfield-style experiments neglect the user; they
don’t! They actually abstract the user…
34. The “Cranfield Style” Searcher Model
[Flowchart: Issue Query; Click Summary; Click Document; Consider Relevant; decision “Cutoff k reached?” (Yes/No); decision “No more topics?” (Yes/No)]
38. “Cranfield Style” Searcher Assumptions
[Flowchart: Issue Query; Click Summary; Click Document; Consider Relevant; decision “Cutoff k reached?” (Yes/No); decision “No more topics?” (Yes/No)]
§ A single query is issued for each topic
§ Every document is inspected, regardless of relevance to the topic being examined
§ A user will “examine” to a fixed depth of k
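The three assumptions can be made concrete with a small simulated searcher. This is a minimal sketch under stated assumptions: the function name, the toy topics, rankings and judgements are all hypothetical, and precision at k is used only as one convenient measure.

```python
def cranfield_searcher(topics, run, qrels, k=10):
    """Simulate the abstracted searcher: one query per topic, every document
    down to depth k is inspected, and there is no early stopping."""
    results = {}
    for topic_id, query in topics.items():   # a single query is issued per topic
        ranked = run(query)[:k]              # examine to a fixed depth of k
        relevant = qrels.get(topic_id, set())
        # Every document is inspected, whether or not it is relevant.
        hits = sum(1 for doc_id in ranked if doc_id in relevant)
        results[topic_id] = hits / k         # precision at k
    return results

# Hypothetical toy data (not taken from any real test collection).
topics = {"t1": "canberra australia"}
qrels = {"t1": {"d1", "d3"}}                 # assessor judgements
fake_rankings = {"canberra australia": ["d1", "d2", "d3", "d4"]}

scores = cranfield_searcher(topics, fake_rankings.get, qrels, k=4)
print(scores)  # {'t1': 0.5}
```

Nothing in the loop depends on what the searcher has already seen, which is exactly the behaviour the interactive models later in the lecture relax.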
39. So what is More Realistic?
§ Interactive Information Retrieval saves the day…
§ We begin to shift away from these generally accepted
assumptions for Information Retrieval evaluation
§ We start to consider more realistic (but more complex)
interaction models to better explain what a user does
when interacting with a search system and/or information
51. The Interactive IR Process
[Diagram elements, in the order the slides introduce them: Information Need (what to search for); Retrieval Engine; Query/Queries (information need in term(s)); Interface/SERP; Examination; Attractive?; Click; Document(s); Examination; Relevant?; STOP; STOP; Reformulation]
Example document text: The word "Canberra" is popularly claimed to derive from the word Kambera or Canberry, which is claimed to mean "meeting place" in Ngunnawal, one of the Indigenous languages spoken in the district by...
52. The Information Need
§ The reason why a searcher approaches a retrieval system!
The Anomalous State of Knowledge (see Belkin)
A knowledge gap, or inconsistency with the world…
§ Is considered to be
dynamic (changes) as a
search session progresses
53. SERPs, Sessions, and Interactions
Example Search Engine Results Page (SERP)
Query Terms: canberra australia
Left Rail (Result Summaries), each made up of a Title, a Source and Snippet Fragments:
Canberra - Wikipedia
https://en.wikipedia.org/wiki/Canberra
Canberra is the capital city of Australia. With a population of 403,468, it is Australia's largest inland city and the eighth-largest city overall. The city is located...
VisitCanberra: Canberra Holidays, Accommodation & Things...
https://visitcanberra.com.au/
Discover things to do in Canberra with our guide. Experience culture at the National Portrait Gallery and the National Gallery of Australia, or visit the...
Canberra Airport | Arrivals, Departures, Lounges, Transport...
https://www.canberraairport.com.au/
Official website for Canberra Airport - The latest information on flights, parking, transport and more. View live information on arrivals and departures.
Right Rail (Information Card):
Canberra
Capital of Australia
Canberra is the capital city of Australia. With a population of 403,468, it is Australia’s largest inland city and the eighth-largest city overall.
Wikipedia
54. SERPs, Sessions, and Interactions
Interactions are recorded and
stored for post-hoc analysis
55. SERPs, Sessions, and Interactions
Search sessions often
consist of multiple queries
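To illustrate what recorded interactions and multi-query sessions might look like in post-hoc analysis, here is a rough sketch. The log format (session id, seconds since session start, event type, detail) and the event names are assumptions made for the example, not a format used by any particular system.

```python
from collections import defaultdict

# Hypothetical log records: (session_id, seconds_since_start, event, detail).
log = [
    ("s1", 0, "QUERY", "canberra australia"),
    ("s1", 7, "CLICK", "https://en.wikipedia.org/wiki/Canberra"),
    ("s1", 41, "QUERY", "canberra airport"),
    ("s1", 48, "CLICK", "https://www.canberraairport.com.au/"),
    ("s2", 0, "QUERY", "canberra holidays"),
]

def summarise_sessions(records):
    """Group logged events by session and count the queries and clicks in each."""
    summary = defaultdict(lambda: {"queries": 0, "clicks": 0})
    for session_id, _time, event, _detail in records:
        if event == "QUERY":
            summary[session_id]["queries"] += 1
        elif event == "CLICK":
            summary[session_id]["clicks"] += 1
    return dict(summary)

sessions = summarise_sessions(log)
print(sessions)
# {'s1': {'queries': 2, 'clicks': 2}, 's2': {'queries': 1, 'clicks': 0}}
```

Session `s1` contains two queries, the second being a reformulation of the first, which a one-query-per-topic model cannot represent.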
56. Search is Inherently Interactive
§ We know that the search process is not rigid!
§ Information needs are dynamic, and vary as a searcher
consumes information
§ Thinking about the complexity of a SERP and the interactions,
the basic searcher model is inadequate for demonstrating
what actually takes place when searching
§ Researchers have devised a number of expanded conceptual
and theoretical models to better explain IIR
57. Conceptual Modelling of Interactive IR
How can we better represent the search process?
PART III
58. Conceptual Models of Search
§ A conceptual model of search attempts to capture the
key interactions that take place during a search session
§ Being conceptual, they act as scaffolding – you can take
the scaffolding, and build all sorts of “user interaction models”
with them (instantiate each block in different ways)
Conceptual models differ from theoretical models; see later!
[Example flowchart blocks: Write Slide; Scream into Void; Complete?]
59. Expanded Conceptual Models
[Flowchart: Issue Query; Examine Snippet; Attractive? (Yes/No); Read Document; Relevant? (Yes/No); Continue Examining SERP? (Yes/No); Stop Session? (Yes/No)]
Adapted (with permission) from Baskaya et al. (see CIKM 2013 proceedings)
60. Expanded Conceptual Models
[State diagram: Formulate Query; Scan a Snippet; Click a Link; Read a Document; Judge Document Relevance; Stop Session. Each transition is labelled with a probability, either P=1 or P<=1]
Adapted (with permission) from Baskaya et al. (see CIKM 2013 proceedings)
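A probabilistic model of this kind can be simulated as a random walk over the states. The sketch below is illustrative only: the state names follow the slide, but which transitions exist and every probability value are assumptions made for the example, not figures from Baskaya et al.

```python
import random

# Hypothetical transition probabilities; each row sums to 1.
TRANSITIONS = {
    "formulate_query": [("scan_snippet", 1.0)],
    "scan_snippet": [("click_link", 0.4), ("scan_snippet", 0.4),
                     ("formulate_query", 0.1), ("stop_session", 0.1)],
    "click_link": [("read_document", 1.0)],
    "read_document": [("judge_relevance", 1.0)],
    "judge_relevance": [("scan_snippet", 0.6), ("formulate_query", 0.2),
                        ("stop_session", 0.2)],
}

def simulate_session(rng, max_steps=1000):
    """Walk the state diagram until the session stops (or max_steps is hit)."""
    state = "formulate_query"
    path = [state]
    while state != "stop_session" and len(path) < max_steps:
        next_states, weights = zip(*TRANSITIONS[state])
        state = rng.choices(next_states, weights=weights)[0]
        path.append(state)
    return path

path = simulate_session(random.Random(42))
print(" -> ".join(path))
```

Running many sessions with different seeds gives a distribution of session lengths, which can then be compared against logged behaviour.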
61. Expanded Conceptual Models
§ General flow of the searcher is the same as before
§ Allows the searcher to select which summary to read – non-linear!
§ Also incorporates the ability to select a search system to use
For more information, have a look at Thomas et al. (IIiX, precursor to CHIIR, in 2014)
[Flowchart: Select System; Enter Query; Choose position i; Evaluate summary i; Enticed by summary i? (Yes/No); Click summary link? (Yes/No); Read (part of) document; End query? (Yes/No); End session? (Yes/No); Decide next action; Change query; Change retrieval system. Symbols shown: s_i, r_i]
62. Expanded Conceptual Models
§ The Complex Searcher Model – adapted from observing logs and previous conceptual models
§ More on this later
§ We can consider blocks in isolation, or as part of the entire process
https://www.dmax.org.uk/thesis/
[Flowchart: Examine Topic; Generate Queries; Select Query (exit: Out of queries); Issue Query; View SERP; Examine Snippet; Click Document; Assess Document; Mark Document. Decision points (Yes/No): Appears Useful?; Attractive?; Relevant?; Continue on SERP?; Continue?]
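One way to read "blocks in isolation, or as part of the entire process" is that each decision point is a pluggable component. The sketch below is a simplified assumption-laden illustration, not the model from the thesis: it leaves out the SERP-level "Appears Useful?" and session-level "Continue?" decisions, and all function and parameter names are hypothetical.

```python
def simulate_searcher(queries, search, is_attractive, is_relevant,
                      continue_on_serp):
    """Run one simulated session; each argument instantiates one block."""
    marked = []
    for query in queries:                                    # Select Query, Issue Query
        for rank, doc in enumerate(search(query), start=1):  # View SERP
            if is_attractive(doc["snippet"]):                # Examine Snippet
                if is_relevant(doc["text"]):                 # Click and Assess Document
                    marked.append(doc["id"])                 # Mark Document
            if not continue_on_serp(rank):                   # Continue on SERP?
                break
    return marked                                            # Out of queries: stop

# Hypothetical documents and decision components.
docs = [
    {"id": "d1", "snippet": "Canberra is the capital", "text": "canberra capital city"},
    {"id": "d2", "snippet": "Airport arrivals", "text": "flights and parking"},
    {"id": "d3", "snippet": "Visit Canberra", "text": "canberra holidays"},
]
marked = simulate_searcher(
    queries=["canberra australia"],
    search=lambda query: docs,
    is_attractive=lambda snippet: "canberra" in snippet.lower(),
    is_relevant=lambda text: "canberra" in text,
    continue_on_serp=lambda rank: rank < 2,   # fixed-depth stopping at rank 2
)
print(marked)  # ['d1']
```

Replacing `continue_on_serp` with a different stopping rule changes one block while leaving the rest of the process untouched.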
64. Theoretical Models of Search
§ IIR researchers have proposed mathematically grounded
models that provide us with a descriptive, predictive ability
to explain how and why searchers behave in a given way
§ Such models have limitations, too!
§ Assumptions about human behaviour (e.g., behaving rationally)
§ Mathematically based models can be considered closed-form, which can
make it hard to model complex phenomena (1)
(1) Fishwick (1995) outlines simulation as a means of modelling complex phenomena; see later.
65. Theoretical Models of Search
§ Three competing theoretical models have been proposed…
Interactive Probability Ranking Principle (Norbert Fuhr, 2008): expanding the PRP
Search Economic Theory (Leif Azzopardi, 2011): economic theory
Information Foraging Theory (Peter Pirolli and Stuart Card, 1999): animal behaviour
All three theories have been shown to be mathematically equivalent. See Azzopardi and Zuccon (ICTIR 2015).
66. The Berrypicking Model
§ A well known model where searchers are considered
analogous to foragers, scavenging for food in the wild
See Bates (1989).
73. The Berrypicking Model
§ A well known model where searchers are considered
analogous to foragers, scavenging for food in the wild
§ Highly descriptive, but importantly, not predictive
§ You go for the juiciest berries, but the model does not provide a
rationale as to why (accruing gain)
§ How long should a forager spend in a given berry bush?
§ We need models that offer predictive power to answer this
Bates does publish a later paper that discusses cost/benefit analyses, however.
74. Information Foraging Theory
§ Derived from Foraging Theory, the study
of how animals forage for food
§ Examining their behaviours, where they
attempt to maximise their gain (intake)
per unit of time (in order to survive)
A totally fascinating book; see Stephens and Krebs (1986)
75. Information Foraging Theory
§ Pirolli & Card applied Foraging Theory to search!
§ Foraging Theory consists of three models…
Diet Model; Patch Model; Scent Model
83. Patches and Scent
§ An area in which gains can be made is called a patch
§ A forager will follow a given scent to the patch, and make
decisions as to whether to head towards it, or once inside,
when to leave it
[Diagram labels: Forager; Scent (Pollen); Patch; Between patch time; Within patch time; STOP]
98. Predictive Power of IFT
§ We can use IFT to predict when
someone should stop examining
results on a SERP (for example)
§ We can use the Marginal Value
Theorem to predict when you should
stop and leave a patch
§ This is called the optimal stopping
point – gain diminishes after this!
[Figure: Gain Curve plotting Cumulative Gain (CG) against Time, with the time axis split into Between Patch and Within Patch. Labels: Average Rate of Gain, STOP. Annotation: "Prediction: stop at this point!"]
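The stopping prediction can be illustrated numerically. The sketch below is not taken from the slides: it assumes an exponential diminishing-returns gain curve, and the names `g_max`, `k` and `between_patch_time` are hypothetical. Under those assumptions, the Marginal Value Theorem says to leave the patch when the marginal gain g'(t) falls to the average rate of gain g(t) / (between-patch time + t).

```python
import math


def gain(t, g_max=10.0, k=0.5):
    """Cumulative gain after t time units in a patch (assumed curve shape)."""
    return g_max * (1.0 - math.exp(-k * t))


def marginal_gain(t, g_max=10.0, k=0.5):
    """Derivative of the assumed gain curve: gain per extra time unit."""
    return g_max * k * math.exp(-k * t)


def average_rate(t, between_patch_time, g_max=10.0, k=0.5):
    """Average rate of gain, counting the time spent reaching the patch."""
    return gain(t, g_max, k) / (between_patch_time + t)


def optimal_stopping_time(between_patch_time, g_max=10.0, k=0.5,
                          upper=1000.0, tol=1e-9):
    """Find t where marginal gain equals the average rate of gain.

    Uses bisection on f(t) = g'(t) * (t_b + t) - g(t), which is positive
    at t = 0 (for t_b > 0) and strictly decreasing for this concave curve.
    """
    def f(t):
        return (marginal_gain(t, g_max, k) * (between_patch_time + t)
                - gain(t, g_max, k))

    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid  # still gaining faster than average: stay longer
        else:
            hi = mid  # marginal gain has dropped below average: leave earlier
    return (lo + hi) / 2.0


if __name__ == "__main__":
    for t_b in (1.0, 2.0, 4.0):
        t_star = optimal_stopping_time(t_b)
        print(f"between-patch time {t_b}: stop after {t_star:.3f} time units")
```

In this sketch a longer between-patch time yields a later stopping point, which matches the usual reading of the theorem: the more costly it is to reach a new patch (or SERP), the longer it pays to stay in the current one.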
99. Predicting Other Behaviours
§ IFT can predict a variety of other behaviours too – it’s
how you apply it that is important
§ Whether to enter a patch/SERP, etc…
§ Competing theories (e.g., SET) have also been used to
predict various search behaviours
§ For example, query length vs. gain trade-offs – what is the optimal
query length for a searcher to issue? [1]
§ Deals with cost/benefit trade-offs – what is most efficient?
[1] See the tutorial by Azzopardi and Zuccon on developing economic models.
101. Why is this Important?
§ Theoretical models provide us with an underpinning and
explanation for (rational) searcher behaviours
§ Conceptual models are based on what theoretical models
suggest plus real-world observations of searcher behaviours
to formalise the steps and decisions taken
§ Together, we have a strong set of tools to provide a credible
explanation of the IIR process – but how do we know they
are any good?
102. How do we Evaluate these Models?
§ Evaluation is important – how do we know they are
credible? How do we know they are useful?
§ We can evaluate these models through a combination of
user studies and the simulation of interaction
§ Following a long line of IR research using simulation
§ Offers the freedom to explore a wide range of scenarios (i.e., what
if experiments) all at a low cost, without searcher fatigue, etc.
Refer to Fishwick (1995) for a detailed and nuanced argument for simulation.
103. The Simulation of Interaction
§ We can instantiate each of
the building blocks and
decision points in different
ways to see what happens
§ Studies have examined simulated
queries, browsing behaviours, cost
vs. time, session performance…
§ These experiments must
be properly grounded –
perhaps using interaction
data from a real-world study
[Figure: flowchart of the simulated search process. Steps: Examine Topic, Generate Queries, Select Query, Issue Query, View SERP, Examine Snippet, Click Document, Assess Document, Mark Document. Yes/No decision points: Appears Useful?, Attractive?, Relevant?, Continue on SERP?, Continue? The process also ends when the searcher is out of queries.]
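A minimal sketch of how the building blocks and decision points might be instantiated in a simulation. It is not the model used in the studies mentioned above: the function name, the parameters `p_click`, `p_mark`, `patience` and `time_budget`, and the unit costs are all hypothetical choices made for illustration.

```python
import random


def simulate_session(serps, p_click=0.6, p_mark=0.8, patience=3,
                     time_budget=50, seed=0):
    """Simulate one search session and return (documents marked, cost spent).

    serps: one list per query; each entry is True if that result is relevant.
    p_click: chance that a snippet looks attractive enough to click.
    p_mark: chance that a relevant document is recognised and marked.
    patience: consecutive unrewarding results tolerated before leaving a SERP.
    time_budget: total cost units available for the whole session.
    """
    rng = random.Random(seed)
    marked = 0
    cost = 0
    for serp in serps:                      # Select Query (until out of queries)
        if cost >= time_budget:             # Continue?
            break
        cost += 1                           # Issue Query / View SERP
        unrewarding = 0
        for is_relevant in serp:
            if cost >= time_budget:         # Continue?
                return marked, cost
            cost += 1                       # Examine Snippet
            if rng.random() < p_click:      # Attractive?
                cost += 1                   # Click Document / Assess Document
                if is_relevant and rng.random() < p_mark:  # Relevant?
                    marked += 1             # Mark Document
                    unrewarding = 0
                    continue
            unrewarding += 1
            if unrewarding >= patience:     # Continue on SERP?
                break
    return marked, cost


if __name__ == "__main__":
    example_serps = [[True, False, False, False, True], [True, True]]
    for patience in (1, 3, 5):
        result = simulate_session(example_serps, patience=patience)
        print(f"patience={patience}: marked, cost = {result}")
```

Varying one parameter at a time, as in the loop above, is the kind of "what if" experiment simulation makes cheap. In a real study the probabilities and costs would need to be grounded in interaction data rather than chosen by hand.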