The document discusses dynamic search and modeling user information seeking behavior. It describes:
1) Characteristics of dynamic search tasks including rich user interactions over multiple queries, temporal dependency between queries and clicked documents, and aiming to fulfill complex evolving information needs.
2) A dual-agent reinforcement learning framework for dynamic search where the user and search engine are modeled as cooperative agents taking actions and receiving rewards.
3) Experiments on TREC datasets showing the proposed approaches outperform other retrieval systems in modeling dynamic search tasks.
1. Dynamic Search and Beyond
Prof. Grace Hui Yang
InfoSense Group
Department of Computer Science
Georgetown University
huiyang@cs.georgetown.edu
Sep 29, 2018
CCIR 2018 @ Guilin
2. • Our graduate program focuses on Information Systems, Privacy and Security, and Computer Theory.
• Ph.D., Master’s, Postdocs
3. • ACM International Conference on Theory of Information Retrieval (ICTIR)
• Its importance in the IR community
• Acknowledgement to Guangxi Normal University, CCF, and many old and new friends
4. Statistical Modeling of Information Seeking
• Aims to connect users’ information-seeking behaviors with retrieval models
• The ‘dynamics’ in the search process are the primary elements to be modeled
• I call this set of novel retrieval algorithms “Dynamic IR Modeling”
5. Task: Dynamic IR
• The information retrieval task that aims to find relevant documents for a session of multiple queries
• It arises when information needs are complex, vague, and evolving, often containing multiple subtopics
• Cannot be resolved by one-shot ad-hoc retrieval
• e.g. “Purchasing a home”, “What is the meaning of life”
6. An Illustration
E.g. Find what city and state Dulles airport is in, what shuttles, ride-sharing vans, and taxi cabs connect the airport to other cities, what hotels are close to the airport, what are some cheap off-airport parking options, and what are the metro stops close to the Dulles airport.
(Diagram: the user’s information need drives a loop of interactions between the User and the Search Engine.)
7. Characteristics of Dynamic IR
• Rich interactions
  • Query formulation
  • Document clicks
  • Document examination
    • eye movements
    • mouse movements
    • etc.
8. Characteristics of Dynamic IR
• Temporal dependency
(Diagram: an information need I drives iterations 1 … n; in iteration i, query q_i produces ranked documents D_i and clicked documents C_i.)
9. Characteristics of Dynamic IR
• Aim for a long-term goal
• Great if we can find early what a user ultimately wants
10. Reinforcement Learning (RL)
• Fits well in this trial-and-error setting
• Learns from repeated, varied attempts that continue until success
• The learner (also known as the agent) learns from its dynamic interactions with the world
  • rather than from a labeled dataset as in supervised learning
• The stochastic model assumes that the system’s current state depends on the previous state and action in a non-deterministic manner
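The trial-and-error loop described above can be sketched with tabular Q-learning on a toy problem. This is a generic illustration of the agent-environment loop, not the paper’s search setting; the chain environment, constants, and names are all hypothetical.

```python
import random

# Toy chain environment: states 0..4; reaching state 4 gives reward 1.
# Actions: 0 = left, 1 = right.
N_STATES, GOAL = 5, 4

def step(state, action):
    next_state = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, eps = 0.5, 0.9, 0.2
random.seed(0)

for episode in range(200):
    s, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the current estimate, sometimes explore.
        a = random.randrange(2) if random.random() < eps else max((0, 1), key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a').
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# After repeated attempts, the greedy policy moves right toward the goal.
policy = [max((0, 1), key=lambda x: Q[s][x]) for s in range(N_STATES)]
```

The agent is never told the optimal policy; it discovers it purely from interaction, which is the property that makes RL attractive for modeling session search.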
12. QUERY CHANGE MODEL
○ Based on a Markov Decision Process (MDP)
○ States: queries (observable)
○ Actions:
  • User actions: add / remove / keep query terms
    ○ Nicely correspond to our definition of query change
  • Search engine actions: increase / decrease / keep term weights
○ Rewards: nDCG
[Guan, Zhang, and Yang SIGIR 2013]
13. SEARCH ENGINE AGENT’S ACTIONS
Query change | ∈ D_{i−1}? | Action    | Example
qtheme       | Y          | increase  | “pocono mountain” in s6
qtheme       | N          | increase  | “france world cup 98 reaction” in s28: france world cup 98 reaction stock market → france world cup 98 reaction
+∆q          | Y          | decrease  | ‘policy’ in s37: Merck lobbyists → Merck lobbyists US policy
+∆q          | N          | increase  | ‘US’ in s37: Merck lobbyists → Merck lobbyists US policy
−∆q          | Y          | decrease  | ‘reaction’ in s28: france world cup 98 reaction → france world cup 98
−∆q          | N          | no change | ‘legislation’ in s32: bollywood legislation → bollywood law
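The action table above is small enough to encode directly as a lookup keyed on the query-change type and whether the term appears in D_{i−1}; this sketch just restates the table in code, with illustrative names.

```python
# (query-change type, term appears in D_{i-1}) -> search engine action,
# transcribed from the slide's table.
ACTION = {
    ("theme", True):   "increase",
    ("theme", False):  "increase",
    ("added", True):   "decrease",   # old added term, already seen in D_{i-1}
    ("added", False):  "increase",   # novel added term
    ("removed", True): "decrease",
    ("removed", False): "no change",
}

def engine_action(change_type, in_prev_docs):
    """Return the search engine's term-weight action for one query change."""
    return ACTION[(change_type, in_prev_docs)]
```

For example, adding a term the user has already seen in the previous results (like ‘policy’ in s37) maps to a weight decrease, while a genuinely novel added term maps to an increase.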
14. QUERY CHANGE RETRIEVAL MODEL (QCM)
○ The Bellman equation gives the optimal value for an MDP.
○ The reward function is used as the document relevance score function and is derived backwards from the Bellman equation:
(Equation, annotated: document relevance score = current reward/relevance score + query transition model × maximum past relevance)
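For reference, the Bellman optimality equation the slide invokes, written in standard MDP notation (these are the textbook symbols, not necessarily the paper’s):

```latex
V^{*}(s) = \max_{a} \Big[ R(s,a) + \gamma \sum_{s'} T(s' \mid s, a)\, V^{*}(s') \Big]
```

QCM reads this recursion backwards: the immediate reward R plays the role of the current document relevance score, the transition term is instantiated by the query-change transition model, and the recursive value term becomes the maximum relevance carried over from past iterations.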
15. CALCULATING THE TRANSITION MODEL
• Term weights are adjusted according to the query change and the search engine actions, on top of the current reward/relevance score:
  • Increase weights for theme terms
  • Decrease weights for old added terms
  • Decrease weights for removed terms
  • Increase weights for novel added terms
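A minimal sketch of these weight adjustments, assuming a dict of term weights and multiplicative boost/damp factors; the constants and the document representation (sets of terms) are assumptions for illustration, not QCM’s actual parameterization.

```python
def adjust_weights(weights, theme_terms, added_terms, removed_terms, seen_docs,
                   boost=1.2, damp=0.8):
    """Return a new term->weight dict after one query transition.

    seen_docs: previously retrieved documents, each a set of terms (D_{t-1}).
    """
    new_w = dict(weights)
    for t in theme_terms:                      # theme terms: increase
        new_w[t] = new_w.get(t, 1.0) * boost
    for t in added_terms:
        if any(t in d for d in seen_docs):     # old added terms: decrease
            new_w[t] = new_w.get(t, 1.0) * damp
        else:                                  # novel added terms: increase
            new_w[t] = new_w.get(t, 1.0) * boost
    for t in removed_terms:                    # removed terms: decrease
        new_w[t] = new_w.get(t, 1.0) * damp
    return new_w

# Example transition: Merck lobbyists -> Merck lobbyists US policy,
# where 'policy' was already seen in a retrieved document.
w = adjust_weights({}, {"merck", "lobbyists"}, {"policy", "us"}, set(),
                   seen_docs=[{"merck", "policy"}])
```

Distinguishing old from novel added terms is what lets the model demote terms the user has already examined while promoting genuinely new directions.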
16. WIN-WIN SEARCH: DUAL-AGENT STOCHASTIC GAME
○ A Partially Observable Markov Decision Process (POMDP)
  ● Hidden states
  ● Actions
  ● Rewards
  ● Markov property
○ Two agents
  ● Cooperative game
  ● Joint optimization
[Luo, Zhang, and Yang SIGIR 2014]
17. A MARKOV CHAIN OF DECISION MAKING STATES
[Luo, Zhang, and Yang SIGIR 2014]
18. HIDDEN DECISION MAKING STATES
Starting from q0, query transitions move among four hidden decision-making states:
● SRT: Relevant & Exploitation
● SRR: Relevant & Exploration
● SNRT: Non-Relevant & Exploitation
● SNRR: Non-Relevant & Exploration
Example query transitions: scooter price → scooter stores; collecting old US coins → selling old US coins; Philadelphia NYC travel → Philadelphia NYC train; Boston tourism → NYC tourism
[Luo, Zhang, and Yang SIGIR 2014]
20. ACTIONS
• User actions (Au)
  ○ add query terms (+Δq)
  ○ remove query terms (−Δq)
  ○ keep query terms (qtheme)
• Search engine actions (Ase)
  ○ increase / decrease / keep term weights
  ○ switch on or off a search technique, e.g. to use or not to use query expansion
  ○ adjust parameters in search techniques, e.g. select the best k for the top-k docs used in PRF
• Messages from the user (Σu)
  ○ clicked documents
  ○ SAT-clicked documents
• Messages from the search engine (Σse)
  ○ top k returned documents
Messages are essentially documents that an agent thinks are relevant.
[Luo, Zhang, and Yang SIGIR 2014]
23. SEARCH ACCURACY
○ Search accuracy on the TREC 2012 Session Track
◆ Win-Win outperforms most retrieval algorithms on TREC 2012.
24. SEARCH ACCURACY
○ Search accuracy on the TREC 2013 Session Track
◆ Systems in TREC 2012 perform better than in TREC 2013
  ◆ many relevant documents are not included in the ClueWeb12 CatB collection
◆ Win-Win outperforms all retrieval algorithms on TREC 2013.
◆ It is highly effective in session search.
25. SEARCH ACCURACY FOR DIFFERENT SESSION TYPES
○ TREC 2012 sessions are classified by:
  • Product: Factual / Intellectual
  • Goal quality: Specific / Amorphous

            Intellectual  %chg      Amorphous  %chg     Specific  %chg     Factual  %chg
TREC best   0.3369        0.00%     0.3495     0.00%    0.3007    0.00%    0.3138   0.00%
Nugget      0.3305        -1.90%    0.3397     -2.80%   0.2736    -9.01%   0.2871   -8.51%
QCM         0.3870        14.87%    0.3689     5.55%    0.3091    2.79%    0.3066   -2.29%
QCM+DUP     0.3900        15.76%    0.3692     5.64%    0.3114    3.56%    0.3072   -2.10%

QCM better handles sessions that demonstrate evolution and exploration, because it treats a session as a continuous process, studying changes across query transitions and modeling the dynamics.
27. DESIGN OPTIONS
○ Is there a temporal component?
○ States – what changes with each time step?
○ Actions – how does your system change the state?
○ Rewards – how do you measure feedback or effectiveness in your problem at each time step?
○ Transition probability – can you determine this?
  • If not, a model-free approach is more suitable
ECIR’15
29. A Direct Policy Learning Framework
• Learns a direct mapping from observations to actions by gradient descent
• Defines a history: a chain of events happening in a session
• the dynamic changes of states, actions, observations, and rewards in a session
ICTIR’15
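A direct mapping from observations to actions, learned by gradient updates, can be sketched with a log-linear (softmax) policy and a REINFORCE-style step. This is a minimal illustration under assumed feature vectors, not the framework's exact algorithm:

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def policy_probs(theta, features_per_action):
    # Log-linear policy: score each action by theta . phi(observation, action).
    scores = [sum(t * f for t, f in zip(theta, phi)) for phi in features_per_action]
    return softmax(scores)

def reinforce_update(theta, features_per_action, action, reward, lr=0.1):
    # Gradient of log pi(a|o) is phi(o, a) - E_pi[phi]; step in the reward direction.
    probs = policy_probs(theta, features_per_action)
    expected = [sum(p * phi[k] for p, phi in zip(probs, features_per_action))
                for k in range(len(theta))]
    return [t + lr * reward * (features_per_action[action][k] - expected[k])
            for k, t in enumerate(theta)]

theta = [0.0, 0.0]
feats = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical features for two candidate actions
theta = reinforce_update(theta, feats, action=0, reward=1.0)
probs = policy_probs(theta, feats)
print(probs)  # action 0 now has the higher probability
```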
30. Browse Phase
• Actor: the user
• It happens
• after the search results are shown to the user
• before the user starts to write the next query
• Records how the user perceives and examines the (previously retrieved) search results
Decompose a history
31. Query Phase
• Actor: the user
• It happens
• when the user writes a query
• Assuming the query is created based on
• what has been seen in the browse phase
• the information need
32. Rank Phase
• Actor: the search engine
• It happens
• after the query is entered
• before the search results are returned
• It is where the search algorithm takes place
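The browse, query, and rank phases that decompose a history can be pictured as a typed event chain; this is an illustrative sketch only, with all class names hypothetical:

```python
from dataclasses import dataclass
from typing import List, Union

# Illustrative decomposition of a session history into phase events.
@dataclass
class QueryPhase:        # actor: the user; writes the next query
    query: str

@dataclass
class RankPhase:         # actor: the search engine; produces a ranked list
    ranked_docs: List[str]

@dataclass
class BrowsePhase:       # actor: the user; examines previously retrieved results
    clicked_docs: List[str]

History = List[Union[QueryPhase, RankPhase, BrowsePhase]]

history: History = [
    QueryPhase("ebola outbreak"),
    RankPhase(["d1", "d2", "d3"]),
    BrowsePhase(["d2"]),
    QueryPhase("ebola outbreak treatment"),
]
# Phases alternate between the two actors: user (query, browse) and engine (rank).
print(len(history))  # -> 4
```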
35. Ranking Function
• It originally represents the probability of selecting a (ranking) action
• In our context, the probability of selecting d to be put at the top of a ranked list, given the current observation and parameters θ_t at the t-th iteration
• Then we sort the documents by it to generate the document list
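One way to realize such a ranking function is a softmax over per-document scores, which turns scores into selection probabilities and then sorts by them; a minimal sketch under assumed scores:

```python
import math

def selection_probs(doc_scores):
    # Probability of selecting each document for the top of the list (softmax over scores).
    m = max(doc_scores.values())
    exps = {d: math.exp(s - m) for d, s in doc_scores.items()}
    z = sum(exps.values())
    return {d: e / z for d, e in exps.items()}

def rank(doc_scores):
    # Sort documents by selection probability to generate the ranked list.
    probs = selection_probs(doc_scores)
    return sorted(probs, key=probs.get, reverse=True)

print(rank({"d1": 0.2, "d2": 1.5, "d3": -0.3}))  # -> ['d2', 'd1', 'd3']
```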
36. Updates:
Feature function:
Query Features
• Test if a search term w ∈ q_t and w ∈ q_{t−1}
• # of times that a term w occurs in q_1, q_2, …, q_t
Query-Document Features
• Test if a search term w ∈ +Δq_t and w ∈ D_{t−1}
• Test if a document d contains a term w ∈ −Δq_t
• tf·idf score of a document d to q_t
Click Features
• Test if there are SAT-Clicks in D_{t−1}
• # of times a document is clicked in the current session
• # of seconds a document is viewed and re-viewed in the current session
Query-Document-Click Features
• Test if q_i leads to SAT-Clicks in D_i, where i = 0…t−1
Session Features
• position in the current session
(Features are grouped by the Browse, Query, and Rank phases.)
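A few of the query and query-document features above can be sketched as simple indicator and count functions; the signatures and tokenization here are hypothetical simplifications:

```python
def term_repeats(w, q_t, q_prev):
    # Query feature: does term w appear in both q_t and q_{t-1}?
    return w in q_t.split() and w in q_prev.split()

def term_frequency_in_session(w, queries):
    # Query feature: number of times term w occurs in q_1, q_2, ..., q_t.
    return sum(q.split().count(w) for q in queries)

def added_term_in_prev_results(w, q_t, q_prev, prev_docs):
    # Query-document feature: is w a newly added term (+Δq_t) that already
    # occurred in the previously returned documents D_{t-1}?
    added = set(q_t.split()) - set(q_prev.split())
    return w in added and any(w in d.split() for d in prev_docs)

queries = ["ebola news", "ebola outbreak news"]
print(term_repeats("ebola", queries[1], queries[0]))            # -> True
print(term_frequency_in_session("news", queries))               # -> 2
print(added_term_in_prev_results("outbreak", queries[1], queries[0],
                                 ["ebola outbreak reported"]))  # -> True
```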
37. Efficiency - TREC 2012 Session
• lemur > dpl > qcm > winwin
• dpl achieves a good balance between accuracy and efficiency
• the conclusions are consistent across experiments on the TREC 2012–2014 Session Tracks
DPL
38. TREC 2012 Session
• dpl achieves a significant improvement over the TREC best run
• We found similar conclusions on the TREC 2013 and 2014 Session Tracks
39. TREC DYNAMIC DOMAIN 2015-2017
! The search task focuses on specific domains
! Over the three years, we explored domains from the dark web (illicit goods and Ebola) and polar science to more general web domains (NYT)
! What is consistent?
○ The participating system is expected to help the user through interactions and get their tasks done
○ The user’s information need usually consists of multiple aspects
41. FEEDBACK FROM A SIMULATED USER
! https://github.com/trec-dd/trec-dd-jig
42. DOMAIN USED IN 2017
○ New York Times Annotated Corpus
! Sandhaus, Evan. "The New York Times Annotated Corpus." Linguistic Data Consortium, Philadelphia 6, no. 12 (2008): e26752.
! 20 years of New York Times archives, from January 1, 1987 to June 19, 2007
! Uncompressed size: 16 GB
! Over 1.8 million documents
! Over 650,000 article summaries written by library scientists
! Over 1,500,000 articles manually tagged by library scientists
! Over 275,000 algorithmically tagged articles that have been hand-verified by professionals
43. ANNOTATION
○ Create Topic and Relevance Judgement at the same time
! Not by pooling
○ Topic – subtopic – passage – Relevance Judgement
○ The challenge: how to be complete
44. ○ Useful information that the user gains
! Raw relevance score
○ Discounting
! Based on document ranking
! Based on diversity
○ User’s efforts
! Time spent
! Lengths of documents being viewed
EVALUATION METRICS FOR DYNAMIC SEARCH
45. ○ Most session search metrics combine all those factors into one overwhelmingly complex formula
○ The optimal value, a.k.a. the upper bound, of those metrics varies greatly across search topics
○ In Cranfield-like settings (e.g., TREC), this difference is often ignored
THE PROBLEM
47. ○ What is the optimal metric value that a system can achieve?
! How to get the upper bound for each search topic?
! How does it affect the evaluation conclusions?
○ Variance of different topics
○ Normalization
RESEARCH QUESTIONS
normed = Σ_topic [ raw_score(topic, s) − lower_bound(topic) ] / [ upper_bound(topic) − lower_bound(topic) ]
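Per-topic normalization with topic-specific lower and upper bounds is a min-max rescaling; a small sketch with illustrative values:

```python
def normalized_score(raw_score, lower_bound, upper_bound):
    # Rescale one topic's raw metric score into [0, 1] using the
    # topic-specific worst-case (lower) and optimal (upper) values.
    return (raw_score - lower_bound) / (upper_bound - lower_bound)

def normed(per_topic):
    # Sum of per-topic normalized scores across all topics.
    return sum(normalized_score(r, lo, hi) for r, lo, hi in per_topic)

topics = [(0.30, 0.0, 0.60), (0.10, 0.0, 0.20)]  # (raw, lower, upper) per topic
print(normed(topics))  # -> 1.0  (0.5 + 0.5)
```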
48. ○ Session-DCG (sDCG)
! Järvelin et al. "Discounted cumulated gain based evaluation of multiple-query IR sessions." Advances in Information Retrieval (2008): 4-15.
○ Cube Test (CT)
! Luo et al. "The water filling model and the cube test: multi-dimensional evaluation for professional search." CIKM, 2013.
○ Expected Utility (EU)
! Yang and Lad. "Modeling expected utility of multi-session information distillation." ICTIR 2009.
DYNAMIC SEARCH METRICS
EU = Σ_σ P(σ) · Σ_{(i,j)∈σ} ( Σ_{a∈A_{i,j}} γ_a · nov(a, i, j−1) − β · cost(i, j) )

CT = [ Σ_{i=1}^{n} Σ_{j=1}^{|D_i|} Σ_a γ_a · rel(i, j) · nov(a, i, j−1) ] / [ Σ_{i=1}^{n} Σ_{j=1}^{|D_i|} cost(i, j) ]

sDCG = Σ_{i=1}^{n} Σ_{j=1}^{|D_i|} rel(i, j) / [ (1 + log_b j) · (1 + log_{bq} i) ]
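As a concrete sketch of session-level discounting, sDCG divides each document's relevance by a rank discount (position j within a query) and a query discount (position i within the session); the log bases b and bq and the relevance values below are illustrative:

```python
import math

def sdcg(session, b=2, bq=4):
    # session: list of queries; each query is a list of graded relevance
    # values ordered by rank. Discount by rank j within a query and by
    # query position i within the session.
    total = 0.0
    for i, rels in enumerate(session, start=1):
        for j, rel in enumerate(rels, start=1):
            total += rel / ((1 + math.log(j, b)) * (1 + math.log(i, bq)))
    return total

# Two queries in one session, with graded relevance per ranked document.
print(sdcg([[2, 1, 0], [1, 0, 1]]))
```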
49. ○ sDCG
○ Cube Test
○ Expected Utility
DECONSTRUCT THE METRICS
(Components: gain, cost, rank discount, novelty discount)

sDCG = Σ_{i=1}^{n} Σ_{j=1}^{|D_i|} rel(i, j) / [ (1 + log_b j) · (1 + log_{bq} i) ]

CT = [ Σ_{i=1}^{n} Σ_{j=1}^{|D_i|} Σ_a γ_a · rel(i, j) · nov(a, i, j−1) ] / [ Σ_{i=1}^{n} Σ_{j=1}^{|D_i|} cost(i, j) ]

EU = Σ_σ P(σ) · Σ_{(i,j)∈σ} ( Σ_{a∈A_{i,j}} γ_a · nov(a, i, j−1) − β · cost(i, j) )
53. ! The difference between the optimal values a metric produces for different topics is large and should not be ignored.
54. ○ Rearrangement Inequality
○ In IR, the Probability Ranking Principle [4]
! the overall effectiveness of an IR system is maximized by ranking documents in descending order of their usefulness
OUR SOLUTION
x_1 y_n + x_2 y_{n−1} + … + x_n y_1 ≤ x_{σ(1)} y_1 + x_{σ(2)} y_2 + … + x_{σ(n)} y_n ≤ x_1 y_1 + x_2 y_2 + … + x_n y_n
for x_1 ≤ x_2 ≤ … ≤ x_n and y_1 ≤ y_2 ≤ … ≤ y_n
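By the rearrangement inequality (and, in IR terms, the Probability Ranking Principle), a rank-discounted metric attains its per-topic upper bound when the available relevance values are sorted in descending order, pairing the largest gains with the smallest discounts. A hedged sketch for a DCG-style gain:

```python
import math

def dcg(rels, b=2):
    # Rank-discounted cumulated gain for one ranked list.
    return sum(rel / (1 + math.log(j, b)) for j, rel in enumerate(rels, start=1))

def upper_bound(rels):
    # Descending order pairs the largest gains with the top (least-discounted) ranks.
    return dcg(sorted(rels, reverse=True))

def lower_bound(rels):
    # Ascending order pairs the largest gains with the most-discounted ranks.
    return dcg(sorted(rels))

rels = [0, 2, 1]
print(upper_bound(rels) >= dcg(rels) >= lower_bound(rels))  # -> True
```

These topic-specific bounds are exactly what the normalization above needs.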
58. ! Using the bounds for normalization brings more fairness into evaluation
59. Conclusion
• Our main contributions:
• Put the user into the models
• Created a bridge between information seeking / user behavior studies and machine learning
• Yielded a family of new generative retrieval models for complex, dynamic settings
• Able to explain the results
60. A Few Thoughts
• Information seeking is a Markov Decision Process, not a series of independent searches
• User actions that cost more effort, such as query changes, are stronger signals than clicks
• Search is also a learning process for the user, who also evolves
• Users and search engines form a partnership to explore the information space
• They influence each other; it is a two-way communication
• Overly complex evaluation metrics might not be appropriate; the complexity should be modelled either in the model or in the metric, but not in both
61. Look into the Future
• Dynamic IR models are good for modeling information seeking
• There is a lot of room to study the user and search engine interaction in a generative way
• The thinking presented here could generate new methods not only in retrieval and evaluation, but also in related fields
• Exciting!!
62. Thank You!
• Email: huiyang@cs.georgetown.edu
• Group Page: InfoSense at http://infosense.cs.georgetown.edu/
• Dynamic IR Website: http://www.dynamic-ir-modeling.org/
• Book: Dynamic Information Retrieval Modeling
• TREC Dynamic Domain Track: http://trec-dd.org/