This talk describes the Cube Test, an IR evaluation metric that we created to measure system effectiveness over the entire information-seeking process. It models a cap that stops gain (task completion), multiple subtopics, time, subtopic importance, subtopic relatedness, and volume of gain.
The Water Filling Model and The Cube Test: Multi-Dimensional Evaluation for Professional Search (CIKM 2013)
1. THE WATER FILLING MODEL AND THE CUBE TEST: Multi-Dimensional Evaluation for Professional Search
Jiyun Luo (1), Christopher Wing (1), Grace Hui Yang (1), Marti A. Hearst (2)
(1) Department of Computer Science, Georgetown University, Washington, DC, USA
{jl1749, cpw26}@georgetown.edu, huiyang@cs.georgetown.edu
(2) School of Information, University of California, Berkeley, Berkeley, CA, USA
hearst@berkeley.edu
CIKM 2013
2. INTRODUCTION
• Complicated search has recently received much attention
• Professional search activities are usually complicated search tasks
  - Examples: medical record search, legal search, patent prior art search
• Evaluation metrics need to reflect this complexity
  - U-measure for whole-session evaluation [Sakai et al., SIGIR'13]
  - Time-based gain [Smucker and Clarke, SIGIR'12]
  - α-nDCG for diversity and novelty [Clarke et al., SIGIR'08]
  - PRES for recall-oriented search tasks [Magdy and Jones, SIGIR'10]
3. PROFESSIONAL SEARCH
• Rich information needs
  - Multiple aspects or subtopics
• Time-sensitive
  - It is not true that professional searchers, e.g., lawyers, are evil and want to read irrelevant documents because they are paid by the hour and care only about recall
• Novelty
  - Once one relevant document has been examined, subsequent relevant documents are perceived as less relevant
• Stopping criteria
  - Once a sub-information-need has been fulfilled, further relevant documents about it contribute little
• A mix of unranked and ranked retrieval
  - Boolean search and proximity search are still popular
4. Patent Prior Art Search
Example: Fenestration Segment Stent-Graft and Fenestration Method (US 20090259290 A1)
ABSTRACT
A method includes deploying a fenestration segment stent-graft into a main vessel such that a fenestration section …
Claims (labeled on the slide as Independent or Dependent)
  1. A fenestration segment stent-graft comprising: a proximal section comprising a woven graft cloth; …
  2. The fenestration segment stent-graft of claim 1 wherein said proximal section comprises a proximal end and a distal end, …
  3. The fenestration segment stent-graft of claim 2 wherein said attachment means comprises stitching.
  …
  20. A fenestration segment stent-graft comprising: a proximal section; a distal section; …
  21. The fenestration segment stent-graft of claim 20 wherein said fenestration section comprises: graft material comprising loose woven fibers…
Looking for published literature that can be used to "say no" to a patent application. A granted patent should be novel and non-trivial.
• Time constraint: less than 6 hours
5. We draw an analogy between Professional Search and Filling Water into a Cube
Professional Search
• Information need with multiple subtopics
• Goal: fulfill the information need with relevant documents as soon as possible
• A document can cover different subtopics
• Stop finding more relevant documents for a subtopic or for the entire information need
The Water-filling Model
• A cube with multiple segments
• Goal: fill up the cube with water as soon as possible
• "Document water" can flow into different segments
• Once a segment reaches its cap, no more water can go there
How do we judge whether a search system is good?
• We assume the searcher wants the multiple subtopics of a task to be fulfilled as quickly and as fully as possible
6. The Task Cube
• The Cube, with unit side length, represents the entire information need
• Each cuboid in the Cube represents a subtopic
• The top of the Cube is the cap that limits the maximum amount of relevant information needed
  - This is the stopping criterion
• The bottom is segmented into different areas
• The size of each area indicates the importance of its subtopic
  - E.g., in prior art search, independent claims are assigned more weight than dependent claims
Figure: An empty task cube for a search task with 6 subtopics
7. The Water Filling Model
• A newly examined relevant document raises the water level in every subtopic it is relevant to
• The height increment is the relevance gain from that document with regard to that subtopic
• The total height of the water in one cuboid represents the accumulated relevance gain for a subtopic
• The total volume of water in the task Cube is the total Gain
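The bookkeeping behind the water-filling picture can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Cuboid` class, the `pour` and `volume` helpers, and the example areas are all hypothetical names and values, and clipping the water level at the cap is a simplification chosen here, not necessarily how the metric itself treats the cap.

```python
from dataclasses import dataclass

MAX_HEIGHT = 1.0  # top of the unit cube, i.e. the cap (assumed value)


@dataclass
class Cuboid:
    """One subtopic: bottom area is its importance, height is accumulated gain."""
    area: float
    height: float = 0.0


def pour(cube, increments):
    """Add one document's water; increments maps subtopic index -> height increment."""
    for i, inc in increments.items():
        # Simplification: clip at the cap so a cuboid never overflows.
        cube[i].height = min(MAX_HEIGHT, cube[i].height + inc)


def volume(cube):
    """Total gain = sum over subtopics of bottom area times water height."""
    return sum(c.area * c.height for c in cube)


# Three hypothetical subtopics whose areas (importances) sum to 1.
cube = [Cuboid(0.5), Cuboid(0.25), Cuboid(0.25)]
pour(cube, {0: 0.5, 2: 1.0})   # first document covers subtopics 0 and 2
pour(cube, {0: 0.75})          # second document only helps subtopic 0, up to the cap
print(volume(cube))
```

The second document contributes less than its nominal height because subtopic 0 is already half full, which is the stopping behaviour the cap is meant to capture.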
8. The Cube Test
• Based on the water-filling model, we design a new multi-dimensional evaluation metric for professional search: the Cube Test (CT)
• CT measures how fast a search system can fill up the task cube, and how fully
• It is a speed function
9. The Gain Function
$\mathrm{Gain}(Q, d_j) = \sum_i \mathrm{area}_i \times \mathrm{height}_{i,j} \times \mathrm{KeepFilling}_i$
• Document $d_j$'s gain is calculated as the volume of relevant "document water" that it adds across all matching subtopics in the task cube.
• A more concrete equation:
where
  - $\Gamma$ is a discounting factor for subtopic novelty, $\Gamma = \gamma^{\mathrm{nrel}(c_i, j-1)}$, where $\mathrm{nrel}(c_i, j-1)$ is the number of relevant documents for subtopic $c_i$ among the previously examined documents ($d_1$ to $d_{j-1}$),
  - $\theta_i$ is the importance of the $i$-th subtopic, with $\sum_i \theta_i = 1$,
  - $\mathrm{rel}(d_j, c_i)$ is the water height, i.e., document $d_j$'s relevance grade towards subtopic $c_i$,
  - $\mathbb{I}$ is the indicator function,
  - MaxHeight is the cap for subtopic relevance (set to 1).
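Read together, the definitions above suggest the following form for the concrete equation. This is a reconstruction from those definitions rather than a copy of the slide: it assumes that $\mathrm{area}_i$ corresponds to $\theta_i$, that $\mathrm{height}_{i,j}$ corresponds to the discounted grade $\Gamma \cdot \mathrm{rel}(d_j, c_i)$, and that $\mathrm{KeepFilling}_i$ is an indicator that stays 1 while the relevance already accumulated for $c_i$ is below MaxHeight. The exact argument of the indicator in particular is an assumption.

```latex
\mathrm{Gain}(Q, d_j) \;=\; \sum_{i} \Gamma \,\theta_i \,\mathrm{rel}(d_j, c_i)\;
  \mathbb{I}\!\left( \sum_{k=1}^{j-1} \mathrm{rel}(d_k, c_i) < \mathit{MaxHeight} \right),
\qquad
\Gamma = \gamma^{\,\mathrm{nrel}(c_i,\, j-1)}
```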
10. The Total Gain Function
• Total Gain for a list of documents that have been examined
• Note that it does not assume any traversal order
• It does not even assume ranked retrieval
• This allows us to support ranked retrieval, unranked retrieval, or a mix of the two
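A minimal Python sketch of per-document gain and total gain, using the quantities named on the Gain Function slide (novelty discount base γ, subtopic importance θ, relevance grade rel, and the MaxHeight cap). The function name `total_gain`, the input layout, and the exact cap test (a subtopic keeps filling while the relevance accumulated from earlier documents is below the cap) are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def total_gain(docs, theta, gamma=0.5, max_height=1.0):
    """Return (total gain, per-document gains) for documents in examination order.

    docs  : list of dicts mapping subtopic -> relevance grade in [0, 1]
    theta : dict mapping subtopic -> importance (values sum to 1)
    gamma : novelty discount base
    """
    nrel = defaultdict(int)       # relevant documents seen so far, per subtopic
    height = defaultdict(float)   # relevance accumulated so far, per subtopic
    gains = []
    for rels in docs:
        gain = 0.0
        for subtopic, rel in rels.items():
            if rel <= 0:
                continue
            # Assumed cap test: only subtopics still below the cap keep filling.
            if height[subtopic] < max_height:
                gain += (gamma ** nrel[subtopic]) * theta[subtopic] * rel
            # Update after scoring so the discount reflects earlier documents only.
            nrel[subtopic] += 1
            height[subtopic] += rel
        gains.append(gain)
    return sum(gains), gains


theta = {"a": 0.5, "b": 0.5}
examined = [{"a": 1.0}, {"a": 1.0, "b": 0.5}, {"b": 0.5}]
total, per_doc = total_gain(examined, theta, gamma=0.5)
print(per_doc, total)
```

The function only needs the documents in the order they were examined, so the same code applies whether that order came from a ranked list, a Boolean result set, or a mixture.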
11. The Cube Test - Recap
• It is a speed function
• The time function is the amount of time taken from the beginning up to the $t$-th document. It can be:
  - the actual reading time
  - a formulation similar to TBG [Smucker & Clarke, SIGIR'12] that takes document length into account: $\sum_{j=1}^{t} \left( 4.4 + r_j \times (0.018\, l_j + 7.8) \right)$
  - or simply the number of documents examined so far
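The time options above, and the idea of CT as a speed, can be sketched as follows. Several things here are assumptions: that CT is total gain divided by elapsed time (the slides only call it a speed function), that $r_j$ is a per-document weight such as the probability that document $j$ is read in full, and that $l_j$ is the document length. The names `tbg_style_time` and `cube_test` are hypothetical.

```python
def tbg_style_time(docs):
    """Estimated time spent on the examined documents.

    docs is a list of (r, length) pairs, one per examined document:
    r is the assumed per-document weight, length the document length.
    """
    return sum(4.4 + r * (0.018 * length + 7.8) for r, length in docs)


def cube_test(total_gain, elapsed):
    """Assumed form of CT: gain accumulated per unit of time."""
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return total_gain / elapsed


examined = [(1.0, 1000), (0.0, 500)]
print(cube_test(0.875, tbg_style_time(examined)))  # time from the TBG-style formula
print(cube_test(0.875, len(examined)))             # time as number of documents examined
```

Swapping the time function changes the unit of the score but not the structure of the metric, which is why the slide lists the three options as interchangeable.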
12. EXPERIMENTS
Datasets
USPTO
• It consists of three million US patent applications and publications from 2001 to 2013, in XML with images removed.
• We created 33 runs for 49 prior art finding tasks.
• Office actions written by US Patent Examiners are parsed and the ground truth is extracted automatically from them (PublicPair).
CLEF-IP 2012
• XML patent documents from the European Patent Office (EPO) prior to 2002 and 400,000+ documents published by the World Intellectual Property Organization (WIPO).
• We evaluate the 31 official runs from the 5 teams that participated in CLEF-IP 2012.
13. Discriminative Power
• We compare the new metric with a few well-known metrics:
  - Recall
  - I-rec [Sakai et al., EVIA'10]
  - nDCG
  - α-nDCG [Clarke et al., SIGIR'08]
  - PRES [Magdy and Jones, SIGIR'10]
  - MAP
  - TBG [Smucker & Clarke, SIGIR'12]
  - nERR-IA [Sakai & Song, SIGIR'11]
• We evaluate the evaluation metrics by their discriminative power [Sakai, SIGIR'06]
• We test a few variations of CT
• On the CLEF-IP dataset, all CT metrics show high discriminative power.
• On the USPTO dataset, Recall and I-rec show the best discriminative power. CT metrics show good discriminative power.
14. Tradeoff between coverage and single relevance
• CT is able to adjust its bias between recall-oriented tasks and precision-oriented tasks
• We create two artificial runs:
  - Coverage run: it arranges the relevant documents for each subtopic in a round-robin fashion.
  - Single relevance run: it puts all relevant documents for one subtopic first, ordered by $\mathrm{rel}(d, c_i)$, then those for the next subtopic.
Figures: CT vs. γ for the coverage run; CT vs. γ for the single relevance run
The novelty discount base γ ranges over [0.1, 0.9].
• When γ is small, CT applies a large novelty discount, is biased towards coverage, and rewards runs that spread relevant documents across different subtopics.
• When γ is large, CT is biased towards precision and rewards runs that produce highly relevant documents early.
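The two artificial runs can be sketched in Python as below. The slide does not say how documents are ordered within a subtopic for the coverage run, or how a document relevant to several subtopics is handled, so sorting by grade inside each subtopic and keeping only a document's first occurrence are assumptions made here. The function names and the toy judgments are hypothetical.

```python
from itertools import zip_longest


def _by_grade(docs):
    """Document ids for one subtopic, highest relevance grade first."""
    return sorted(docs, key=docs.get, reverse=True)


def _dedupe(doc_ids):
    """Keep the first occurrence of each document id, preserving order."""
    seen, run = set(), []
    for doc_id in doc_ids:
        if doc_id is not None and doc_id not in seen:
            seen.add(doc_id)
            run.append(doc_id)
    return run


def single_relevance_run(rel_by_subtopic):
    """All relevant documents of one subtopic, then those of the next subtopic."""
    return _dedupe(d for docs in rel_by_subtopic.values() for d in _by_grade(docs))


def coverage_run(rel_by_subtopic):
    """Round-robin over subtopics: best of each subtopic, then second best, and so on."""
    tiers = zip_longest(*(_by_grade(docs) for docs in rel_by_subtopic.values()))
    return _dedupe(d for tier in tiers for d in tier)


# Toy judgments: subtopic -> {document id: relevance grade}
judgments = {
    "c1": {"d1": 1.0, "d2": 0.5},
    "c2": {"d3": 1.0, "d4": 0.25},
}
print(single_relevance_run(judgments))
print(coverage_run(judgments))
```

Scoring both orderings with the same gain function while sweeping γ is one way to reproduce the qualitative comparison the slide describes.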
15. Conclusions
• This paper presents a novel evaluation metric (the Cube Test), based on a novel utility model (the water filling model)
• It addresses several important dimensions in professional search, and in complicated search in general:
  - Covers different aspects or subtopics
  - Subtopics need not be equally important
  - Allows a single document to cover several subtopics
  - Is time-sensitive
  - Handles the stopping criterion: adding more relevant documents to a subtopic that has reached its cap will not improve the overall gain
• Expresses the tradeoff between time, quality of documents, and diverse coverage of subtopics
Acknowledgments: Portions of this work were conducted to explore new concepts under the umbrella of a larger project at the US Patent and Trademark Office.
16. THANK YOU
Jiyun Luo (1), Christopher Wing (1), Hui Yang (1), Marti A. Hearst (2)
(1) Department of Computer Science, Georgetown University, Washington, DC, USA
{jl1749, cpw26}@georgetown.edu, huiyang@cs.georgetown.edu
(2) School of Information, University of California, Berkeley, Berkeley, CA, USA
hearst@berkeley.edu