Building Terrier by Open Collaboration
Craig Macdonald
University of Glasgow
craigm@dcs.gla.ac.uk
Motivations
• Why should we have IR platforms?
– They facilitate research, by bringing researchers close
to the state‐of‐the‐art
• Why should we have open source IR platforms?
– They facilitate greater research potential
• Keep in contact with the commercial companies
– They provide visibility
– By contributing towards open source platforms, we
can all reap the benefits
History of Terrier
• 2001‐2004: Terrier started as an EPSRC (British
research council) project: Iadh Ounis, Gianni
Amati, Ben He & Vassilis Plachouras formed the
platform. I joined later.
• 2005: v1.0.0 core released by the Univ. of
Glasgow as open source under the MPL
license, followed later by v1.0.1 and v1.0.2
• 2007: v1.1.0, v1.1.1 released
• 2008: v2.0, v2.1, v2.2 released
Current Terrier Core
• A framework for doing IR experimentation and
building IR apps
• Supports indexing commonly used IR research
collections (e.g. TREC). Indexing options are
direct indexing, single‐pass indexing & Map
Reduce‐based indexing
• Standard retrieval facilities, with many
weighting models, including Query Expansion
• Sample desktop search application
Popularity
• 5000 downloads since 2005
• Growing academic usage
– FIRE: most popular platform
– CLEF: widely used platform
– TREC: for 2008, one of the most popular platforms
Working with users
• The discussion forum provides a place for
users to ask questions about how to use the
platform
– 650 posts from August 2005 to Feb 2009
• Allows users to help each other
Working with developers
• A central benefit of open source is that others
can use and improve that software, to the
benefit of others
• We are making changes to make it easier for
patches to be accepted to Terrier
– Issue tracking [In place]
– Open source code repository [In progress]
Issue Tracking
[Screenshot: issue tracker listing with columns Key, Summary, Resolution]
• Allows people to collaborate on making
changes to Terrier
• Using the issue tracker, proposed changes can
be discussed, contributed code patches can be
attached and reviewed
Contributing to Terrier
1. File an issue about the proposed changes
2. Propose a patch
3. Some discussion may ensue
4. Once the patch is deemed acceptable, it will
be committed by a committer
5. The next version of Terrier will include the
contributed code
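The five steps above can be read as a simple sequence of stages. The following is a minimal sketch of that reading, not part of Terrier: the enum and its stage names are invented labels for the steps on this slide, and treating the discussion step as always present is a simplification.

```java
import java.util.Optional;

// Illustrative only: stage names are invented labels for the five steps,
// not identifiers from Terrier or its issue tracker.
enum ContributionStage {
    ISSUE_FILED, PATCH_PROPOSED, UNDER_DISCUSSION, COMMITTED, RELEASED;

    /** The stage that follows this one, or empty once the code has been released. */
    Optional<ContributionStage> next() {
        int i = ordinal() + 1;
        return i < values().length ? Optional.of(values()[i]) : Optional.empty();
    }

    public static void main(String[] args) {
        // Walk a contribution from the first stage to the last.
        Optional<ContributionStage> stage = Optional.of(ISSUE_FILED);
        while (stage.isPresent()) {
            System.out.println(stage.get());
            stage = stage.get().next();
        }
    }
}
```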
Filing an Issue
• One sentence about the issue
• What version of Terrier has this issue?
• More details about the problem: enough information to
reproduce the problem
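As a rough illustration of the three pieces of information asked for above, here is a small sketch of an issue report with a completeness check. The class, its field names, and the example values are hypothetical; they are not the issue tracker's actual fields.

```java
// Hypothetical model of an issue report; field names are illustrative and
// are not taken from Terrier's issue tracker.
final class IssueReport {
    final String summary;        // one sentence about the issue
    final String terrierVersion; // the Terrier version that shows the issue
    final String details;        // enough information to reproduce the problem

    IssueReport(String summary, String terrierVersion, String details) {
        this.summary = summary.trim();
        this.terrierVersion = terrierVersion.trim();
        this.details = details.trim();
    }

    /** True when none of the three parts was left blank. */
    boolean isComplete() {
        return !summary.isEmpty() && !terrierVersion.isEmpty() && !details.isEmpty();
    }

    public static void main(String[] args) {
        // Made-up example values, for illustration only.
        IssueReport report = new IssueReport(
                "Example: indexing fails on an empty document",
                "2.2",
                "Steps to reproduce, the command that was run, and the full error output.");
        System.out.println("Ready to file: " + report.isComplete());
    }
}
```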
Acceptable Patches
• Fits with Terrier’s style
– E.g. comments, javadoc, documentation
– Reuses existing code
• Can be cleanly applied (up to date with
current code)
• Does not break existing functionality
– All test cases pass
• Is agreeable to the committer
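A contributor could treat the criteria above as a checklist before submitting. The sketch below shows one way to do that; the enum and its constant names are invented for illustration, and the real decision rests with the committer rather than with any automated check.

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative checklist; the constant names paraphrase the slide and are
// not identifiers from Terrier.
enum PatchCriterion {
    FITS_STYLE, APPLIES_CLEANLY, TESTS_PASS, COMMITTER_AGREES;

    /** Criteria from the full list that the patch has not met yet. */
    static Set<PatchCriterion> missing(Set<PatchCriterion> met) {
        EnumSet<PatchCriterion> rest = EnumSet.allOf(PatchCriterion.class);
        rest.removeAll(met);
        return rest;
    }

    public static void main(String[] args) {
        // A patch that follows the style and passes the tests,
        // but has not yet been rebased or reviewed.
        Set<PatchCriterion> met = EnumSet.of(FITS_STYLE, TESTS_PASS);
        System.out.println("Still needed: " + missing(met));
    }
}
```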
Open Source Code Repository
• We work on Terrier for months before releases
• However, patches need to be up‐to‐date
• In progress: open up Terrier core source
repository
• Patches can then be kept in sync with the current source
– Easier to commit and accept!
Our limitations
• Our aim is to do research
• The platform facilitates that. By having the core
platform open source we help others do research
also
• Open source works best when others can
contribute back as well
• We need your help to build a community – we
can’t spend all year developing code to release,
and helping users
– We can all take a share of the development and of
assisting users on the forum
Conclusions
• Motivations for open source IR platform
• Engaging with users
http://ir.dcs.gla.ac.uk/terrier/forum/
• Engaging with developers
http://ir.dcs.gla.ac.uk/terrier/issues/
• Filing issues and contributing patches