Similar Documents Search Track
RIRES: Russian Information Retrieval Evaluation Seminar



The purpose of this track is to evaluate document retrieval methods that use feedback from the user. In the context of this track, we evaluate methods for retrieving documents similar to a given sample document.

This track follows the standard evaluation procedure.

Test Collection

The source dataset is the union of the Narod.ru and legal documents (2004) collections.

All documents located in the archives narod.*, legal.* and *_training.* are included in the dataset.

Task Description for Participating Systems

Each participant is granted access to the collections and a set of tasks. Each task is a (query, relevant document) pair. The tasks are based on the set of queries used in the previous ROMIP workshops (2004-2006). Evaluation uses a strong relevance criterion: a document is considered relevant only if all assessors mark it as relevant.
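The strong relevance criterion can be sketched as follows. The judgment table, document names, and binary verdicts are hypothetical, assuming each assessor gives a yes/no judgment per document:

```python
# Hypothetical per-assessor binary judgments: doc -> one verdict per assessor
judgments = {
    "doc1": [True, True, True],   # unanimously relevant
    "doc2": [True, False, True],  # one dissenting assessor -> not relevant
}

# Strong relevance: a document counts as relevant only if ALL assessors agree
relevant = {doc for doc, votes in judgments.items() if all(votes)}
print(relevant)  # {'doc1'}
```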

The expected result is a ranked list of document URLs, with at most 100 entries per (query, relevant document) pair.
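A minimal sketch of producing such a list, assuming a system assigns each candidate URL a similarity score to the sample document (the function name, score format, and URLs are illustrative, not prescribed by the track):

```python
def make_run(scored_urls, max_size=100):
    """Rank candidate URLs by descending similarity score and truncate
    to the track's limit of 100 entries per (query, relevant document) pair."""
    ranked = sorted(scored_urls, key=lambda item: item[1], reverse=True)
    return [url for url, _score in ranked[:max_size]]

# Illustrative candidates: (url, similarity score)
candidates = [("http://example.org/a", 0.31),
              ("http://example.org/b", 0.87),
              ("http://example.org/c", 0.55)]
print(make_run(candidates))  # b, c, a in descending score order
```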

Evaluation Methodology

  • instructions for assessors:
    assessors evaluate the relevance of a document to a query (phrase) based on an extended description of the user's information need, without seeing the sample document relevant to the query.
  • evaluation method: pooling (pool depth is 50)
  • relevance scale:
    • yes / probably yes / perhaps yes / no / impossible to evaluate
    • yes / no / impossible to evaluate
  • official metrics:
    • precision
    • recall
    • TREC 11-point precision/recall graph
    • bpref

Data Formats