ROMIP Test Collections
RIRES: Russian Information Retrieval Evaluation Seminar

We prepared the following collections for evaluating the participating systems:
  • Narod.ru Web collection
    A pseudorandom selection of web sites from the narod.ru domain (narod.ru is a national free hosting provider in Russia). The collection consists of 728 000 documents.

  • KM.ru Web collection 2007 (NEW)
    The KM.ru collection is a copy of the www.km.ru multiportal. It consists of about 3 000 000 documents.

  • BY.web collection 2007 (NEW)
    A subset of pages from the .by domain that were present in the Yandex index in May 2007.

  • DMOZ Web collection
    A collection based on the Russian-language section of the dmoz.org catalog. It is used as a training set in the Web site classification and Web page classification tracks.

  • Legal documents collection 2004
    A collection of Russian Federation legislative documents built in 2004. It consists of 61 000 documents.

  • Legal documents collection 2007 (NEW)
    A collection of Russian Federation legislative documents built in 2007. It consists of 300 000 documents.

  • News collection
    A set of news reports from 25 different sources covering three non-overlapping time intervals. The size of this collection is about 31 500 documents.