Authors
Résumé
This paper evaluates and compares the effectiveness of vector-space, probabilistic, and language models for retrieving news articles written in French. The evaluation is based on a corpus built during three CLEF evaluation campaigns and comprising 151 topics, from which we analyze the retrieval effectiveness of these approaches and examine the poor results obtained on hard topics.
Abstract
This paper describes and evaluates the vector-space, probabilistic, and language IR models used to retrieve news articles from a corpus written in French. Based on three CLEF test collections and 151 topics, we analyze the retrieval effectiveness of these approaches and examine the poor retrieval results obtained on hard topics. An appropriate robust evaluation is not straightforward because both the mean average precision (MAP) and the geometric mean average precision (GMAP) present drawbacks. To obtain a better picture, we suggest using the First Relevant Score (FRS), based on the rank of the first relevant item. We evaluate and compare these three measures, in particular when a blind query expansion technique is applied.

KEY WORDS: Robust evaluation; blind query expansion; hard queries.
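To make the contrast between the arithmetic mean (MAP) and the geometric mean (GMAP) concrete, the following minimal sketch computes both over two toy topics, one easy and one hard. The abstract does not give the exact FRS formula, so the sketch simply reports the reciprocal of the rank of the first relevant item as an illustrative stand-in; all topic names and relevance values are invented for the example.

```python
# Illustrative sketch (not the authors' implementation): MAP vs. GMAP and a
# first-relevant-rank measure on toy per-topic rankings.
from math import exp, log

def average_precision(relevance):
    """relevance: 0/1 judgements in ranked order for one topic.
    Assumes every relevant document appears in the ranking."""
    hits, precisions = 0, []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0

def first_relevant_rank(relevance):
    """Rank of the first relevant document, or None if there is none."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return rank
    return None

# Toy run: one easy topic and one hard topic (hypothetical data).
runs = {
    "easy topic": [1, 1, 0, 1, 0],
    "hard topic": [0, 0, 0, 0, 1],
}
aps = [average_precision(r) for r in runs.values()]

map_score = sum(aps) / len(aps)                # arithmetic mean of AP values
eps = 1e-5                                     # smoothing to avoid log(0)
gmap_score = exp(sum(log(ap + eps) for ap in aps) / len(aps))  # geometric mean
frs_like = {name: 1.0 / first_relevant_rank(r)
            for name, r in runs.items() if first_relevant_rank(r)}

print(f"MAP  = {map_score:.3f}")   # dominated by the easy topic
print(f"GMAP = {gmap_score:.3f}")  # pulled down by the hard topic
print(f"reciprocal rank of first relevant item: {frs_like}")
```

On this toy data the hard topic barely lowers the MAP but cuts the GMAP roughly in half, which is the drawback-and-benefit trade-off the abstract alludes to: GMAP emphasizes poorly served topics, while MAP can hide them.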