ARIA

Association Francophone de Recherche d’Information (RI) et Applications

Proceedings of CORIA 2004

Authors

Cyril Goutte, Pavel B. Dobrokhotov, Éric Gaussier, Anne-Lise Veuthey

Abstract (translated from the French)

The work we present here aims to compare selection methods

Abstract

In this contribution, we review a number of approaches to feature selection, divided into two broad classes. Some are corpus-based, i.e., they use only the data to assess the relevance of each feature, and aim at identifying a small subset of relevant features on which to train categorisation models. Others are model-based, i.e., they assess the relevance of each feature on the basis of the model used for categorisation. This second class of measures allows a better understanding of the model's decisions. Furthermore, comparing the two classes provides insight into whether corpus-based feature extraction is selective enough and does not overgenerate compared to model-based selection. Our experimental comparison is mainly based on a collection of medical abstracts provided by the Swiss Institute of Bioinformatics.
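The distinction between the two classes can be illustrated with a minimal sketch. The abstract does not name the measures that were compared, so everything below is an assumption chosen for illustration: information gain stands in for a corpus-based measure, the absolute log-odds of a smoothed multinomial naive Bayes model stands in for a model-based measure, and the six-document corpus is invented toy data, not the medical abstracts used in the paper.

```python
import math
from collections import Counter

# Toy labelled corpus (hypothetical, for illustration only): 1 = relevant, 0 = not.
DOCS = [
    ("protein mutation disease", 1),
    ("gene mutation variant", 1),
    ("protein variant disease", 1),
    ("protein structure fold", 0),
    ("gene expression structure", 0),
    ("protein expression fold", 0),
]


def information_gain(docs):
    """Corpus-based score: mutual information (in bits) between the
    presence of a term in a document and the document's class."""
    n = len(docs)
    vocab = {w for text, _ in docs for w in text.split()}
    scores = {}
    for w in vocab:
        # Joint counts over (term present?, class label).
        joint = Counter((w in text.split(), label) for text, label in docs)
        mi = 0.0
        for (present, label), count in joint.items():
            p_xy = count / n
            p_x = sum(c for (p, _), c in joint.items() if p == present) / n
            p_y = sum(c for (_, l), c in joint.items() if l == label) / n
            mi += p_xy * math.log2(p_xy / (p_x * p_y))
        scores[w] = mi
    return scores


def naive_bayes_log_odds(docs, alpha=1.0):
    """Model-based score: |log P(w|1) - log P(w|0)| read off the parameters
    of a multinomial naive Bayes model with additive smoothing alpha."""
    counts = {0: Counter(), 1: Counter()}
    for text, label in docs:
        counts[label].update(text.split())
    vocab = set(counts[0]) | set(counts[1])
    totals = {c: sum(counts[c].values()) + alpha * len(vocab) for c in counts}
    return {
        w: abs(
            math.log((counts[1][w] + alpha) / totals[1])
            - math.log((counts[0][w] + alpha) / totals[0])
        )
        for w in vocab
    }


def top_k(scores, k):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


if __name__ == "__main__":
    print("corpus-based:", top_k(information_gain(DOCS), 6))
    print("model-based: ", top_k(naive_bayes_log_odds(DOCS), 6))
```

On this toy corpus both scores give zero relevance to terms that occur equally often in the two classes, so the two selections coincide. On real data the selections generally differ, and that difference is what the comparison described in the abstract sets out to examine.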

About

ARIA (Association Francophone de Recherche d’Information (RI) et Applications) is a learned society, a non-profit association under the French law of 1901, whose aim is to promote knowledge in the field of Information Retrieval (IR) and in the various scientific fields involved in the design, implementation, and evaluation of Information Retrieval systems.