Regroupement de relations pour l'extraction d'information non supervisée (Relation Clustering for Unsupervised Information Extraction)

Proceedings of CORIA 2012

Authors

Wei Wang, Romaric Besançon, Olivier Ferret, Brigitte Grau

Summary

In a monitoring context, unsupervised information extraction aims to extract…

Abstract

The purpose of unsupervised information extraction is to extract information from text without fixing the type of information in advance. Our work concentrates on extracting and characterizing new relations between given entity types. In this article, we first propose a filtering procedure that removes false relation candidates by combining heuristics and machine learning models; the best results reach an F-measure of 77.1%. Similar relations are then grouped semantically using Markov Clustering and the All Pairs Similarity Search algorithm, which efficiently identifies similar candidates at large scale. Finally, evaluations of the clustering results, using both internal and external measures, show that integrating the filtering step doubles recall while keeping the same precision.
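
To make the clustering step more concrete, here is a minimal sketch of Markov Clustering in plain Python. It is not the authors' implementation: the function names, the `inflation` and `tol` parameters, the unit self-loops, and the toy similarity matrix are all assumptions chosen for illustration. In the paper's setting, the pairwise similarities would come from the All Pairs Similarity Search step rather than being written by hand.

```python
def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_clustering(similarity, inflation=2.0, iterations=50, tol=1e-9):
    """Cluster items from a symmetric similarity matrix (hypothetical helper).

    Alternates expansion (matrix squaring) and inflation (element-wise power
    followed by column normalization) until the matrix stops changing.
    """
    n = len(similarity)
    # Assumption: add self-loops of weight 1.0 on the diagonal.
    m = [[1.0 if i == j else similarity[i][j] for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        expanded = matmul(m, m)                      # expansion step
        inflated = normalize_columns(                # inflation step
            [[x ** inflation for x in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row with non-negligible entries describes one cluster:
    # the columns holding those entries are the cluster members.
    clusters = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy input (made up): candidates 0-2 are mutually similar, as are 3-4.
toy_similarity = [
    [0.0, 0.9, 0.9, 0.0, 0.0],
    [0.9, 0.0, 0.9, 0.0, 0.0],
    [0.9, 0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.8],
    [0.0, 0.0, 0.0, 0.8, 0.0],
]
toy_clusters = markov_clustering(toy_similarity)
```

The dense matrices used here only suit small examples; at the scale the abstract mentions, a sparse representation that keeps only pairs above a similarity threshold would be the natural choice.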
