Sentence filtering for information extraction in genomics, a classification problem
2001
Nédellec, Claire (INRA (France). UR 1077 Unité de recherche Mathématique, informatique et génome) | Ould Abdel Vetah, Mohamed (Centre National de la Recherche Scientifique / ValiGen SA, Orsay / La Défense (France). Université Paris 11, UMR8623 Laboratoire de Recherche en Informatique (LRI)) | Bessières, Philippe (INRA (France). UR 1077 Unité de recherche Mathématique, informatique et génome)
In some domains, Information Extraction (IE) from texts requires syntactic and semantic parsing. This analysis is computationally expensive, and IE is potentially noisy if it is applied to the whole set of documents when the relevant information is sparse. A preprocessing phase that selects the potentially relevant fragments increases the efficiency of the IE process. This phase has to be fast and based on a shallow description of the texts. We applied various classification methods (IVI, a Naive Bayes learner, and C4.5) to this fragment filtering task in the domain of functional genomics. This paper describes the results of this study. We show that the IVI and Naive Bayes methods with feature selection give the best results, compared both with the same methods without feature selection and with C4.5.
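As a rough illustration of the kind of filter the abstract describes, the sketch below trains a small Naive Bayes sentence classifier on a bag-of-words description after a feature selection step. It is not the authors' implementation: the selection criterion (difference in per-class document frequency), the function names, the value of `k`, and the toy sentences with placeholder gene names are all assumptions made for the example. IVI and C4.5 are not shown.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Shallow description of the text: lowercase bag of words, no parsing.
    return sentence.lower().split()


def select_features(sentences, labels, k):
    # Assumed criterion (not from the paper): keep the k words whose
    # document frequency differs most between the two classes.
    # Requires at least one sentence in each class.
    pos, neg = Counter(), Counter()
    for s, y in zip(sentences, labels):
        (pos if y else neg).update(set(tokenize(s)))
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos

    def score(w):
        return abs(pos[w] / n_pos - neg[w] / n_neg)

    vocab = set(pos) | set(neg)
    # Ties are broken alphabetically so the result is deterministic.
    return set(sorted(vocab, key=lambda w: (-score(w), w))[:k])


def train(sentences, labels, features):
    # Count selected words per class and store log priors.
    counts = {True: Counter(), False: Counter()}
    for s, y in zip(sentences, labels):
        counts[bool(y)].update(w for w in tokenize(s) if w in features)
    priors = {
        c: math.log(sum(1 for y in labels if bool(y) == c) / len(labels))
        for c in (True, False)
    }
    return counts, priors, features


def is_relevant(model, sentence):
    # Multinomial Naive Bayes decision with Laplace (add-one) smoothing.
    counts, priors, features = model
    scores = {}
    for c in (True, False):
        total = sum(counts[c].values()) + len(features)
        scores[c] = priors[c] + sum(
            math.log((counts[c][w] + 1) / total)
            for w in tokenize(sentence)
            if w in features
        )
    return scores[True] > scores[False]


# Hypothetical toy data: True marks a sentence worth passing on to IE.
sentences = [
    "geneA activates transcription of geneB",
    "geneC represses transcription of geneA",
    "geneB activates expression of geneD",
    "cells were grown in rich medium",
    "samples were stored overnight",
    "cells were grown overnight",
]
labels = [True, True, True, False, False, False]

features = select_features(sentences, labels, k=6)
model = train(sentences, labels, features)

print(is_relevant(model, "geneX activates transcription of geneY"))
print(is_relevant(model, "cells were harvested"))
```

Only sentences for which `is_relevant` returns true would be handed to the expensive syntactic and semantic parsing stage. A sentence containing none of the selected words is rejected here, which is one possible design choice among several.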
Bibliographic information
This bibliographic record has been provided by Institut national de la recherche agronomique