RUA
(10,025 resources)
This site is an institutional repository intended to facilitate access to the publications produced by members of the university. The site interface is available partly in Spanish and partly in English. An RSS feed is available so that users can keep up to date with newly added materials. Site help is available in Spanish only.
Showing resources 1 - 20 of 55
1.
An algorithm for anaphora resolution in Spanish texts - Palomar Sanz, Manuel; Ferrández Rodríguez, Antonio; Moreno Boronat, Lidia; Martínez Barco, Patricio Manuel; Peral Cortés, Jesús; Saiz Noeda, Maximiliano; Muñoz Guillena, Rafael
This paper presents an algorithm for identifying noun phrase antecedents of third person personal
pronouns, demonstrative pronouns, reflexive pronouns, and omitted pronouns (zero pronouns)
in unrestricted Spanish texts. We define a list of constraints and preferences for different types
of pronominal expressions, and we document in detail the importance of each kind of knowledge
(lexical, morphological, syntactic, and statistical) in anaphora resolution for Spanish. The paper
also provides a definition for syntactic conditions on Spanish NP-pronoun noncoreference using
partial parsing. The algorithm has been evaluated on a corpus of 1,677 pronouns and achieved
a success rate of 76.8%. We have also implemented four competitive algorithms and tested...
2.
The successful application of natural language processing for information retrieval - Ferrández Rodríguez, Antonio; Rojas Hernández, Yenory; Peral Cortés, Jesús
In this paper, a novel model for monolingual Information Retrieval in English and Spanish is proposed. This model uses Natural Language Processing techniques (a POS tagger, a partial parser, and an anaphora resolver) to improve the precision of traditional IR systems by indexing the entities in the documents and the relations between them. The model is evaluated on both the Spanish and English CLEF corpora. For the English queries, the maximum increase in average precision is 35.11%. For the Spanish queries, the maximum increase is 37.18%.
3.
Multilingual extension of temporal expression recognition using parallel corpora - Puchol Blasco, Marcel; Saquete Boró, Estela; Martínez Barco, Patricio Manuel
This paper presents the automatic extension to other languages of TERSEO, a knowledge-based system for the recognition and normalization of temporal expressions, originally developed for Spanish. TERSEO was previously extended to English and Italian through the automatic translation of temporal expressions (see Saquete et al. (2004a)), but a new methodology has been designed to obtain better results. This new methodology is based on the use of parallel corpora for extending the TERSEO temporal model to other languages. In this case, two different methods have been tested: (1) automatic translation of TERSEO patterns to other languages and (2) automatic corpora...
4.
Event ordering using TERSEO system - Saquete Boró, Estela; Muñoz Guillena, Rafael; Martínez Barco, Patricio Manuel
Preprint submitted to Elsevier Science, 14th February 2005
5.
Multiple-taxonomy question classification for category search on faceted information - Tomás Díaz, David; Vicedo González, José Luis
In this paper we present a novel multiple-taxonomy question classification system, facing the challenge of assigning categories in multiple taxonomies to natural language questions. We applied our system to category search on faceted information. The system provides a natural language interface to faceted information, detecting the categories requested by the user and narrowing down the document search space to those documents pertaining to the facet values identified. The system was developed in the framework of language modeling, and the models to detect categories are inferred directly from the corpus of documents.
6.
Translation of pronominal anaphora between English and Spanish: discrepancies and evaluation - Peral Cortés, Jesús; Ferrández Rodríguez, Antonio
This paper evaluates the different tasks carried out in the translation of pronominal anaphora in a machine translation (MT) system. The MT interlingua approach named AGIR (Anaphora Generation with an Interlingua Representation) improves upon other proposals presented to date because it is able to translate intersentential anaphors, detect co-reference chains, and translate Spanish zero pronouns into English, issues hardly considered by other systems. The paper presents the resolution and evaluation of these anaphora problems in AGIR with the use of different kinds of knowledge (lexical, morphological, syntactic, and semantic). The translation of English and Spanish anaphoric third-person personal pronouns (including Spanish...
7.
New measures for open-domain question answering evaluation within a time constraint - Noguera Robles, Elisa; Llopis Pascual, Fernando; Ferrández Rodríguez, Antonio; Escapa García, Luis Alberto
Previous work on evaluating the performance of Question Answering (QA) systems has focused on precision. In this paper, we develop a mathematical procedure to explore new evaluation measures for QA systems that take answer time into account. We also carried out an exercise for the evaluation of QA systems within a time constraint in the CLEF-2006 campaign, using the proposed measures. The main conclusion is that evaluating QA systems in real time can be a new scenario for QA evaluation.
8.
A knowledge based method for the medical question answering problem - Muñoz Terol, Rafael; Martínez Barco, Patricio Manuel; Palomar Sanz, Manuel
In this paper, a restricted domain question answering (QA) system is described. The design architecture of this QA system and the features that
allow the adaptation of the QA system to the medical domain are also presented. The advantages of this QA system include the simple process
of defining the question taxonomy answered by the system as well as the possibility of locally or remotely managed document collections. The
main computing methods of the QA system are based on the application of natural language processing (NLP) techniques to infer the logic
forms and on the treatment of the logic forms. The knowledge of the...
9.
Il sistema TERSEO per l'italiano - Saquete Boró, Estela; Martínez Barco, Patricio Manuel; Muñoz Guillena, Rafael
In this paper, we describe the process of extending the TERSEO system to Italian by automatically porting it to this new language, as was done for English.
19.
Análisis de terminologías de salud para su utilización como ontologías computacionales en los sistemas de información clínicos - Romá Ferri, María Teresa; Palomar Sanz, Manuel
Objectives: Ontologies are a resource that makes it possible to work computationally with the conceptualization of meaning and to avoid the limitation imposed by standardized terms. The aim of this study is to establish the degree of usability of terminologies for the design of ontologies that help solve the problems of semantic interoperability and knowledge reuse in clinical information systems. Methods: Six of the most relevant terminologies for the clinical, epidemiological, documentary, and administrative-economic fields were analyzed. The following qualities were assessed: conceptual coverage, hierarchical structure, conceptual granularity, conceptual relations, and the degree of formalism used in...
20.
QALL-ME : Question Answering Learning technologies in a multiLingual and multiModal Environment - Izquierdo Beviá, Rubén; Ferrández Escámez, Óscar; Ferrández Escámez, Sergio; Tomás Díaz, David; Vicedo González, José Luis; Martínez Barco, Patricio Manuel; Suárez Cueto, Armando
In this document we present the QALL-ME project, which relates to information systems technologies. The project has a duration of 36 months, is funded by the European Union, and will be carried out by 7 institutions. The overall objective is to establish a shared infrastructure for multilingual and multimodal open-domain Question Answering for mobile devices. Given society's current information needs, there appears to be an enormous potential market for the various objectives pursued in QALL-ME.