Resource data
A Formal Framework for Linguistic Annotation (revised version)
Bird, Steven; Liberman, Mark
Location:
http://arxiv.org/abs/cs/0010033
'Linguistic annotation' covers any descriptive or analytic notations applied
to raw language data. The basic data may be in the form of time functions -
audio, video and/or physiological recordings - or it may be textual. The added
notations may include transcriptions of all sorts (from phonetic features to
discourse structures), part-of-speech and sense tagging, syntactic analysis,
'named entity' identification, co-reference annotation, and so on. While there
are several ongoing efforts to provide formats and tools for such annotations
and to publish annotated linguistic databases, the lack of widely accepted
standards is becoming a critical problem. Proposed standards, to the extent
they exist, have focused on file formats. This paper focuses instead on the
logical structure of linguistic annotations. We survey a wide variety of
existing annotation formats and demonstrate a common conceptual core, the
annotation graph. This provides a formal framework for constructing,
maintaining and searching linguistic annotations, while remaining consistent
with many alternative data structures and file formats.
Belongs to: arXiv
Resource details
| Field | Value |
| --- | --- |
| Id. | 241895 |
| Title | A Formal Framework for Linguistic Annotation (revised version) |
| Author(s) | Bird, Steven; Liberman, Mark |
| Location | http://arxiv.org/abs/cs/0010033 |
| Version | 1.0 |
| Status | Final |
| Keywords | Computer Science - Computation and Language |
| Resource type | Narrative text |
| Interactivity type | Expositive |
| Interactivity level | Very low |
| Audience | Student, Teacher, Author |
| Structure | Atomic |
| Cost | No |
| Copyright | Yes |
| Technical requirements | Browser: Any |
| Contribution date | 24-Feb-2007 |
| Contact | |