UNESCO Nomenclature > (57) Linguistics

Showing resources 38,041 - 38,060 of 48,603

38041. Programming with Structures, Functions, and Objects - Fritz Henglein, Konstantin Laufer
We describe program structuring mechanisms for integrating algebraic, functional and object-oriented programming in a single framework. Our language is a statically typed higher-order language with specifications, structures, types, and values, and with universal and existential abstraction over structures, types, and values. We show that existential types over structures generalize both the necessarily homogeneous type classes of Haskell and the necessarily heterogeneous object classes of object-oriented programming languages such as C++ or Eiffel. Following recent work on ML, we provide separate linguistic mechanisms for reusing specifications and structures. Subtyping is provided in the form of explicit type conversions. The language mechanisms are introduced by examples to emphasize their pragmatic aspects. We...

38042. Part of Speech Tagging and Lemmatisation for the Spoken Dutch Corpus - Frank Van Eynde, Jakub Zavrel, Walter Daelemans
This paper describes the lemmatisation and tagging guidelines developed for the "Spoken Dutch Corpus", and lays out the philosophy behind the high granularity tagset that was designed for the project. To bootstrap the annotation of large quantities of material (10 million words) with this new tagset we tested several existing taggers and tagger generators on initial samples of the corpus. The results show that the most effective method, when trained on the small samples, is a high quality implementation of a Hidden Markov Model tagger generator. 1. Introduction The Dutch-Flemish project "Corpus Gesproken Nederlands" (1998-2003) aims at the collection, transcription and annotation of ten million words of...
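As a rough illustration of the kind of tagger the abstract refers to, the sketch below trains a tiny bigram HMM and decodes with Viterbi. The toy corpus, tagset and add-k smoothing are invented for this example and are unrelated to the CGN tagset or training data.

```python
# Minimal bigram-HMM part-of-speech tagger (illustrative sketch only).
from collections import defaultdict

def train(tagged_sents):
    trans = defaultdict(lambda: defaultdict(int))   # prev tag -> tag -> count
    emit = defaultdict(lambda: defaultdict(int))    # tag -> word -> count
    for sent in tagged_sents:
        prev = "<s>"
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word.lower()] += 1
            prev = tag
        trans[prev]["</s>"] += 1
    return trans, emit

def prob(table, given, outcome, smooth=1e-6):
    total = sum(table[given].values())
    return (table[given][outcome] + smooth) / (total + smooth * 1000)

def viterbi(words, trans, emit):
    tags = list(emit)
    best = [{} for _ in words]                      # best[i][t] = (score, previous tag)
    for t in tags:
        best[0][t] = (prob(trans, "<s>", t) * prob(emit, t, words[0].lower()), None)
    for i in range(1, len(words)):
        for t in tags:
            best[i][t] = max(
                (best[i - 1][p][0] * prob(trans, p, t) * prob(emit, t, words[i].lower()), p)
                for p in tags)
    last = max(tags, key=lambda t: best[-1][t][0] * prob(trans, t, "</s>"))
    path = [last]
    for i in range(len(words) - 1, 0, -1):          # follow back-pointers
        path.append(best[i][path[-1]][1])
    return list(reversed(path))

corpus = [[("de", "DET"), ("hond", "N"), ("blaft", "V")],
          [("de", "DET"), ("kat", "N"), ("slaapt", "V")]]
trans, emit = train(corpus)
print(viterbi(["de", "kat", "blaft"], trans, emit))  # expected: ['DET', 'N', 'V']
```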

38043. Typed Feature Formalisms as a Common Basis for Linguistic Specification - Hans-Ulrich Krieger (Deutsches Forschungszentrum für Künstliche Intelligenz GmbH)
Typed feature formalisms (TFF) play an increasingly important role in CL and NLP. Many of these systems are inspired by Pollard and Sag's work on Head-Driven Phrase Structure Grammar (HPSG), which has shown that a great deal of syntax and semantics can be neatly encoded within TFF. However, syntax and semantics are not the only areas in which TFF can be beneficially employed. In this paper, I will show that TFF can also be used as a means to model finite automata (FA) and to perform certain types of logical inferencing. In particular, I will (i) describe how FA can be defined and processed within TFF and...

38044. Combining Neural Networks and Fuzzy Controllers - Detlef Nauck, Frank Klawonn, Rudolf Kruse
Fuzzy controllers are designed to work with knowledge in the form of linguistic control rules. But the translation of these linguistic rules into the framework of fuzzy set theory depends on the choice of certain parameters, for which no formal method is known. The optimization of these parameters can be carried out by neural networks, which are designed to learn from training data, but which are in general not able to profit from structural knowledge. In this paper we discuss approaches which combine fuzzy controllers and neural networks, and present our own hybrid architecture where principles from fuzzy control theory and from neural networks are integrated into one...
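A minimal sketch of the general idea (letting a gradient-based learner tune the parameters of fuzzy membership functions from data) is shown below; the rules, data and learning rate are invented, and this is not the authors' hybrid architecture.

```python
# Tune the centres of two Gaussian membership functions from data (toy example).
import math

def gauss(x, c, s=1.5):
    return math.exp(-((x - c) ** 2) / (2 * s ** 2))

centers = {"low": 1.0, "high": 9.0}                 # deliberately poor starting values

def predict(x):
    w_lo, w_hi = gauss(x, centers["low"]), gauss(x, centers["high"])
    return w_hi / (w_lo + w_hi + 1e-12)             # rule outputs: LOW -> 0, HIGH -> 1

data = [(x / 10.0, 0.0 if x < 45 else 1.0) for x in range(100)]   # target switches at 4.5

lr, eps = 0.02, 1e-4
for _ in range(200):
    for x, y in data:
        for name in centers:                        # numerical gradient of squared error
            centers[name] += eps
            up = (predict(x) - y) ** 2
            centers[name] -= 2 * eps
            down = (predict(x) - y) ** 2
            centers[name] += eps
            centers[name] -= lr * (up - down) / (2 * eps)

print({k: round(v, 2) for k, v in centers.items()})
```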

38045. SURGE: a Comprehensive Plug-in Syntactic Realization Component for Text Generation - Michael Elhadad, Jacques Robin
This paper reports on the development and evolution of such a component, the SURGE system. SURGE has been widely distributed in the generation community and has been embedded into several complete systems. The goals of reusability and wide coverage have led to a large system, and many of the issues faced during the development of the system are issues common to the development of large software systems. From the linguistic side, they have also led to the definition of an input specification language of a relatively low level of abstraction and to a pragmatic approach to grammar...

38046. An Annotation Scheme for Concept-to-Speech Synthesis - Janet Hitzeman, Alan W. Black, Chris Mellish, Jon Oberlander, Massimo Poesio
The SOLE concept-to-speech system uses linguistic information provided by an NLG component to improve the intonation of synthetic speech. As the text is generated, the system automatically annotates the text with linguistic information using a set of XML tags which we have developed for this purpose. The annotation is then used by the synthesis component in producing the intonation. We describe the annotation system and discuss our choice of linguistic constructs to annotate. 1 Introduction The goal of the SOLE project is to make use of high-level linguistic information to improve the quality of the intonation of synthetic speech. SOLE's natural language generation...
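Purely for flavour, here is a hypothetical sketch of what annotating generated text with intonation-relevant XML might look like; the tag and attribute names are invented and are not the actual SOLE scheme.

```python
# Invented example of XML markup carrying NLG information for a speech synthesizer.
import xml.etree.ElementTree as ET

sent = ET.Element("sentence")
np = ET.SubElement(sent, "phrase", {"type": "NP", "information": "new", "accent": "H*"})
np.text = "The red button"
vp = ET.SubElement(sent, "phrase", {"type": "VP", "information": "given"})
vp.text = "starts the pump"

print(ET.tostring(sent, encoding="unicode"))
```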

38047. Fuzzy Diagnosis - Ludmila I. Kuncheva
Introduction Starting from the pioneering publication of Lotfi Zadeh in 1965 [55], fuzzy sets have been applied to many fields in which uncertainty plays a key role. Medicine, often on the borderline between science and art, is an excellent exponent: vagueness, linguistic uncertainty, hesitation, measurement imprecision, natural diversity, subjectivity; all these are prominently present in medical diagnosis. While statistical uncertainty can be handled in a rigorous way, the treatment of nonstatistical uncertainty is still a challenge [22]. For example, the nonstatistical uncertainty in "high" attributed to the blood pressure of a patient is at least threefold: The patient. For a normally hypotonic patient "high" blood pressure is...
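To make the vagueness of "high" concrete, one common way to encode such a linguistic value is a fuzzy membership function over the measured quantity; the breakpoints below are invented for illustration and are not clinical thresholds.

```python
# Illustrative membership function for the linguistic value "high" (systolic blood pressure).
def mu_high(systolic_mmhg):
    """Degree (0..1) to which a systolic reading counts as 'high'."""
    if systolic_mmhg <= 120:
        return 0.0
    if systolic_mmhg >= 160:
        return 1.0
    return (systolic_mmhg - 120) / 40.0   # linear ramp between 120 and 160 mmHg

for reading in (110, 130, 145, 170):
    print(reading, round(mu_high(reading), 2))
```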

38048. Meta-Learning for Phonemic Annotation of Corpora - Walter Daelemans, Erik Tjong Kim Sang, Steven Gillis
We apply rule induction, classifier combination and meta-learning (stacked classifiers) to the problem of bootstrapping high accuracy automatic annotation of corpora with pronunciation information. The task we address in this paper consists of generating phonemic representations reflecting the Flemish and Dutch pronunciations of a word on the basis of its orthographic representation (which in turn is based on the actual speech recordings). We compare several possible approaches to achieve the text-to-pronunciation mapping task: memory-based learning, transformation-based learning, rule induction, maximum entropy modeling, combination of classifiers in stacked learning, and stacking of meta-learners. We are interested both in optimal accuracy and in obtaining insight into the linguistic regularities involved. As far as accuracy is concerned, an already high accuracy level (93%...
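A generic stacking sketch is shown below using scikit-learn (assumed to be installed); the dataset is synthetic and the base learners are arbitrary stand-ins, not the memory-based and rule-induction systems used for the pronunciation task.

```python
# Stacked classifiers: level-0 learners feed their cross-validated predictions
# to a level-1 (meta) learner. Synthetic data for illustration only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("knn", KNeighborsClassifier(n_neighbors=5)),
                ("rf", RandomForestClassifier(n_estimators=100, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5)
print("stacked accuracy:", round(stack.fit(X_tr, y_tr).score(X_te, y_te), 3))
```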

38049. Towards Automating the Evolution of Linguistic Competence in Artificial Agents - Piotr J. Gmytrasiewicz, Dhruva Gopal
The goal of this research is to understand and automate the mechanisms by which language can emerge among artificial, knowledge-based and rational agents. We use the paradigm of rationality defined by decision theory, and employ the formal model of negotiation studied in game theory to allow the emergence and enrichment of an agent communication language. 1 Introduction The aim of our research is to understand and automate the mechanisms by which language can emerge among artificial, knowledge-based and rational agents. Our ultimate goal is to be able to design and implement agents that, upon encountering other agent(s) with which they...

38050. An Analysis of Statistical and Syntactic Phrases - Mandar Mitra, Chris Buckley, Amit Singhal, Claire Cardie
As the amount of textual information available through the World Wide Web grows, there is a growing need for high-precision IR systems that enable a user to find useful information from the masses of available textual data. Phrases have traditionally been regarded as precision-enhancing devices and have proved useful as content-identifiers in representing documents. In this study, we compare the usefulness of phrases recognized using linguistic methods and those recognized by statistical techniques. We focus in particular on high-precision retrieval. We discover that once a good basic ranking scheme is in use, the use of phrases does not have a major effect on precision at high...
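As a rough sketch of what a "statistical phrase" can mean in this setting, the snippet below treats adjacent word pairs that occur at least a few times across documents as candidate indexing phrases; the corpus and threshold are invented.

```python
# Statistical phrases as frequent adjacent word pairs (toy corpus, arbitrary threshold).
from collections import Counter

docs = ["information retrieval systems rank documents",
        "high precision information retrieval needs good phrases",
        "phrases help information retrieval precision"]

pairs = Counter()
for doc in docs:
    words = doc.split()
    pairs.update(zip(words, words[1:]))

min_freq = 2
phrases = [" ".join(p) for p, n in pairs.items() if n >= min_freq]
print(phrases)   # e.g. ['information retrieval']
```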

38051. Building A Thai Part-Of-Speech Tagged Corpus (ORCHID) - Naoto Takahashi, Hitoshi Isahara, Virach Sornlertlamvanich
ORCHID (Open linguistic Resources CHanelled toward InterDisciplinary research) is an initiative project aimed at building linguistic resources to support research in, but not limited to, natural language processing. Based on the concept of an open architecture design, the resources must be fully compatible with similar resources, and software tools must also be made available. This paper presents one result of the project, the construction of a Thai part-of-speech (POS) tagged corpus, which is a preliminary stage in the construction of a Thai speech corpus. The POS-tagged corpus is the result of collaborative research between the Communications Research Laboratory (CRL) in...

38052. Neural Optimization of Linguistic Variables and Membership Functions - Rafal Adamczak, Krzysztof Grabczewski
Algorithms for the extraction of logical rules from data that contain real-valued components require the determination of linguistic variables or membership functions. Context-dependent membership functions for crisp and fuzzy linguistic variables are introduced and methods for their determination described. A methodology for the extraction, optimization and application of sets of logical rules is described. Neural networks are used for the initial determination of linguistic variables and rule extraction, followed by minimization procedures for optimization of the sets of rules. Gaussian uncertainties of measurements are assumed during application of crisp logical rules, leading to "soft trapezoidal" membership functions and allowing the linguistic variables to be optimized using gradient procedures. Applications to a number of benchmark and real life...
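The "soft trapezoid" effect can be made concrete with a small worked example: applying a crisp rule "x in [a, b]" to a measurement with Gaussian uncertainty sigma yields the membership mu(x) = Phi((b - x)/sigma) - Phi((a - x)/sigma), whose edges are smoothed versions of the crisp interval. The interval and sigma below are arbitrary, chosen only to show the shape.

```python
# Crisp interval rule under Gaussian measurement uncertainty -> soft trapezoidal membership.
import math

def soft_interval(x, a, b, sigma):
    phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2)))   # standard normal CDF
    return phi((b - x) / sigma) - phi((a - x) / sigma)

for x in (0.5, 1.0, 1.5, 2.0, 2.5):
    print(x, round(soft_interval(x, a=1.0, b=2.0, sigma=0.2), 3))
```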

38053. Homme-Machine - Paul Boucher (Projet Repco)
In this paper, we present Compounds, an Intelligent Tutoring System which can teach French students of English how to produce and understand lexicalised and newly formed compounds. This ITS must be able to detect the difficulties of each student and to provide him or her with an appropriate set of exercises to improve his or her weak points. Here, we focus on the creation of the expert model. We study different linguistic theories of the English compounding process and we build a representation of the expert model together with a generic model of the errors that French students make. We describe the computer implementation of these two...

38054. Östen Dahl
Grammaticalization is commonly seen as "a process which turns lexemes into grammatical formatives and makes grammatical formatives still more grammatical". In this paper, it is argued that grammaticalization has to be treated in the wider perspective of the life cycles of grammatical constructions. The notion of an "inflationary process" is invoked in order to explain what goes on in grammaticalization. It is argued that the degree of independence of an element of a linguistic expression reflects its informational value and that one of the main components of grammaticalization is rhetorical devaluation, by which a construction comes to be used with a lower informational impact. 1. Grammaticalization as an integrated...

38055. Strategic Priorities for the Development of Language Technology in Minority Languages - K. Sarasola
Language technology development for minority languages differs in several respects from its development for widely used languages. The high capacity and computational power of present computers, combined with the scarcity of human and linguistic resources, call for the design of new and different strategies. This proposal presents the conclusions drawn after twelve years of experience with the automatic processing of Basque. 1. Introduction Language Engineering is recognized as one of the fundamental enabling technologies for the future. Language Engineering will make an indispensable contribution to the success of the information society. The availability and usability of new telematic services will depend on developments in Language Engineering. In the future natural language will become...

38056. Soft Constraint Logic Programming and Generalized Shortest Path Problems - Stefano Bistarelli, Ugo Montanari, Francesca Rossi
In this paper we study the relationship between Constraint Programming (CP) and Shortest Path (SP) problems. In particular, we show that classical, multicriteria, partially ordered, and modality-based SP problems can be naturally modeled and solved within the Soft Constraint Logic Programming (SCLP) framework, where logic programming is coupled with soft constraints. In this way we provide this large class of SP problems with a high-level and declarative linguistic support whose semantics takes care both of finding the cost of the shortest path(s) and of actually finding the path(s). On the other hand, some efficient algorithms for certain classes of...
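One way to see the generalization concretely: the same relaxation-style path computation can be parameterized by a semiring, so that choosing (min, +) gives classical shortest paths while other choices give, for example, bottleneck ("maximize the weakest edge") optimization. The sketch below is a generic illustration, not the SCLP semantics itself; the graph and semirings are invented.

```python
# Generalized Bellman-Ford over a semiring: 'times' combines costs along a path,
# 'plus' chooses among alternative paths. Toy graph for illustration.
graph = {"a": {"b": 3, "c": 1}, "b": {"d": 1}, "c": {"b": 1, "d": 5}, "d": {}}

def best_values(graph, start, plus, times, one, zero):
    val = {n: zero for n in graph}
    val[start] = one                              # the start node gets the semiring's unit
    for _ in range(len(graph) - 1):               # |V| - 1 relaxation rounds suffice
        for u, edges in graph.items():
            for v, w in edges.items():
                val[v] = plus(val[v], times(val[u], w))
    return val

# Classical shortest path: tropical semiring (min, +)
print(best_values(graph, "a", min, lambda a, b: a + b, one=0, zero=float("inf")))
# Bottleneck / max-min semiring: maximize the weakest edge along a path
print(best_values(graph, "a", max, min, one=float("inf"), zero=0))
```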

38057. Dialogue Acts, Synchronising Units and Anaphora Resolution - Miriam Eckert, Michael Strube
In this paper, we present the results of a corpus analysis, and a model of anaphora resolution in spontaneous spoken dialogues in the form of an algorithm. The main finding of our corpus analysis is that less than half the pronouns and demonstratives have NP antecedents in the preceding text. 22% have sentential antecedents and the remainder have no identifiable linguistic antecedents. As part of the corpus analysis we present the results of interannotator agreement tests. These were carried out for marking anaphor types and their antecedents, and for segmenting the dialogues into dialogue acts. The results of the inter-annotator agreement tests indicate that our classification method is reliable...

38058. Memory-Based Word Sense Disambiguation - Jorn Veenstra, Antal Van Den Bosch, Sabine Buchholz, Walter Daelemans, Jakub Zavrel
We describe a memory-based classification architecture for word sense disambiguation and its application to the senseval evaluation task. For each ambiguous word, a semantic word expert is automatically trained using a memory-based approach. In each expert, selecting the correct sense of a word in a new context is achieved by finding the closest match to stored examples of this task. Advantages of the approach include (i) fast development time for word experts, (ii) easy and elegant automatic integration of information sources, (iii) use of all available data for training the experts, and (iv) relatively high accuracy with minimal linguistic...
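A bare-bones "word expert" in the memory-based spirit might look like the sketch below: store training contexts for one ambiguous word and assign the sense voted for by the most similar stored contexts, using feature overlap as the similarity. The word, senses and features are invented and far simpler than the senseval setup.

```python
# k-nearest-neighbour sense disambiguation for one ambiguous word (toy memory).
from collections import Counter

memory = [({"river", "water", "fish"}, "bank=shore"),
          ({"money", "loan", "account"}, "bank=institution"),
          ({"deposit", "cash", "teller"}, "bank=institution"),
          ({"boat", "mud", "reeds"}, "bank=shore")]

def disambiguate(context_words, k=3):
    ranked = sorted(memory, key=lambda ex: len(ex[0] & context_words), reverse=True)
    votes = Counter(sense for _, sense in ranked[:k])
    return votes.most_common(1)[0][0]

print(disambiguate({"she", "opened", "an", "account", "at", "the", "bank", "for", "cash"}))
```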

38059. New Lexical Entries for Unknown Words - James Kilbury, Petra Naerger, Ingrid Renz
The following paper presents an approach for simulating the acquisition of new lexical entries for unknown words, an issue that is central to natural language processing since no lexicon can ever be complete. Acquisition involves two main tasks. First, the appropriate information about an unknown word in a given linguistic context (i.e. sentence) is identified. It is shown that this task requires new general considerations about shared information in unification-based representations. Second, the collected information is formulated in a new lexical entry according to a comprehensive theory of the lexicon which defines the form of lexical entries and the relations between them. This task is solved...

38060. Learning Syntactic Rules and Tags with Genetic Algorithms for Information Retrieval and Filtering: An Empirical Basis for Grammatical Rules - Robert M. Losee
The grammars of natural languages may be learned by using genetic algorithms that reproduce and mutate grammatical rules and part-of-speech tags, improving the quality of later generations of grammatical components. Syntactic rules are randomly generated and then evolve; those rules resulting in improved parsing and occasionally improved retrieval and filtering performance are allowed to further propagate. The LUST system learns the characteristics of the language or sublanguage used in document abstracts by learning from the document rankings obtained from the parsed abstracts. Unlike the application of traditional linguistic rules to retrieval and filtering applications, LUST develops grammatical structures and tags without the prior imposition of some common grammatical...
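A toy genetic-algorithm loop in the same spirit is sketched below: individuals are sets of part-of-speech bigram "rules", and fitness rewards rules that match bigrams seen in a small tagged corpus while penalizing rules the corpus never uses. The representation, fitness function and operators are invented for illustration and are not the LUST system.

```python
# Evolve sets of tag-bigram rules with selection, crossover and mutation (toy GA).
import random

random.seed(1)
TAGS = ["DET", "ADJ", "N", "V", "P"]
corpus_bigrams = {("DET", "N"), ("DET", "ADJ"), ("ADJ", "N"),
                  ("N", "V"), ("V", "P"), ("P", "DET")}

def random_rule():
    return (random.choice(TAGS), random.choice(TAGS))

def fitness(rules):
    return len(rules & corpus_bigrams) - 0.5 * len(rules - corpus_bigrams)

def mutate(rules):
    rules = set(rules)
    if random.random() < 0.5 and rules:
        rules.discard(random.choice(sorted(rules)))   # drop a rule
    else:
        rules.add(random_rule())                      # or invent a new one
    return rules

def crossover(a, b):
    child = {r for r in a | b if random.random() < 0.5}
    return child or {random_rule()}

population = [{random_rule() for _ in range(3)} for _ in range(30)]
for generation in range(60):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]                         # truncation selection
    population = parents + [mutate(crossover(random.choice(parents), random.choice(parents)))
                            for _ in range(20)]

best = max(population, key=fitness)
print(sorted(best), round(fitness(best), 1))
```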

 
