An Annotation Scheme for Concept-to-Speech Synthesis
- Janet Hitzeman; Alan W. Black; Chris Mellish; Jon Oberlander; Massimo Poesio; Paul Taylor
The SOLE concept-to-speech system uses linguistic information provided by an NLG component to improve the intonation of synthetic speech. As the text is generated, the system automatically annotates it with linguistic information using a set of XML tags which we have developed for this purpose. The annotation is then used by the synthesis component in producing the intonation. We describe the annotation system and discuss our choice of linguistic constructs to annotate. 1 Introduction The goal of the SOLE project is to make use of high-level linguistic information to improve the quality of the intonation of synthetic speech....
Harmonizing the Approaches
- The Fracas Consortium; Robin Cooper; Dick Crouch; Jan Van Eijck; Chris Fox; Josef Van Genabith; Jan Jaspars; Hans Kamp; Manfred Pinkal; Massimo Poesio; Steve Pulman; Espen Vestre
(Table-of-contents excerpt) ...ion; 2.3 Quantification; 2.4 Propositions; 2.5 Predication...
Automatic Generation of a Fuzzy Rule Base for Online Handwriting Recognition
- Ashutosh Malaviya; Hartmut Surmann; Liliane Peters
An automatic method to generate fuzzy rules and their membership functions to recognize handwritten characters is described. First, an initial rule base is created on the basis of a referential data set containing handwriting prototypes. Subsequently, the classification behavior of the fuzzy rules is optimized with a genetic algorithm, which is regarded as a typical solution to NP-complete problems. A suitable fitness function which corresponds to the human perception of the linguistic variables is obtained. The proposed rule generation process extends the learning and adaptive capabilities of existing fuzzy rule-based recognition systems. Keywords: fuzzy features, fuzzy rule generation, genetic...
Distributional Clustering Of English Words
- Fernando Pereira; Naftali Tishby; Lillian Lee
We describe and experimentally evaluate a method for automatically clustering words according to their distribution in particular syntactic contexts. Deterministic annealing is used to find lowest distortion sets of clusters. As the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data. Clusters are used as the basis for class models of word co-occurrence, and the models are evaluated with respect to held-out test data. INTRODUCTION Methods for automatically classifying words according to their contexts of use have both scientific and practical interest. The scientific questions arise in connection to distributional views of linguistic...
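As a rough illustration of what "soft" clustering under an annealing parameter can look like, the sketch below assigns a word to clusters with probability proportional to exp(-beta * distortion). It is not taken from the paper: the use of Kullback-Leibler divergence as the distortion, the function names, and the toy distributions are all assumptions made for the example, and the sketch shows only how memberships sharpen as beta grows, not how clusters split.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def soft_memberships(word_dist, centroids, beta):
    """Return P(cluster | word), proportional to exp(-beta * distortion)."""
    weights = [math.exp(-beta * kl(word_dist, c)) for c in centroids]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical toy data: one word's distribution over three contexts,
# and two cluster centroids defined over the same contexts.
word = [0.7, 0.2, 0.1]
centroids = [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]]

# Small beta: memberships are nearly uniform (very soft assignment).
low = soft_memberships(word, centroids, beta=0.1)
# Large beta: almost all mass goes to the closest centroid.
high = soft_memberships(word, centroids, beta=20.0)
```

With a small beta the word belongs to both clusters to a similar degree; with a large beta the assignment approaches a hard one in favour of the nearer centroid.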
Learning the Rule Base of a Fuzzy Controller by a Genetic Algorithm
- Jörn Hopf; Frank Klawonn
For the design of a fuzzy controller it is necessary to choose, besides other parameters, suitable membership functions for the linguistic terms and to determine a rule base. This paper deals with the problem of finding a good rule base, the basis of a fuzzy controller. Consulting experts is still the usual but time-consuming and therefore rather expensive method. Besides, after having designed the controller, one cannot be sure that the rule base will lead to near optimal control. This paper shows how to reduce significantly the period of development (and the costs) of fuzzy controllers with the help...
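For readers unfamiliar with the two ingredients named in this abstract, the following minimal sketch shows membership functions for linguistic terms and a rule base that maps each term to an output. Everything in it is hypothetical: the triangular shape, the term names, the rule outputs, and the weighted-average combination are assumptions chosen for illustration, not the authors' controller. A genetic algorithm of the kind the abstract mentions would search over the contents of a table like `RULES`.

```python
def triangle(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical linguistic terms for an error signal in [-1, 1],
# each given as (left, peak, right) of a triangular membership function.
TERMS = {
    "negative": (-2.0, -1.0, 0.0),
    "zero": (-1.0, 0.0, 1.0),
    "positive": (0.0, 1.0, 2.0),
}

# Hypothetical rule base: IF error is <term> THEN output is <value>.
RULES = {"negative": -1.0, "zero": 0.0, "positive": 1.0}

def control(error):
    """Fire every rule and combine the outputs by weighted average."""
    degrees = {term: triangle(error, *params) for term, params in TERMS.items()}
    total = sum(degrees.values())
    if total == 0.0:
        return 0.0  # no rule fires; fall back to a neutral output
    return sum(degrees[t] * RULES[t] for t in RULES) / total
```

For example, an error of 0.5 belongs equally to "zero" and "positive", so the two rules contribute equally to the output.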
Acquiring Rules for Reducing Morphological Ambiguity from POS Tagged Corpus in Korean
- Jae-Hoon Kim; Byung-gyu Jang
In Korean, most morphological analyzers have suffered in many cases from a lack of ordering restrictions (morphotactics). To alleviate this problem, we use linguistic knowledge called the subsumption relation. In this paper, we propose a method for reducing morphological ambiguities using the subsumption relation and for automatically inferring some rules, called subsumption conditions, from a part-of-speech tagged corpus. The conditions are represented by finite-state automata. Using the conditions, we efficiently examine whether two morphological structures are in the subsumption relation. Our experiment shows very promising results. We expect that the results might be positively reflected in probabilistic part-of-speech tagging systems, which...
Aspects Of Salience In Natural Language Generation
- T. Pattabhiraman (Simon Fraser University)
This dissertation examines the role of salience in natural language generation (NLG). The salience of an entity, in intuitive terms, refers to its prominence, and is interpreted as a measure of how well an entity stands out from other entities and biases the preference of the generator in selecting words and complex constructs. Through an analysis of previous work in diverse disciplines, we show the variety of salience effects in NLG. Next, we classify several important determinants of salience, corresponding to different factors contributing to salience. We then delineate two theoretically-significant categories: canonical salience and instantial salience. The former is...
A Semantics of Contrast and Information Structure for Specifying Intonation in Spoken Language Generation
- Scott Allan Prevost
In this dissertation I present a model for the determination of intonation contours from context and provide two implemented systems which apply this theory to the problem of generating spoken language with appropriate intonation from high-level semantic representations. The theory and implementations presented here are based on an information structure framework that mediates between intonation and discourse, and encodes the proper level of semantic information to account for both contextually-bound accentuation patterns and intonational phrasing. The structural similarities among these linguistic levels of representation are the basis for selecting Combinatory Categorial Grammar (CCG, Steedman 1985,1990a) as the model for spoken...
Higher-Order Coloured Unification and Natural Language Semantics
- Claire Gardent; Michael Kohlhase
In this paper, we show that Higher-Order Coloured Unification, a form of unification developed for automated theorem proving, provides a general theory for modeling the interface between the interpretation process and other sources of linguistic, non-semantic information. In particular, it provides the general theory for the Primary Occurrence Restriction which the analysis of Dalrymple et al. (1991) called for. 1 Introduction It is well known that Higher-Order Unification (HOU) can be used to construct the semantics of Natural Language: Dalrymple et al. (1991), henceforth DSP, show that it allows a treatment of VP-Ellipsis which successfully captures...
Interpreting Changes In The Fuzzy Sets Of A Self-Adaptive Neural Fuzzy Controller
- Detlef Nauck; Rudolf Kruse
We describe a procedure for the adaptation of membership functions in a fuzzy control environment by using neural network learning principles. The changes in the fuzzy sets can be easily interpreted. By using a fuzzy error that is propagated back through the architecture of our fuzzy controller, we obtain an unsupervised learning technique, where each rule tunes the membership functions of its antecedent and its consequence. INTRODUCTION Classical control theory is based on mathematical models that describe the behaviour of the plant under consideration. The main idea of fuzzy control [9, 10], which has proved to be a very successful...
- Harald Trost; Wolfgang Heinz; Johannes Matiasek; Ernst Buchberger
This paper describes certain aspects of Datenbank-DIALOG, a German language interface to relational databases developed at the Austrian Research Institute for Artificial Intelligence. Besides giving a short overview of the system architecture, it emphasizes the issues of portability and habitability and how they are being tackled in the design of Datenbank-DIALOG. To demonstrate how design strategies support the development of a habitable system, we take examples from the area of comparisons and measures, both of which are important for many application domains and nontrivial from a linguistic point of view. Datenbank-DIALOG has been fully implemented and is accessible...
A Geometric Approach to Mapping Bitext Correspondence
- I. Dan Melamed
The first step in most corpus-based multilingual NLP work is to construct a detailed map of the correspondence between a text and its translation. Several automatic methods for this task have been proposed in recent years. Yet even the best of these methods can err by several typeset pages. The Smooth Injective Map Recognizer (SIMR) is a new bitext mapping algorithm. SIMR's errors are smaller than those of the previous front-runner by more than a factor of 4. Its robustness has enabled new commercial-quality applications. The greedy nature of the algorithm makes it independent of memory resources. Unlike other bitext...
Integrating Reflection, Strong Typing and Static Checking
- D. Stemple; R. Morrison; G. N. C. Kirby; R. C. H. Connor
We define and present the computational structure of linguistic reflection as the ability of a running program to generate new program fragments and to integrate these into its own execution. The integration of this kind of reflection with compiler based, strongly typed programming languages is described. This integration is accomplished in a manner that preserves strong typing and does not unduly limit the amount of static type checking that can be performed. The benefits that accrue to linguistic reflection in the area of database and persistent programming languages are outlined and two examples are given.
Writing and Correcting Textual Scenarios for System Design
- Camille Ben Achour
In recent years, scenarios have gained in popularity in Requirements Engineering. Textual scenarios are narrative descriptions of flows of actions between agents. They are often proposed to elicit, validate or document requirements. The CREWS experience has shown that the advantage of scenarios is their ease of use, and that their disadvantage lies in the lack of guidelines for 'quality' authoring. In this article, we propose guidance for the authoring of scenarios. The guided scenario authoring process is divided into two main stages: the writing of scenarios, and the correcting of scenarios. To guide the writing of scenarios, we...
Distributed Parsing With HPSG Grammars
- Abdel Kader Diagne; Walter Kasper; Hans-Ulrich Krieger
Unification-based theories of grammar allow for an integration of different levels of linguistic descriptions in the common framework of typed feature structures. Dependencies among the levels are expressed by coreferences. Though highly attractive theoretically, using such codescriptions for analysis creates problems of efficiency. We present an approach to a modular use of codescriptions on the syntactic and semantic level. Grammatical analysis is performed by tightly coupled parsers running in tandem, each using only designated parts of the grammatical description. In the paper we describe the partitioning of grammatical information for the parsers and present results about the performance. Acknowledgements. We...
A Fuzzy Perceptron as a Generic Model for Neuro-Fuzzy Approaches
- Detlef Nauck
This paper presents a fuzzy perceptron as a generic model of multilayer fuzzy neural networks, or neural fuzzy systems, respectively. This model is suggested to ease the comparison of different neuro-fuzzy approaches that are known from the literature. A fuzzy perceptron is not a fuzzification of a common neural network architecture, and it is not our intention to enhance neural learning algorithms by fuzzy methods. The idea of the fuzzy perceptron is to provide an architecture that can be initialized with prior knowledge, and that can be trained using neural learning methods. The training is carried out in such a...
A compositional treatment of polysemous arguments in Categorial Grammar
- Anne-Marie Mineur; Paul Buitelaar
We discuss an extension of the standard logical rules (functional application and abstraction) in Categorial Grammar (CG), in order to deal with some specific cases of polysemy. We borrow from Generative Lexicon theory which proposes the mechanism of coercion, next to a rich nominal lexical semantic structure called qualia structure. In a previous paper we introduced coercion into the framework of sign-based Categorial Grammar and investigated its impact on traditional Fregean compositionality. In this paper we will elaborate on this idea, mostly working towards the introduction of a new semantic dimension. Where in current versions of sign-based Categorial Grammar only...
Constructing Fuzzy Models with Linguistic Integrity - AFRELI Algorithm
- Jairo J. Espinosa; Joos Vandewalle
We present an algorithm to extract rules relating input-output data. The rules are created in the environment of fuzzy systems. The concept of linguistic integrity is discussed and used as a framework to propose an algorithm for rule extraction (AFRELI). The algorithm is complemented with the use of the FuZion algorithm, created to merge consecutive membership functions and guarantee the distinguishability between fuzzy sets on each domain. Keywords: Fuzzy Modeling, function approximation, knowledge extraction, data mining I. Introduction Mathematical models are powerful tools to represent natural phenomena in a systematic way. They open the possibility of studying the behavior...
A Database Interface for File Update
- Serge Abiteboul; Sophie Cluet; Tova Milo
In this paper, we consider how structured data stored in files can be updated using database update languages. The interest of using database languages to manipulate files is twofold. First, it opens database systems to external data. This concerns data residing in files or data transiting on communication channels and possibly coming from other databases. Secondly, it provides high level query/update facilities to systems that usually rely on very primitive linguistic support. (See  for recent works in this direction). Similar motivations appear in [4, 5, 7, 8, 11, 12, 13, 14, 15, 17, 19, 20, 21]. In a previous...
Semantic-Oriented Chart Parsing with Defaults
- Thomas Stürmer
We present a computational model of incremental, interactive text analysis. The model is based on an active chart and supports interleaved syntactic and semantic processing. It can handle intra- and intermodular constraints without forcing the use of the same formalism for the description of syntactic and semantic knowledge. Defaults are an essential part of the model: they guide an analysis algorithm based on our approach to compute the most plausible solution. We will argue that the resulting computation can be understood as semantic-oriented parsing. We will also show how our model can be abstracted into an NL understanding system architecture,...