Interfaces as locus of historical change - Workshop: Grammaticalization and Linguistic Theory
- Miriam Butt
In this paper, I take a look at a particular V-V complex predicate which occurs in both Urdu/Hindi and Bengali
Two Automatic Approaches For Analyzing Connected Speech Processes In Dutch
- Mirjam Wester; Judith M. Kessens; Helmer Strik
This paper describes two automatic approaches used to study connected speech processes (CSPs) in Dutch. The first approach was from a linguistic point of view - the top-down method. This method can be used for verification of hypotheses about CSPs. The second approach - the bottom-up method - uses a constrained phone recognizer to generate phone transcriptions. An alignment was carried out between the two transcriptions and a reference transcription. A comparison between the two methods showed that 68% agreement was achieved on the CSPs. Although phone accuracy is only 63%, the bottom-up approach is useful for studying CSPs. From...
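The phone accuracy figure quoted above can be illustrated with a minimal sketch. It assumes the usual definition, accuracy = (N - S - D - I) / N over a minimum-edit alignment of a recognized phone string against a reference; the abstract does not state which definition the authors used. The function names and the sample phone strings are hypothetical.

```python
def edit_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) for a minimum-edit alignment."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # Each cell holds (total_edits, substitutions, deletions, insertions).
    table = [[None] * cols for _ in range(rows)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)  # only deletions
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, j)  # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            t, s, d, n = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, d, n)          # match, no cost
            else:
                diag = (t + 1, s + 1, d, n)  # substitution
            t, s, d, n = table[i - 1][j]
            delete = (t + 1, s, d + 1, n)
            t, s, d, n = table[i][j - 1]
            insert = (t + 1, s, d, n + 1)
            # Tuples compare on total edits first, so min() picks a cheapest path.
            table[i][j] = min(diag, delete, insert)
    _, subs, dels, ins = table[-1][-1]
    return subs, dels, ins


def phone_accuracy(ref, hyp):
    """Phone accuracy under the assumed definition (N - S - D - I) / N."""
    subs, dels, ins = edit_counts(ref, hyp)
    return (len(ref) - subs - dels - ins) / len(ref)


# Hypothetical example: the /t/ of the reference is deleted in the recognized string.
reference = ["d", "A", "t", "I", "s"]
recognized = ["d", "A", "I", "s"]
print(edit_counts(reference, recognized))
print(phone_accuracy(reference, recognized))
```

The deletion count returned by the alignment is also the kind of evidence a bottom-up study of CSPs would inspect, since a deleted phone in the recognized string marks a candidate deletion process.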
Processing Unknown Words in HPSG
- Petra Barg; Markus Walther
The lexical acquisition system presented in this paper incrementally updates linguistic properties of unknown words inferred from their surrounding context by parsing sentences with an HPSG grammar for German. We employ a gradual, information-based concept of "unknownness" providing a uniform treatment for the range of completely known to maximally unknown lexical entries. "Unknown" information is viewed as revisable information, which is either generalizable or specializable. Updating takes place after parsing, which only requires a modified lexical lookup. Revisable pieces of information are identified by grammar-specified declarations which provide access paths into the parse feature structure. The updating mechanism revises the...
A Metagrammatical Logical Formalism
- Vincenzo Manca
This paper presents a logical formalism for fundamental concepts of traditional grammar. In its basic sense a grammar is simply a system of (informal) rules which try to discriminate correct linguistic expressions from ones that are incorrect, according to an ideal speaker of the language considered. The term traditional grammar refers to the concepts, methods, and terminology, elaborated over the centuries by grammarians, philosophers and linguists, that constitute (more or less consciously) the average linguistic culture: proposition, predicate, substantive, subject, attribute, complement, modifier, determiner, coordination, subordination, anaphora, ellipsis, deixis, and sentence. These concepts are generally used on the basis of...
Automatic Construction of Frame Representations for Spontaneous Speech in Unrestricted Domains
- Klaus Zechner
This paper presents a system which automatically generates shallow semantic frame structures for conversational speech in unrestricted domains. We argue that such shallow semantic representations can indeed be generated with a minimum amount of linguistic knowledge engineering and without having to explicitly construct a semantic knowledge base. The system is designed to be robust to deal with the problems of speech dysfluencies, ungrammaticalities, and imperfect speech recognition. Initial results on speech transcripts are promising in that correct mappings could be identified in 21% of the clauses of a test set (resp. 44% of this test set where ungrammatical or verb-less...
A System for Facilitating and Enhancing Web Search
- Steffen Staab; Christian Braun; Ilvio Bruder; Antje Düsterhöft; Andreas Heuer; Meike Klettke; Günter Neumann; Bernd Prager; Jan Pretzel; Hans-Peter Schnurr; Rudi Studer; Hans Uszkoreit; Burkhard Wrenger
We present a system that uses semantic methods and natural language processing capabilities in order to provide comprehensive and easy-to-use access to tourist information in the WWW. Thereby, the system is designed such that as background knowledge and linguistic coverage increase, the benefits of the system improve, while it guarantees state-of-the-art information and database retrieval capabilities as its bottom line.
Principles for Organizing Semantic Relations in Large Knowledge Bases
- Larry M. Stephens; Yufeng F. Chen
This paper defines principles for organizing semantic relations represented by slots in frame-structured knowledge bases. We consider not only the ways that slots are used in reasoning about a given domain but also the features of the representation language of the knowledge-based system in which the slots reside. We find that the organization of slots may be based on the knowledge-level semantics of relations and the symbol-level function of slots that implement the representation language. However, the organization of slots is more understandable if these two fundamental distinctions are explicitly separated. The symbol-level organization of slots depends on the inferencing...
Typology and Logical Structure of Natural Languages
- Vincenzo Manca
This paper focuses on some relationships between logic and natural languages, a topic that is crucial in Western philosophy (from Aristotle, to medieval Modistae, to Leibniz and the founders of modern mathematical logic). Specifically, the search for a universal grammar is considered in connection with linguistic typology (see Greenberg's Universals of Grammar) and formalisms for knowledge representation (see Kamp, Moore). A preliminary investigation proves that usual linguistic concepts could be incorporated into a formal theory of general explicative power. This suggests a logical theory of grammars, where classical linguistic analysis is generalized and formalized. The conceptual and terminological apparatus...
A Word-Level Morphosyntactic Analyzer for Basque
- I. Aduriz; E. Agirre; I. Aldezabal; X. Arregi; J. M. Arriola; X. Artola; K. Gojenola; A. Maritxalar; K. Sarasola; M. Urkia
This work presents the development and implementation of a full morphological analyzer for Basque, an agglutinative language. Several problems (phrase structure inside word-forms, noun ellipsis, multiplicity of values for the same feature and the use of complex linguistic representations) have forced us to go beyond the morphological segmentation of words, and to include an extra module that performs a full morphosyntactic parsing of each word-form. A unification-based word-level grammar has been defined for that purpose. The system has been integrated into a general environment for the automatic processing of corpora, using TEI-conformant SGML feature structures. 1. Introduction Morphological analysis of...
Modeling Dependency Grammar With Restricted Constraints
- Ingo Schröder; Wolfgang Menzel; Kilian Foth; Michael Schulz
In this paper, parsing with dependency grammar is modeled as a constraint satisfaction problem. A restricted kind of constraints is proposed, which is simple enough to be implemented efficiently, but which is also rich enough to express a wide variety of grammatical well-formedness conditions. We give a number of examples to demonstrate how different kinds of linguistic knowledge can be encoded in this formalism.
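The idea of treating dependency parsing as constraint satisfaction can be sketched as follows. This is a toy illustration only: it enumerates head assignments by brute force and is not the authors' restricted constraint formalism or an efficient solver. The tag names, the four constraints, and the function names are all hypothetical.

```python
from itertools import product


def parse(tags, constraints):
    """Return every head assignment (0 = root) that satisfies all constraints."""
    n = len(tags)
    # Word i (1-based) may take the root or any other word as its head.
    domains = [[h for h in range(n + 1) if h != i] for i in range(1, n + 1)]
    return [heads for heads in product(*domains)
            if all(check(tags, heads) for check in constraints)]


def one_root(tags, heads):
    """Exactly one word attaches to the root."""
    return heads.count(0) == 1


def det_under_right_noun(tags, heads):
    """Every determiner depends on a noun to its right."""
    return all(h > i and tags[h - 1] == "NOUN"
               for i, h in enumerate(heads, start=1) if tags[i - 1] == "DET")


def noun_under_verb(tags, heads):
    """Every noun depends on a verb."""
    return all(h != 0 and tags[h - 1] == "VERB"
               for i, h in enumerate(heads, start=1) if tags[i - 1] == "NOUN")


def verb_is_root(tags, heads):
    """Every verb attaches to the root."""
    return all(h == 0
               for i, h in enumerate(heads, start=1) if tags[i - 1] == "VERB")


# Hypothetical tagging of "the dog barks".
tags = ["DET", "NOUN", "VERB"]
rules = [one_root, det_under_right_noun, noun_under_verb, verb_is_root]
print(parse(tags, rules))
```

Each well-formedness condition is a separate predicate, so grammatical knowledge is added or removed by editing the list of constraints rather than the search procedure.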
Expertise in Object and Face Recognition
- James Tanaka; Isabel Gauthier
categorized for the community's nonlinguistic purposes or, to use his term, for the level of usual utility. As Brown points out, the level of usual utility changes according to the demands of the linguistic community and this is especially true for expert populations. So, for example, while it is quite acceptable for most of us to refer to the object outside our office window as a "bird," if we were among a group of bird watchers, it would be important to specify whether the object was a "white-throated" or "white-crown sparrow." Generally, experts prefer to identify objects in their domain...
Domain-Specific Knowledge Acquisition For Conceptual Sentence Analysis
- Claire Cardie
The availability of on-line corpora is rapidly changing the field of natural language processing (NLP) from one dominated by theoretical models of often very specific linguistic phenomena to one guided by computational models that simultaneously account for a wide variety of phenomena that occur in real-world text. Thus far, among the best-performing and most robust systems for reading and summarizing large amounts of real-world text are knowledge-based natural language systems. These systems rely heavily on domain-specific, handcrafted knowledge to handle the myriad syntactic, semantic, and pragmatic ambiguities that pervade virtually all aspects of sentence analysis. Not surprisingly, however, generating this...
Language, Beliefs and Concepts
- Jens Allwood
concepts and beliefs important and useful in the culture connected with that language. Language will in a sense act as a storage for those concepts and views which have been important in the historical development of the particular culture. This is sometimes expressed in a slightly exaggerated form by saying that a language codifies the world view of a certain culture. However, beliefs are also stored linguistically in a more direct manner, e.g. in writing, in which form they are available to a historian or an anthropologist who is interested in patterns of thought from earlier periods or other cultures....
Computational Modelling And Generation Of Prosodic Structure In Swedish
- Merle Horne; Marcus Filipsson
This paper summarizes the motivation for the various levels of structure assumed in a prosodic hierarchy for Swedish and presents the linguistic and discourse parameters needed for their recognition in texts.
- Walter Daelemans; Gert Durieux
Machine Learning techniques are useful tools for the automatic extension of existing lexical databases. In this paper, we review some symbolic machine learning methods which can be used to add new lexical material to the lexicon by automatically inducing the regularities implicit in lexical representations already present. We introduce the general methodology for the construction of inductive lexica, and discuss empirical results on extending lexica with two types of information: pronunciation and gender. 1. Introduction Computational lexicology and lexicography (the study of the structure, organization, and contents of computational lexica) have become central disciplines both in language engineering and...
Translation by Structural Correspondences
- Ronald M. Kaplan; Klaus Netter; Jürgen Wedekind; Annie Zaenen
We sketch and illustrate an approach to machine translation that exploits the potential of simultaneous correspondences between separate levels of representation, as formalized in the LFG notation of codescriptions. The approach is illustrated with examples from English, German and French where the source and the target language sentences show noteworthy differences in linguistic analyses.
Combining Error-Driven Pruning and Classification for Partial Parsing
- Claire Cardie; Scott Mardis; David Pierce
We present a new approach to partial parsing of natural language texts that relies on machine learning methods. The approach combines corpus-based grammar induction with a very simple pattern-matching algorithm and an optional constituent verification step. The grammar induction algorithm acquires a set of rules for each level of linguistic analysis using a new technique for error-driven pruning of treebank grammars. The constituent verification step employs standard inductive learning techniques as an additional precision-enhancing device. We evaluate the approach on four partial parsing data sets and find that performance is very good (over 93% precision and recall) for applications that...
Measuring Verb Similarity
- Philip Resnik; Mona Diab
The way we model semantic similarity is closely tied to our understanding of linguistic representations. We present several models of semantic similarity, based on differing representational assumptions, and investigate their properties via comparison with human ratings of verb similarity. The results offer insight into the bases for human similarity judgments and provide a testbed for further investigation of the interactions among syntactic properties, semantic structure, and semantic content. Introduction The way we model semantic similarity is closely tied to our understanding of how linguistic representations are acquired and used. Some models of similarity, such as Tversky's (1977), assume an explicit...
Levels of categorization in visual recognition studied with functional MRI
- Isabel Gauthier; Adam W. Anderson; Michael J. Tarr; Pawel Skudlarski; John C. Gore
to conceptual categories, and our results also establish the importance of manipulating task requirements when evaluating a `neural module' hypothesis. Addresses: *Psychology, Yale University, PO Box 208205, New Haven, Connecticut 06520-8205, USA. + Diagnostic Radiology, Yale University School of Medicine, PO Box 208042, New Haven, Connecticut 06520-8042, USA. # Cognitive and Linguistic Sciences, Brown University, Box 1978, Providence, RI 02912, USA. Correspondence: Isabel Gauthier. Current Biology 1997, 7: 645-651. Background The neural processes that underlie recognition of a face, rather than another object, could be special in at least two ways: they may require unique perceptual processing and/or...
Extending Unification Formalisms
- G. Erbach; M. van der Kraan; S. Manandhar; H. Ruessink; W. Skut; C. Thiersch
This paper describes some of the results of the project The Reusability of Grammatical Resources. The aim of the project is to extend current grammar formalisms with notational devices and constraint solvers in order to aid the development of reusable grammars. The project took the Advanced Linguistic Engineering Platform (ALEP) as its starting point. ALEP allows two levels of extension: additional syntactic expressions ("syntactic sugar") and specialised external constraint solvers. Our syntactic additions to ALEP comprise support for LFG coherence and completeness and an extended notation for phrase-structure rules. The project has developed solvers for set constraints, set operations, linear...