Showing resources 181 - 200 of 76,584

  1. Rapid Development of Spoken Language Understanding Grammars

    Ye-yi Wang; Alex Acero
    To facilitate the development of spoken dialog systems and speech enabled applications, we introduce SGStudio (Semantic Grammar Studio), a grammar authoring tool that enables regular software developers with little speech/linguistic background to rapidly create quality semantic grammars for automatic speech recognition (ASR) and spoken language understanding (SLU). We focus on the underlying technology of SGStudio, including knowledge assisted example-based grammar learning, grammar controls and configurable grammar structures. While the focus of SGStudio is to increase productivity, experimental results show that it also improves the quality of the grammars being developed. Key words: Automatic grammar generation, context free grammars (CFGs), example-based...

  2. Dependency treelet translation: Syntactically informed phrasal SMT

    Chris Quirk; Arul Menezes; Colin Cherry
    We describe a novel approach to statistical machine translation that combines syntactic information in the source language with recent advances in phrasal translation. This method requires a source-language dependency parser, target-language word segmentation, and an unsupervised word alignment component. We align a parallel corpus, project the source dependency parse onto the target sentence, extract dependency treelet translation pairs, and train a tree-based ordering model. We describe an efficient decoder and show that using these tree-based models in combination with conventional SMT models provides a promising approach that incorporates the power of phrasal SMT with the linguistic generality available in...

  3. VideoQA: Question answering on news video

    Hui Yang; Lekha Chaisorn; Yunlong Zhao; Shi-yong Neo; Tat-seng Chua
    When querying a news video archive, the users are interested in retrieving precise answers in the form of a summary that best answers the query. However, current video retrieval systems, including the search engines on the web, are designed to retrieve documents instead of precise answers. This research explores the use of question answering (QA) techniques to support personalized news video retrieval. Users interact with our system, VideoQA, using short natural language questions with implicit constraints on contents, context, duration, and genre of expected videos. VideoQA returns short precise news video summaries as answers. The main contributions of this research...

  4. Seamless Integration of Rule-Based Knowledge and Object-Oriented Functionality with Linguistic Symbiosis


  5. Linguistic Understanding with Speech Acts

    Donald G. Thaxton; Dr. Albert Esterline (Advisor)
    We describe a prototype scheduler implemented in the logic programming language LIFE that tracks the effects of speech acts. A speech act consists of an illocutionary force (indicated by a performative verb) and a propositional content. The effects of speech acts of interest here are obligations, permissions, prohibitions, and assertions. We have developed a context-free grammar using LIFE definite clause grammar rules. To structure the values of attributes, we use feature structures. These record-like structures represent limited information. Feature structures are also used to record the effects of speech acts.

  6. Prosody modeling with soft templates

    Greg Kochanski; Chilin Shih
    This paper describes a novel prosody generation model. We intend it to broadly support many linguistic theories and multiple languages, for the model imposes no restriction on accent categories and shapes. This capability is crucial to the next generation of text-to-speech systems that will need to synthesize intonation variations for different speech acts, emotions, and styles of speech. The system supports mark-up tags that are mathematically defined and generate f0 deterministically. Underlying the tags is an articulatory model of accent interaction which balances physiological and communication constraints. We specify the model by way of an algorithm for calculating the pitch,...

  7. An Architecture for Autonomous Agents Exploiting Conceptual Representations

    A. Chella; M. Frixione; S. Gaglio
    An architecture for autonomous agents is proposed, that integrates the functional and the behavioral approaches to robotics. The integration is based on the introduction of a conceptual level, linking together a subconceptual, behavioral, level, and a linguistic level, encompassing symbolic representation and data processing. The proposed architecture is described with reference to an experimental setup, in which the robot task is that of building a significant description of its working environment.

  8. How do we tell an association from a rule? Comment on Sloman

    Gerd Gigerenzer; Terry Regier
    S. A. Sloman’s (1996) intriguing argument for separate associative and rule-based reasoning systems is unfortunately damaged by a certain amount of slack in the distinction he makes between these two posited mental mechanisms. The authors suggest that the distinction could be sharpened by overt reference to explicit models of associative and rule-based processing. They also point out that “simultaneous contradictory belief,” which Sloman takes as evidence for separate associative and rule-based systems, need not be interpreted in this fashion. It may also signal a number of other things, including the presence of linguistic ambiguity (as in the Linda problem),...

  9. What is generic programming

    Gabriel Dos Reis
    The last two decades have seen an ever-growing interest in generic programming. As for most programming paradigms, there are several definitions of generic programming in use. In the simplest view generic programming is equated to a set of language mechanisms for implementing type-safe polymorphic containers, such as List<T> in Java. The notion of generic programming that motivated the design of the Standard Template Library (STL) advocates a broader definition: a programming paradigm for designing and developing reusable and efficient collections of algorithms. The functional programming community uses the term as a synonym for polytypic and type-indexed programming, which involves designing...

  10. Generalisations over Corpus-induced Frame Assignment Rules

    Anette Frank
    In this paper we discuss motivations and strategies for generalising over instance-based frame assignment rules that we extract from frame-annotated corpora. Corpus-induced syntax-semantics mapping rules for frame assignment can be used for automatic semantic role labelling of unparsed text, but further, to extract linguistic knowledge for a lexical semantic resource with a general syntax-semantics interface. We provide a data analysis of a comprehensive rule set of corpus-induced frame assignment rules, and discuss the potential of applying different types of generalisations and filters, to obtain a uniform extended data set for the extraction of linguistic knowledge.

  11. A formal framework for linguistic tree query

    Catherine Lai
    The analysis of human communication, in all its forms, increasingly depends on large collections of texts and transcribed recordings. These collections, or corpora, are often richly annotated with structural information. These data sets are extremely large, so manual analysis is only successful up to a point. As such, significant effort has recently been invested in automatic techniques for extracting and analyzing these massive data sets. However, further progress on analytical tools is confronted by three major challenges. First, we need the right data model. Second, we need to understand the theoretical foundations of query languages on that data model. Finally,...

  12. Extracting Relevant Named Entities for Automated Expense Reimbursement

    Guangyu Zhu
    Expense reimbursement is a time-consuming and labor-intensive process across organizations. In this paper, we present a prototype expense reimbursement system that dramatically reduces the elapsed time and costs involved, by eliminating paper from the process life cycle. Our complete solution involves (1) an electronic submission infrastructure that provides multi-channel image capture, secure transport and centralized storage of paper documents; (2) an unconstrained data mining approach to extracting relevant named entities from unstructured document images; (3) automation of auditing procedures that enables automatic expense validation with minimum human interaction. Extracting relevant named entities robustly from document images with unconstrained layouts and...

  13. Kocku von Stuckrad


    The structures of social, religious, and scientific discourse are much more dependent on meaning than on the elaboration and exploration of some objective, transcendent “truths.” This has been convincingly shown in the philosophical discussions of the last five decades. My aim on the following pages is to apply this pragmatic argument to the question of femininity. My analysis will develop in three steps: First, I shall depict the results of pragmatic philosophy’s research into epistemology and, more importantly, historiography. Second, I shall change the perspective and examine the role of the feminine within the scope of modern psychology. In order to...

  14. Software as Property: The Theoretical Paradox

    Eben Moglen
    SOFTWARE: no other word so thoroughly connotes the practical and social effects of the digital revolution. Originally, the term was purely technical, and denoted the parts of a computer system that, unlike “hardware,” which was unchangeably manufactured in system electronics, could be altered freely. The first software amounted to the plug configuration of cables or switches on the outside panels of an electronic device, but as soon as linguistic means of altering computer behavior had been developed, “software” mostly denoted the expressions in more or less human-readable language that both described and controlled machine behavior.

  15. An architecture for parallel corpus-based grammar learning

    Jonas Kuhn
    This paper describes an architecture for exploiting implicit information about the grammar of the languages included in a parallel corpus. By initially applying statistical word alignment and defining an appropriate representation format for cross-linguistic structural correspondence, this implicit information can feed a system for bootstrapping grammars. The proposed architecture will underlie the new PTOLEMAIOS project.

  16. Linguistic Support for Distributed Programming Abstractions

    Christian Heide Damm; Patrick Thomas Eugster; Rachid Guerraoui
    What abstractions are useful for distributed programming? This question has constituted an active area of research in the last decades and several candidate abstractions have been proposed, including remote method invocations, tuple spaces and publish/subscribe. How should such abstractions be offered to the programmer? Should they sit beside centralized programming abstractions in the core of a language? Should they rather sit within external libraries? Should they benefit from specific compiler support? These questions are also important but have sparked less enthusiasm. This paper contributes to addressing these questions in the context of Java and the type-based publish/subscribe (TPS) abstraction, an...

  17. Toward a definition and linguistic support for partial quiescence (submitted for publication)

    Billy Yan-kit Man; Hiu Ning (Angela) Chan; Andrew J. Gallagher; Aaron W. Keen; Ronald A. Olsson
    The global quiescence of a distributed computation (or distributed termination detection) is an important problem. Some concurrent programming languages and systems provide global quiescence detection as a built-in feature so that programmers do not need to write special synchronization code to detect quiescence. This paper introduces partial quiescence (PQ), which generalizes quiescence detection to a specified part of a distributed computation. Partial quiescence is useful, for example, when two independent concurrent computations that both rely on global quiescence need to be combined into a single program. The paper describes how we have designed and implemented a PQ mechanism within...

  18. Using n-Grams for Korean Text Retrieval


    There is a difficulty in applying conventional word-based indexing to Korean. The indexable segment of a word, i.e., the stem, is often a compound noun, which results in a serious decrease in retrieval effectiveness. Morpheme-based indexing, which decomposes a compound noun into simple nouns, has been developed to overcome the problem of compound nouns. It, however, requires a large dictionary and complex linguistic knowledge. In this paper we propose a new indexing method that combines word-based indexing and n-gram indexing. The proposed method alleviates the problem of compound nouns without dictionaries...

  19. Automatic Acquisition of Translation Knowledge Using Structural Matching Between Parse Trees


    In this paper we present a rule-based formalism for the representation, acquisition, and application of translation knowledge. The formalism is being used successfully in a Japanese-English machine translation system. The translation knowledge is learnt automatically from a parallel corpus using structural matching between the parse trees of translation examples. We have developed a comfortable user interface, which makes it possible to invoke the translation functionality directly from MS Word. The user can customize the translation knowledge by simply correcting translation results in MS Word. Our system is mainly intended for language students; therefore, we also offer the display...

  20. A Review of Common Problems in Linguistic Resources and a New Way to Represent Ontological Relationships

    Antonio Vaquero; Fernando Sáenz; Francisco Alvarez
    Existing lexical resources have taxonomic structure related problems that negatively impact the results of domain-specific applications. This is the result of an approach that focuses on implementation and content issues rather than on questions of design, semantic cleanness and application usefulness. Although taxonomy structuring methodologies have been developed to correct some of these problems, they remain too general and far from the problem-solving and domain-specific approach that ontology-based linguistic resources need in order to be problem-solving. Based on a short analysis of some common problems in lexical resources and of the available taxonomy structuring methodologies, we propose...
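
Entry 18 above describes combining word-based indexing with n-gram indexing so that compound nouns can be matched without a dictionary. The following is only a minimal sketch of that general idea, not the method of the paper: the function names `char_ngrams` and `index_terms`, the choice of bigrams (`n = 2`), and the rule of keeping the whole word alongside its n-grams are all assumptions made for illustration.

```python
def char_ngrams(word: str, n: int = 2) -> list[str]:
    """Return the overlapping character n-grams of a word.

    Words no longer than n are returned unchanged as a single term.
    """
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def index_terms(word: str, n: int = 2) -> list[str]:
    """Hypothetical combined scheme: the whole word plus its n-grams.

    The whole word stands in for word-based indexing; the n-grams let a
    query for one part of a compound noun match the compound.
    Duplicates are dropped while the original order is preserved.
    """
    terms = [word]
    for gram in char_ngrams(word, n):
        if gram not in terms:
            terms.append(gram)
    return terms
```

Under these assumptions, `index_terms("정보검색")` yields the whole compound plus the bigrams `정보`, `보검`, and `검색`, so a query for `검색` alone can still reach a document indexed under the compound.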
