PORSCHE: Performance Oriented SCHEma mediation. Inf. Syst
- Khalid Saleem; Zohra Bellahsene; Ela Hunt
Semantic matching of schemas in heterogeneous data sharing systems is time consuming and error prone. Existing mapping tools employ semi-automatic techniques for mapping two schemas at a time. In a large-scale scenario, where data sharing involves a large number of data sources, such techniques are not suitable. We present a new robust automatic method which discovers semantic schema matches in a large set of XML schemas, incrementally creates an integrated schema encompassing all schema trees, and defines mappings from the contributing schemas to the integrated schema. Our method, PORSCHE (Performance ORiented SCHEma mediation), utilises a holistic approach which first clusters...
1. From observation to interpretation
Neurolinguistic research has been engaged in evaluating models of language using measures from brain structure and function, and/or in investigating brain structure and function with respect to language representation using proposed models of language. While the aphasiological strategy, which classifies aphasias based on performance modality and a few linguistic variables, has been the most stable, cognitive neurolinguistics has had less success in reliably associating more elaborately proposed levels and units of language models with brain structure. Functional imaging emerged at this stage of neurolinguistic research. In this review article, it is proposed that the often-inconsistent superfluity of outcomes arising from...
Unsupervised Morpheme Discovery with Allomorfessor
- Sami Virpioja; Oskar Kohonen
We describe Allomorfessor, which extends the unsupervised morpheme segmentation method Morfessor to account for the linguistic phenomenon of allomorphy, where one morpheme has several different surface forms. The method discovers common base forms for allomorphs from an unannotated corpus by finding small modifications, called mutations, for them. Using Maximum a Posteriori estimation, the model is able to decide the amount and types of the mutations needed for the particular language. The method is evaluated in Morpho Challenge 2009.
Syntactic Re-Alignment Models for Machine Translation
- Jonathan May; Kevin Knight
We present a method for improving word alignment for statistical syntax-based machine translation that employs a syntactically informed alignment model closer to the translation model than commonly-used word alignment models. This leads to extraction of more useful linguistic patterns and improved BLEU scores on translation experiments in Chinese and Arabic. 1 Methods of statistical MT Roughly speaking, there are two paths commonly taken in statistical machine translation (Figure 1). The idealistic path uses an unsupervised learning algorithm such as EM (Dempster et al., 1977)
Converting a Bilingual Dictionary into a Bilingual Knowledge Bank based on the Synchronous SSTC
- Tang Enya Kong; Mosleh H. Al-adhaileh
In this paper, we would like to present an approach to construct a huge Bilingual Knowledge Bank (BKB) from an English Malay bilingual dictionary based on the idea of synchronous Structured String-Tree Correspondence (SSTC). The SSTC is a general structure that can associate an arbitrary tree structure to string in a language as desired by the annotator to be the interpretation structure of the string, and more importantly is the facility to specify the correspondence between the string and the associated tree which can be non-projective. With this structure, we are able to match linguistic units at different inter levels...
A linguistic and navigational knowledge approach to text navigation
- Javier Couto; Facultad De Ingeniería Udelar; Jean-luc Minel
We present an approach to text navigation conceived as a cognitive process exploiting linguistic information present in texts. We claim that the navigational knowledge involved in this process can be modeled in a declarative way with the Sextant language. Since Sextant refers exhaustively to specific linguistic phenomena, we have defined a customized text representation. These different components are implemented in the text navigation system NaviTexte. Two applications of NaviTexte are described.
The Linguistic Foundation of Input Method Editors The Case of R
- Ming Hsu; Yun Hsu; A Nokian Parable
Imagine that in an alternative universe, there exist the Nokian people. Though mute, the Nokians are nevertheless a culturally and scientifically advanced people. Currently, the Nokians are in the midst of a great technological revolution, powered by the invention of
Language and Identity in Iceland National Languages and Language Policies From Linguistic Patriotism to Cultural Nationalism: Language and Identity in
- Guðmundur Hálfdanarson
fessor of history at the University of Iceland, specializing in European social and intellectual history, with special emphasis on the history and theory of nationalism. Among his latest publications are Íslenska þjóðríkið – upphaf og endimörk (The Icelandic Nation State – Origins and Limits, 2001) and with H. Jensen and L. Berntson, Europa 1800–2000 (2003).
CHINESE PERSON NAME IDENTIFICATION BASED ON RULES AND STATISTICS
- Wenjie Cao; Chengqing Zong; Juha Iso-sipilä; Bo Xu
This paper describes our strategies for automatic identification of Chinese person names in text. In our approach, we use bound words, bound rules and linguistic information, including parts of speech, dependency between words, etc., to represent the external context features of names. Bound rules are trained by real corpus. Based on one million Chinese person names, we have developed a probability model to represent the internal features of Chinese names. In the identification process, firstly, a potential Chinese person name is extracted by using the rules and characters that can be used as surnames. Secondly, the weight of the potential...
- Lutz Edzard
The last seven chapters (chs. 565–71) of Sībawayhi’s Kitāb contain many phonetic and phonological observations that can be conveniently recast in terms of theories of linguistic preference and natural generative phonology (Hooper 1976), notably in terms of the approach of Vennemann (1983, 1988). Optimality Theory (Prince and Smolensky 1993) offers a formal means to capture the “constraint ranking” that is implicit in Sībawayhi’s rejection of disallowed forms and evaluation of parallelly occurring and competing forms (“candidates”). The relevant phenomena under investigation in this paper are mainly assimilatory processes but also re-syllabification and haplological syllable ellipsis.
NL domain explanations in knowledge based MAT
- Galia Angelova; Kalina Bontcheva
This paper discusses an innovative approach to knowledge based Machine Aided Translation (MAT) where the translator is supported by a user-friendly environment providing linguistic and domain knowledge explanations. Our project aims at integration of a Knowledge Base (KB) in a MAT system and studies the integration principles as well as the internal interface between language and knowledge. The paper presents some related work, reports the solutions applied in our project and tries to generalize our evaluation of the selected MAT approach. 1. Introduction The notion of MAT comprises approaches where, in contrast to MT, the human user keeps the...
Japanese politeness in the work of Fujio Minami (南不二男)
- Barbara Pizziconi
This paper originates in a re-examination of the Japanese literature on Linguistic Politeness, at a time when an exhaustive and final answer to the question of what Politeness really is seems as elusive as it has ever been. Japanese works on Japanese linguistics remain virtually unknown to the non-
The interaction between spontaneous imitation and linguistic knowledge
- Kuniko Y. Nielsen
The spontaneous imitation paradigm (Goldinger, 1998), in which subjects' speech is compared before and after they are exposed to target speech, has shown that subjects shift their production in the direction of the target, indicating the use of episodic traces in speech perception as well as the close tie between speech perception and production. By using this paradigm, the current study aims to investigate the psychological reality of three levels of linguistic unit (i.e., word, phoneme, and sub-phonemic unit such as feature/gesture) through physical measurements instead of perceptual assessments. An experiment was carried out to test: 1) whether spontaneous...
“Talking like us”: Migration, adolescence and new dialect formation in Q’eqchi’ Maya WORK IN PROGRESS: DO NOT CITE WITHOUT PERMISSION FROM THE AUTHOR
- Sergio Romero Ph. D
1. Dialect and the sociolinguistic order of indexicality in Mayan languages Dialects aren’t just regional, mutually intelligible varieties of a given language (Trudgill 1986). In Maya societies, dialectal differences have considerable social connotations that native speakers are sharply aware of. Of course, the modalities of metapragmatic evaluation differ from community to community. They are deeply embedded in the synchronic economy of linguistic interaction and the social history of particular areas (Silverstein 2003). We must consider four aspects of dialects as social registers in Mayan languages to understand their cultural import and relation with language change: First, dialectal differences are ethnolinguistic...
Effects of audience on orthographic variation
- Josh Iorio
Research has demonstrated that speakers make linguistic choices in order to narrow or widen the social distance between a speaker and his or her audience. These choices are often based on a speaker’s awareness of an audience’s demographic profile, which is composed of characteristics such as age, gender, and ethnicity. The present study investigates the role that awareness of audience plays in influencing orthographic choices in a “demographically lean” community, i.e. an online community where demographic information about the audience is largely absent or intentionally obscured. Results indicate that awareness of audience remains a significant explanatory factor for style-shifting, even...
Model parameters Syntax Text Waveform synthesis
- Söllerhaus Kleinwalsertal; Martin Haase; Surface Realiser Prosody Assignment
produce spoken output from some underlying, conceptual representation CTS system for information retrieval, or as part in a dialogue system common architecture, reflecting important role of prosody: – NLG module creates prosodically marked-up text from input concept – synthesis module produces actual speech waveforms from NLG output linguistic information is preserved for prosody assignment Instance-based Concept-to-Speech Synthesis:
Conjunction meets negation: A study in crosslinguistic variation
- Anna Szabolcsi; Bill Haddican
Abstract. The central topic of this inquiry is a cross-linguistic contrast in the interaction of conjunction and negation. In Hungarian (Russian, Serbian, Italian, Japanese), in contrast to English (German), negated definite conjunctions are naturally and exclusively interpreted as ‘neither’. It is proposed that Hungarian conjunctions simply replicate the behavior of plurals, their closest semantic relatives. More puzzling is why English-type languages present a different range of interpretations. By teasing out finer distinctions in intonation and context the paper tracks down missing readings and argues that it is eventually not necessary to postulate a radical cross-linguistic semantic difference. In the course...
Computers and the Humanities xx: nnn-nnn, yyyy
- Dirk Speelman; Stefan Grondelaers; Dirk Geeraerts
© yyyy. Kluwer Academic Publishers. Printed in the Netherlands. Profile-based linguistic uniformity as a generic method for comparing language varieties
Ageing and Speech Prosody
- Brigitte Zellner Keller
Ageing is part of the normal evolution of human beings. Demographic projections to 2030 indicate that more than 60 countries will have at least 2 million people age 65 or older. Yet knowledge about speech in the elderly is still dispersed and incomplete, in particular in the area of normal ageing. Prosody within a linguistic community is triggered by a number of parameters which are investigated (see this conference). Yet, little is currently known about the longitudinal evolution of this speech component. This paper is a first state of the art about speech prosody and ageing, with the hope that...
Shuffle from Sequential to Parallel in Production Planning
- Adalbert Golomety; Alina Pitic; Iulia Golomety; Antoniu Pitic
Abstract:- This paper presents an implementation of the shuffle operation in production planning. We present a computational formula for shuffle and some optimizations to reduce the sets of shuffle strings. Our idea is to combine shuffle with parallelism for the planning of production phases. Key-Words:- shuffle, production phases, production planning, linguistic model, execution time. 1 Linguistic Model of Production Process By production process we understand the transformation of resources (material, energy) into final products according to a fabrication recipe. The model that we will show has the purpose to determine the set of actions strings that represents right evolutions of...