www.ijllalw.org SYNONYMY ENRICHMENT IN LANGUAGE EDITING
- Eli Syarifah Aeni; Dewi Ratnasari
A simple definition of synonymy is that two words have the same meaning. The similarity is never perfect, however, because differences of meaning remain that depend on the context of the sentence. Each word has different shades of meaning. What matters most is that synonymous words are placed properly in the context of the sentence and can be understood by the general public. Accepting new vocabulary or language is not easy, especially for ordinary people. This is where the role of the editor becomes important: to make the language more varied, but...
Automatic Normalization of Punjabi Words
- Vishal Gupta
Abstract: For any language, automatic normalization of words is a basic linguistic resource required to develop high-accuracy Natural Language Processing (NLP) applications such as machine translation, document classification, document clustering, text question answering, topic tracking, text summarization and keyword extraction. High accuracy cannot be achieved in NLP applications for any language without automatic word normalization. This paper concentrates on automatic normalization of Punjabi words. Punjabi is the official language of the state of Punjab, but it is an under-resourced language. There are very few computational-linguistic resources...
A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization
- Lu Wang; Hema Raghavan; Vittorio Castelli; Radu Florian; Claire Cardie
We consider the problem of using sentence compression techniques to facilitate query-focused multi-document summarization. We present a sentence-compression-based framework for the task, and design a series of learning-based compression models built on parse trees. An innovative beam search decoder is proposed to efficiently find highly probable compressions. Under this framework, we show how to integrate various indicative metrics such as linguistic motivation and query relevance into the compression process by deriving a novel formulation of a compression scoring function. Our best model achieves statistically significant improvement over the state-of-the-art systems on several metrics (e.g. 8.0% and 5.4% improvements...
Abstract. The article presents the notion of Probabilistic Tree-generating Binary Grammar (PTgBG), a probabilistic extension of Tree-generating Binary Grammars (TgBG). TgBG is a formalism developed to describe languages with syntactic discontinuities, designed with two practical goals in mind: 1) efficient parsing and 2) representation of results in the form of flat, easily processable phrase-structure trees. Here, we introduce rule probabilities for TgBG, thus creating PTgBG. We compare PTgBG to standard Probabilistic Context-Free Grammar (PCFG) in terms of parse tree probabilities and sentence probabilities. We conclude that PTgBG may be effectively parsed by the methods developed for PCFG, while allowing for convenient expression of some linguistic phenomena specific to non-configurational languages. Keywords: Free word order...
- Ali Muhammad Nizamani
artificial intelligence, Supervised learning. Abstract: This paper proposes an OCR solution for Sindhi character recognition using Artificial Neural Networks (ANNs) and exposes the major alphabet differences between the Sindhi and Arabic languages from an OCR perspective. A huge body of literature is available in hard copy and needs to be converted into soft copy so that everyone can access and search Sindhi literature. The Sindhi language is very rich: it contains fifty-two characters and can absorb words from other languages into its word list. Compared with other Unicode character languages, Sindhi characters have differences in...
Supporting Organizations: Linguistic Society of Taiwan (LST) Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
On behalf of ILAS, a co-host of this conference with National Chengchi University, I would like to welcome you all to the 27th Pacific Asia Conference on Language, Information and Computation (PACLIC27). The PACLIC conference has a long history, dating back to 1982, when the first conference of this series was organized with the original name “Korea-Japan Joint Conference on Formal Linguistics”. It was the consensus of the organizer of the 1994
Machine translation: statistical approach with additional linguistic knowledge
- Maja Popović
This dissertation is available online on the website of the university library. Acknowledgments I would like to express my gratitude to all the people who supported and accompanied me during the preparation of this work. First, I would like to express my gratitude to my advisor Professor Dr.-Ing. Hermann Ney, head of the Lehrstuhl für Informatik 6 at the RWTH Aachen University. This thesis would not have been possible without his advice and patience. I am very grateful that he gave me the possibility to attend various conferences, workshops and meetings. I would also like to thank Professor Dr. Andy Way from...
UDC 81 Definition of the Concept of Polylingualism in Social Discourse and Linguistics
- Gulnara T. Smagulova
Abstract. The article deals with the definition of the concept of polylingualism as a general term in social discourse and as a linguistic term in a number of research works. The author shares the opinion that the prospective, efficient development of polylingual and polycultural education is only possible under person-centered conditions and given the appropriate polycommunicative nature of any language. The interaction between the educational system and society (sociocultural level, in relation to languages, sociocultural and sociolinguistic contexts) should also be considered.
1 Executive Summary
- Hans Kamp; Alessandro Lenci; James Pustejovsky
This report documents the program and the outcomes of Dagstuhl Seminar 13462 “Computational Models of Language Meaning in Context”. The seminar addresses one of the most significant issues to arise in contemporary formal and computational models of language and inference: that of the role and expressiveness of distributional models of semantics and statistically derived models of language and linguistic behavior. The availability of very large corpora has brought about a near revolution in computational linguistics and language modeling, including machine translation, information extraction, and question-answering. Several new models of language meaning are emerging that provide potential formal interpretations of linguistic...
Variation and Semantic Relation Interpretation: Linguistic and Processing Issues
- Nathalie Aussenac-Gilles; Anne Condamines
Abstract. Studies in linguistics define lexico-syntactic patterns to characterize the linguistic utterances that can be interpreted with semantic relations. Because patterns are assumed to reflect linguistic regularities that have a stable interpretation, several software tools implement such patterns to extract semantic relations from text. Nevertheless, a thorough analysis of pattern occurrences in various corpora showed that variation may affect their interpretation. In this paper, we report the linguistic variations that impact relation interpretation in language, and may lead to errors in relation extraction systems. We analyze several features of state-of-the-art pattern-based relation extraction tools, mostly how patterns are represented and matched...
DISCOURSE STRATEGIES MODEL: AN INITIAL PHASE FOR DISCOVERY OF THE FACT-BASED STATEMENTS FROM DESCRIPTIVE TEXT
- Bruce A. Calway; Ross Smith
Fact-based conceptual modelling approaches seek, as one goal, to express detail about a universe of discourse (UoD) as elementary declarative statements, generalised from collections of data. A further resource available to the systems analyst, as the fact-modeller, is to discover fact-based statements from descriptive natural language information systems specifications. However, such extant formalised approaches as exist, that can assist the fact-based analyst to process textual resources, are incomplete. This paper proposes a discourse strategies model, for use as an initial process for the fact-based analyst processing descriptive text. The model is synthesised from extant linguistic and fact-based modelling approaches represented...
Linguistic Divergence Patterns in English to Marathi Translation
- S. B. Kulkarni; P. D. Deshmukh; M. M. Kazi; K. V. Kale
In a machine translation system, text is translated from one language, known as the source language, into another, known as the target language. The development of a machine translation system requires identifying the patterns of divergence between the two languages. A detailed study of divergence issues in machine translation is required for their proper classification and detection. The primary objective of this paper is to understand the types of divergence problems that operate behind English to Marathi translation. In this paper, the various divergence patterns in the English-Marathi language pair are considered. This will enable us to come up with strategies to...
Performance Evaluation of Fuzzy Logic and PID Controller for Liquid Level Process
- H. Kala; D. Deepakraj; P. Gopalakrishnan; P. Vengadesan; M. Karumbal Iyyar
Abstract: The objective of this paper is to investigate and find a solution by designing PID and fuzzy controllers for a liquid level process. Measuring the level of liquids is a critical need in many industrial plants. Fuzzy control is based on fuzzy logic, a logical system that is much closer in spirit to human thinking and natural language than traditional logical systems. During the past several years, fuzzy control has emerged as one of the most active and fruitful areas for research in the applications of fuzzy set theory, especially in the realm of industrial processes, which do not lend...
A Computational Classification of Urdu Dynamic Copula Verb
- Qaiser Abbas; Ghulam Raza
In this paper, a lexical functional grammar for the automatic classification of the Urdu copula verb hO (be/become) is presented according to linguistic theories. A test suite of sentences containing almost all the different conjugation forms of the copula verb is extracted from a raw corpus. An attempt is made to keep only the cases of copular construction, because the copula verb hO is highly dynamic in its function. The respective syntactic and functional structures of different cases of copular construction are presented, through which the lexical, syntactic and functional information required by the copula verb is explored. The explorations made computationally are...
New Insights into Hierarchical Clustering and Linguistic Normalization for Speaker Diarization
- Simon Bozonnet
to obtain the degree of Doctor awarded by
A TYPE THEORETICAL FRAMEWORK FOR NATURAL LANGUAGE SEMANTICS: THE MONTAGOVIAN GENERATIVE LEXICON
- Christian Retoré
Abstract. We present a framework, named the Montagovian generative lexicon, for computing the semantics of natural language sentences, expressed in many sorted higher order logic. Word meaning is depicted by lambda terms of second order lambda calculus (Girard’s system F) with base types including a type for propositions and many types for sorts of a many sorted logic. This framework is able to integrate a proper treatment of lexical phenomena into a Montagovian compositional semantics, including the restriction of selection which imposes the nature of the arguments of a predicate, and the possible adaptation of a word meaning to some...
Communicability in Corporate Intranet: Analyzing the Interaction among Deaf Bilingual Users
- Aline Da Silva Alves; Simone Bacellar Leal Ferreira; Viviane Santos De Oliveira; Denis Silva Da; Alberto Barbosa Raposo
Abstract: This article presents communicability issues that can affect the interaction of pre-linguistic, profoundly deaf bilingual users with a corporate Intranet. To this end, the interface of a corporate system at a health science and technology institution was evaluated using the Communicability Evaluation Method (CEM) from Semiotic Engineering, with the objective of assessing communication failures between the interface and those users. From this research, which analyzed failures in communication between the interface and deaf users in an organizational context, it was possible to demonstrate the importance of including deaf...
- Rency Susan Varghese
Base calling is the central part of any large-scale genomic sequencing effort. Current sequencing technology produces error rates of less than 3.5%. This corresponds to up to 35 errors in a 1000-base read. As the base calling algorithm's error rates drop, the remaining base call errors could be difficult to locate. Hence, assembly algorithms and human operators use a confidence value measure to determine how well the base calling algorithm has performed for each base call. This makes it easier to uncover potential errors and correct them, thus increasing the throughput of genetic sequencing. The model developed here...
Enabling Enterprise Semantic Search through Language Technologies: the ProgressIt Experience
- Roberto Basili; Andrea Ciapetti; Danilo Croce; Valeria Marino; Paolo Salvatore; Valerio Storch
Abstract. This paper presents the platform targeted in the PROGRESS-IT project. It represents an Enterprise Semantic Search engine tailored for Small and Medium Sized Enterprises to retrieve information about Projects, Grants, Patents or Scientific Papers. The proposed solution improves the usability and quality of standard search engines through Distributional models of Lexical Semantics. The quality of the Keyword Search has been improved with Query Suggestion, Expansion and Result Re-Ranking. Moreover, the interaction with the system has been specialized for the analysts by defining a set of Dashboards designed to enable richer queries avoiding the complexity of their definition. This paper...
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-16101 Cognitive and linguistic skills in Swedish children with cochlear implants – measures of accuracy and latency as indicators of development
- Malin Wass; Tina Ibertsson; Björn Lyxell; Birgitta Sahlén; Mathias Hällgren; Birgitta Larsby; Elina Mäki-Torkko