Corpus Projects at Kent

The future for corpora at Kent State

A. Student and faculty creation of sub-corpora, ad hoc corpora

B. Tagging and editing

C. Collocation dictionaries or thesauri in other languages; a corpus-based multilingual collocation dictionary or multilingual thesaurus

D. Personal concordances used as translation memories

E. Adapting and trading corpora with other translator-training programs

F. Creation of "learner corpora" (concordanced student translations) to assist faculty in material preparation, troubleshooting and overcoming problems of directionality

G. Student use of corpora as aids in assessing translation quality

H. Bilingual digital libraries with hypertextualized works/parallel corpora/comparable corpora


Papers Presented at the 2008 ATISA Conference


Designing Corpus-based Translation Tasks for the Classroom: Considerations, Test Cases and Prospects
Kelly Washbourne     Spanish Translation   Assistant Professor

While electronic corpora have gained pride of place in many translator training environments in the last decade, scant critical attention has been focused on corpus analysis itself: in particular, how and what students learn from corpora, and the implications for autonomous learning. This paper seeks to demonstrate how aligned translation segments in translation corpora and KWIC (keyword in context) concordance lines in monolingual corpora can reveal patterns that point to systems of choices (macrostrategies) employed by multiple writers or translators, thus testing empirically our intuitions and received generalizations about translated texts. The multiple uses and foci of corpora for translator training will be surveyed: expert modeling, comparative near-synonym profiles, verification of intuitions, “feature extraction” of given text types and genres, register, contrastive syntax, collocation, colligation (grammatical collocation), neology, and LSP term extraction. Corpora may even be used for quality assurance (editing). Some sample tasks that incorporate the novice’s various learning processes, while showing corpora to be long-term teaching tools and guarantors of quality rather than ad-hoc problem solvers, will be presented. Finally, speculation about the future and upper limits of corpus use (irony, metaphor?) in the classroom will be offered, including the unmined potential for learning from learner corpora.
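
As an illustration of the KWIC (keyword in context) concordance lines mentioned in this abstract, the following is a minimal sketch in Python. It is not the tool used in the paper; the function name `kwic`, the `width` parameter, and the sample sentence are all assumptions made for the example.

```python
import re

def kwic(text, keyword, width=30):
    """Return one concordance line per occurrence of keyword.

    Each line shows up to `width` characters of left context,
    the keyword in brackets, and up to `width` characters of right context.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

# Hypothetical sample text, not taken from any corpus described here.
sample = ("The translator made a decision early. Later she took a decision "
          "to revise, and the final decision was reviewed.")

for line in kwic(sample, "decision", width=25):
    print(line)
```

Reading the keyword column vertically is what lets a student compare the verbs to its left ("made a", "took a") and so profile collocations.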

A corpus-driven approach for analyzing the interplay between linguistic adequacy and cultural acceptability in the context of advertisement translation
Erik Angelone     German Translation    Assistant Professor

An examination of current trends in translation studies curricula suggests that translators will soon be equipped with the technological know-how to use both SL and TL corpora effectively, in conjunction with corresponding language extraction applications, to further foster translation success. This presentation is intended to suggest a preliminary corpus-driven framework for use by the translator in analyzing paramount shifts in the German-to-English translation (and cultural localization) of hotel websites. Corpora to be analyzed will consist of a series of German SL hotel websites along with their English TL versions (US-based), in this case serving as potential parallel texts. Comparative language trends and patterns, obtained using a simple concordancing tool in conjunction with POS tagging and chunking, can then be analyzed by the translator in discerning both direct and oblique translation procedures (Vinay and Darbelnet 1958) and how successfully translating the hotel advertisement text, a prime example of the operative text type (Reiss 1977), hinges on an intricate, interwoven balance of linguistic adequacy and cultural acceptability (Toury 1995).

Studying Cohesion in Corpora: Comparative Research in Sentencing and Paragraphing in Russian and English Texts
Brian Baer    Russian Translation    Associate Professor

Tatyana Bystrova    PhD student of Translation Studies

Translation training and assessment remain largely focused on issues of semantic transfer at the level of the word or phrase. However, many textual elements located above the level of the sentence, when ignored, can negatively impact the quality of a translation, contributing to the production of translationese. Based on the findings from a comparative study of Russian and English corpora, this presentation will isolate differences in sentencing and paragraphing norms in Russian and English and situate those differences within broader issues of textual cohesion and discourse organization. Particular attention will be paid to the phenomenon of the one-sentence paragraph. Research findings will be presented that isolate the particular functions of the one-sentence paragraph in English and Russian along with strategies for treating such paragraphs in translation. Examples will be taken from corpora of literary and non-literary (editorial) texts, allowing for a comparison across text types. In addition to generating new data, this comparative study seeks to model the use of untagged corpora in Translation Studies research.
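
To show how one-sentence paragraphs might be counted in an untagged corpus, here is a rough Python sketch. It is not the authors' method: the function name, the blank-line paragraph convention, and the naive punctuation-based sentence splitter are assumptions, and the splitter will miscount abbreviations and quoted speech.

```python
import re

def count_one_sentence_paragraphs(text):
    """Return (one_sentence_paragraphs, total_paragraphs) for plain text.

    Assumes paragraphs are separated by blank lines and sentences end
    in '.', '!' or '?' followed by whitespace (a deliberately naive rule).
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

    def sentence_count(paragraph):
        parts = re.split(r"(?<=[.!?])\s+", paragraph)
        return len([s for s in parts if s])

    single = sum(1 for p in paragraphs if sentence_count(p) == 1)
    return single, len(paragraphs)

# Hypothetical three-paragraph sample, not drawn from the study's corpora.
sample_text = (
    "First sentence. Second sentence.\n\n"
    "A lone sentence.\n\n"
    "Another one! And one more?"
)

single, total = count_one_sentence_paragraphs(sample_text)
print(f"{single} of {total} paragraphs contain exactly one sentence")
```

Because the rule relies only on punctuation and blank lines, the same function can be run on Russian and English texts, giving comparable ratios across languages and text types.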


Selected Corpus-related Faculty Research

1. Gregory M. Shreve has published "Corpus enhancement and computer assisted localization and translation" (309-331) in c.7, Perspectives on Localization, Keiran J. Dunne (ed.), American Translators Association Scholarly Monograph Series XIII, John Benjamins, 2006.

  He has also produced studies on the role of corpora in multilingual ontologies, thesauri, and taxonomies for use in internationalizing digital library content and metadata.

2. Sue Ellen Wright and Gerhard Budin (eds.), in Terminology Management, VII, 8.4 (pp. 725-944), cover corpus-related applications.

3. Erik Angelone, The Conceptualization and Integration of an E-Collocation Trainer: Methods of Empirical, Translation-based Collocation Research, Wissenschaftlicher Verlag Trier, 2007.