TextChunkingCline

Text Content Units Viewed as a Cline


Content units can be viewed at a variety of levels along a cline, that is, a gradient of relative size. The levels range from individual words and terms to collocations and phrases, co-occurrences, translation memory units, controlled language units, example-based machine translation units, and full chunks of standard text or single-source authoring components. Units may exist as monolingual units with roughly parallel equivalents in other languages, or as truly parallel texts that are derived from, or intended for the creation of, aligned texts in two or more languages. Units enter a knowledge repository in three ways: individual terminology or content units can be added "manually", matched units can be created during the generation of new original text or translation components, or matches can be created automatically using text mining algorithms and system heuristics. The various levels represented on the cline relate directly to functions and resources in the Information Management Cycle.
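The cline and the three ways of adding units can be pictured as a small data model. The following is only an illustrative sketch, not a description of any existing system: the names `ClineLevel`, `Origin`, `ContentUnit`, and `sort_along_cline` are hypothetical, the numeric ranks simply follow the order in which the levels are listed above, and the sample units are invented for the example.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum
from typing import List, Optional


class ClineLevel(IntEnum):
    """Hypothetical ranks, in the order the levels are listed in the text."""
    WORD_OR_TERM = 1
    COLLOCATION_OR_PHRASE = 2
    CO_OCCURRENCE = 3
    TRANSLATION_MEMORY_UNIT = 4
    CONTROLLED_LANGUAGE_UNIT = 5
    EBMT_UNIT = 6
    TEXT_CHUNK_OR_COMPONENT = 7


class Origin(Enum):
    """The three ways a unit can enter the repository (names are assumptions)."""
    MANUAL = "manual"                    # added by hand
    AUTHORING_MATCH = "authoring_match"  # created while writing or translating
    TEXT_MINING = "text_mining"          # extracted automatically


@dataclass(frozen=True)
class ContentUnit:
    text: str
    language: str
    level: ClineLevel
    origin: Origin
    # Filled in only when an aligned equivalent in another language exists.
    parallel_text: Optional[str] = None
    parallel_language: Optional[str] = None

    @property
    def is_parallel(self) -> bool:
        return self.parallel_text is not None


def sort_along_cline(units: List[ContentUnit]) -> List[ContentUnit]:
    """Order units from the smallest level of the cline to the largest."""
    return sorted(units, key=lambda unit: unit.level)


# Invented sample data, for illustration only.
units = [
    ContentUnit("Press the red button to stop the machine.", "en",
                ClineLevel.TRANSLATION_MEMORY_UNIT, Origin.AUTHORING_MATCH),
    ContentUnit("terminology", "en", ClineLevel.WORD_OR_TERM, Origin.MANUAL,
                parallel_text="Terminologie", parallel_language="de"),
    ContentUnit("text mining", "en",
                ClineLevel.COLLOCATION_OR_PHRASE, Origin.TEXT_MINING),
]

ordered = sort_along_cline(units)
```

Using an ordered enumeration keeps the "relative size" idea explicit: two units can be compared by level without inspecting their text, while the optional parallel fields distinguish monolingual units from aligned ones.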