
How does machine translation work?

Machine translation (MT), or more precisely rule-based machine translation (RBMT)[1], is the analysis of a text according to linguistic rules and its translation into another language by a complex computer program.

The simplest form of MT is word-for-word translation. A morphological analysis determines the basic form of every word (a step known as lemmatising), this basic form is looked up in a machine dictionary, and the corresponding target word is inserted into the text. The result is often unusable: because inflection and target-language word order are ignored, the translation reads like a pure dictionary look-up.
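The word-for-word procedure can be sketched in a few lines of Python. This is only a toy illustration: the lemma table, the English-German dictionary and the function names are hypothetical examples, not part of any real MT system.

```python
# Toy word-for-word translation: lemmatise, look up, insert.
# LEMMAS and DICTIONARY are tiny hypothetical tables for illustration only.

LEMMAS = {  # inflected English form -> basic form (lemma)
    "dogs": "dog",
    "chased": "chase",
    "cats": "cat",
}

DICTIONARY = {  # English lemma -> German dictionary entry
    "the": "der",
    "dog": "Hund",
    "chase": "jagen",
    "cat": "Katze",
}


def lemmatise(word: str) -> str:
    """Reduce a word to its basic form; unknown words are only lowercased."""
    lowered = word.lower()
    return LEMMAS.get(lowered, lowered)


def translate_word_for_word(sentence: str) -> str:
    """Replace each word by the dictionary entry of its lemma."""
    output = []
    for word in sentence.split():
        lemma = lemmatise(word)
        # Words missing from the dictionary are passed through unchanged.
        output.append(DICTIONARY.get(lemma, word))
    return " ".join(output)


print(translate_word_for_word("The dogs chased the cats"))
# prints: der Hund jagen der Katze
```

The output keeps the English word order and loses number, tense and case, whereas a correct German sentence would be "Die Hunde jagten die Katzen". This is the dictionary look-up effect described above.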

In transfer-based translation, lemmatising is followed by an analysis of the syntactic structure of the source sentence. The analysed sentence content is saved in a symbolic and, as far as possible, language-neutral intermediate form, from which the target sentence is then generated according to linguistic rules. If the text has a fairly simple syntactic structure and a well-prepared machine dictionary is used, the result can be satisfactory.
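A minimal sketch of this analyse-then-generate pipeline is shown below. It assumes a single fixed sentence pattern ("the NOUN VERB the NOUN"); the `Clause` structure, the small English and German tables and all function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class NounPhrase:
    lemma: str
    plural: bool


@dataclass
class Clause:
    """Symbolic intermediate form: who does what to whom, and when."""
    subject: NounPhrase
    verb: str
    past: bool
    obj: NounPhrase


# Hypothetical analysis tables: English form -> (lemma, feature)
EN_NOUNS = {"dog": ("dog", False), "dogs": ("dog", True),
            "cat": ("cat", False), "cats": ("cat", True)}
EN_VERBS = {"chase": ("chase", False), "chases": ("chase", False),
            "chased": ("chase", True)}

# Hypothetical generation tables for German
DE_NOUNS = {"dog": ("m", "Hund", "Hunde"), "cat": ("f", "Katze", "Katzen")}
DE_ARTICLES = {("m", "nom"): "der", ("m", "acc"): "den",
               ("f", "nom"): "die", ("f", "acc"): "die"}
DE_VERBS = {"chase": {(False, False): "jagt", (False, True): "jagen",
                      (True, False): "jagte", (True, True): "jagten"}}


def analyse(sentence: str) -> Clause:
    """Parse 'the NOUN VERB the NOUN' into the intermediate form."""
    det1, subj, verb, det2, obj = sentence.lower().split()
    if det1 != "the" or det2 != "the":
        raise ValueError("only the pattern 'the N V the N' is supported")
    subj_lemma, subj_plural = EN_NOUNS[subj]
    verb_lemma, past = EN_VERBS[verb]
    obj_lemma, obj_plural = EN_NOUNS[obj]
    return Clause(NounPhrase(subj_lemma, subj_plural), verb_lemma, past,
                  NounPhrase(obj_lemma, obj_plural))


def generate_np(np: NounPhrase, case: str) -> str:
    """Build a German noun phrase with the article for gender, number, case."""
    gender, singular, plural = DE_NOUNS[np.lemma]
    if np.plural:
        return f"die {plural}"  # plural article is 'die' in both cases
    return f"{DE_ARTICLES[(gender, case)]} {singular}"


def generate(clause: Clause) -> str:
    """Generate the German sentence from the intermediate form."""
    subject = generate_np(clause.subject, "nom")
    verb = DE_VERBS[clause.verb][(clause.past, clause.subject.plural)]
    obj = generate_np(clause.obj, "acc")
    sentence = f"{subject} {verb} {obj}"
    return sentence[0].upper() + sentence[1:]


def translate(sentence: str) -> str:
    return generate(analyse(sentence))


print(translate("The dogs chased the cats"))  # prints: Die Hunde jagten die Katzen
print(translate("The cat chased the dog"))    # prints: Die Katze jagte den Hund
```

Because number, tense and grammatical role are stored in the intermediate form rather than in the surface words, the generator can choose the correct German article and verb form, which word-for-word translation cannot do.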

The aim is to develop an MT system which translates the content of the source text correctly and losslessly into an interlingua, a language-neutral intermediate representation, from which sentences in every required and available target language can in turn be generated correctly and losslessly. For texts that are not overly figurative (unlike, for example, marketing texts), the result should be satisfactory.

The quality of an MT system depends heavily on the information entered into the system's machine dictionary. All steps of linguistic analysis (morphological, syntactic, semantic) require extensive linguistic information from the dictionary so that the system can analyse the source sentence and its words as unambiguously as possible. Even a human translator will often struggle to resolve ambiguous language without a proper understanding of the text and without subject-specific or general knowledge. Incorrect machine translations that have to be corrected in post-editing are therefore common.


[1] Another approach is statistical machine translation (SMT), where large bilingual text corpora are used to statistically determine word correlations and derive translations from them.
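To give a rough idea of what "statistically determining word correlations" can mean, the sketch below scores word pairs in a tiny bilingual corpus with the Dice coefficient over sentence-level co-occurrence counts. This is a deliberately simplified stand-in, not the alignment models used in actual SMT systems; the corpus and function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical miniature parallel corpus: (English, German) sentence pairs
CORPUS = [
    ("the dog", "der Hund"),
    ("the cat", "die Katze"),
    ("a dog", "ein Hund"),
    ("a cat", "eine Katze"),
]


def dice_scores(corpus):
    """Score each (source word, target word) pair by the Dice coefficient:
    2 * co-occurrences / (source occurrences + target occurrences)."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in corpus:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        # Every source word co-occurs with every target word of the pair.
        pair_count.update(product(src_words, tgt_words))
    return {(s, t): 2 * n / (src_count[s] + tgt_count[t])
            for (s, t), n in pair_count.items()}


def best_translation(word, scores):
    """Return the target word with the highest score for a source word."""
    candidates = {t: v for (s, t), v in scores.items() if s == word}
    return max(candidates, key=candidates.get)


scores = dice_scores(CORPUS)
print(best_translation("dog", scores))  # prints: Hund
print(best_translation("cat", scores))  # prints: Katze
```

Here "dog" and "Hund" always appear together and never apart, so their score is 1.0, while "dog" and "der" share only one of their three combined occurrences and score lower.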