Apertium is a machine translation platform. Both the language data and software are free and open source and released under the terms of the GNU General Public License.

Apertium was originally designed for translation between closely related languages (e.g. between a minority and a majority language), although it has since been extended to handle more divergent language pairs. To create a new machine translation system, one only has to develop the linguistic data (dictionaries and rules) in well-specified XML formats.
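As a rough illustration of what "linguistic data in XML" can mean in practice, the following Python sketch loads a tiny bilingual word list from an XML string. The element names (`dictionary`, `section`, `e`, `p`, `l`, `r`) and the sample entries are assumptions chosen for illustration; this is a simplified stand-in, not the actual Apertium dictionary schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified dictionary markup (NOT the real Apertium schema):
# each <e> entry holds a pair <p> with a left side <l> and a right side <r>.
SAMPLE_DICTIONARY = """
<dictionary>
  <section id="main">
    <e><p><l>casa</l><r>house</r></p></e>
    <e><p><l>gat</l><r>cat</r></p></e>
  </section>
</dictionary>
"""


def load_pairs(xml_text: str) -> dict:
    """Return a {source word: target word} mapping from the XML text."""
    root = ET.fromstring(xml_text.strip())
    pairs = {}
    for entry in root.iter("e"):
        left = entry.findtext("p/l")
        right = entry.findtext("p/r")
        # Skip entries that lack either side of the pair.
        if left is not None and right is not None:
            pairs[left] = right
    return pairs


if __name__ == "__main__":
    print(load_pairs(SAMPLE_DICTIONARY))
```

The point of the sketch is only that the translation engine stays fixed while the data file changes: adding a word means adding an entry, not editing code.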

Apertium currently has language data, in the stable version, for Basque, Catalan, English, Esperanto, French, Galician, Occitan, Portuguese, Romanian, Spanish, Welsh and Norwegian (Nynorsk and Bokmål); and, in the unstable version, for Bengali, Romanian, Asturian, Italian, Icelandic, Nepali, Swedish, Danish, Irish and Scottish Gaelic.

Apertium is a shallow-transfer machine translation system. It uses finite-state transducers for all of its lexical transformations, and hidden Markov models for part-of-speech tagging (word category disambiguation). The system is pipeline-based and highly modular: a language pair does not have to use all of the standard modules, and may also use modules from outside the standard Apertium toolset (e.g. the Helsinki Finite State Tools for advanced morphological analysis, the VISL CG-3 Constraint Grammar parser for disambiguation, or more advanced transfer modules).
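The modular pipeline idea can be sketched in a few lines of Python. Everything here is a toy under stated assumptions: the lexicon, the tag notation, and the function names (`analyse`, `pick_first`, `prefer_noun_after_det`, `run_pipeline`) are hypothetical and do not correspond to real Apertium modules. A plain dictionary lookup stands in for a finite-state transducer, and two simple functions stand in for interchangeable disambiguators.

```python
from typing import Callable, List, Tuple

Token = Tuple[str, List[str]]  # (surface form, candidate analyses)
Stage = Callable[[list], list]

# Toy lexicon standing in for a finite-state morphological analyser.
ANALYSES = {
    "the": ["the<det>"],
    "books": ["book<vblex><pres>", "book<n><pl>"],
}


def analyse(words: List[str]) -> List[Token]:
    """Attach every known analysis to each word; flag unknown words."""
    return [(w, ANALYSES.get(w.lower(), ["*" + w])) for w in words]


def pick_first(tokens: List[Token]) -> List[Token]:
    """Naive disambiguator: keep the first analysis of each word."""
    return [(w, analyses[:1]) for w, analyses in tokens]


def prefer_noun_after_det(tokens: List[Token]) -> List[Token]:
    """Alternative disambiguator: after a determiner, prefer a noun reading."""
    result = []
    previous_was_det = False
    for word, analyses in tokens:
        candidates = analyses
        if previous_was_det:
            nouns = [a for a in analyses if "<n>" in a]
            if nouns:
                candidates = nouns
        chosen = candidates[:1]
        previous_was_det = "<det>" in chosen[0]
        result.append((word, chosen))
    return result


def run_pipeline(data: list, stages: List[Stage]) -> list:
    """Feed the output of each stage into the next one."""
    for stage in stages:
        data = stage(data)
    return data


if __name__ == "__main__":
    words = "the books".split()
    print(run_pipeline(words, [analyse, pick_first]))
    print(run_pipeline(words, [analyse, prefer_noun_after_det]))
```

Swapping `pick_first` for `prefer_noun_after_det` changes the reading chosen for "books" without touching the analyser, which is the property the paragraph above describes: stages share a data format, so one can be omitted or replaced by an external tool.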