Multilingual and Multiscale Subsentence Alignment

This site accompanies the following publication:

Emmanuel Giguet and Pierre-Sylvain Luquet. 2006. Multilingual Lexical Database Generation from parallel texts in 20 European languages with endogenous resources. Poster Proceedings of the ACL-COLING-2006 International Conference. July 16-22. Sydney, Australia. [PDF]

The following outputs are generated from raw texts, without any external resources (no tagger, no chunker, no monolingual lexicon, no bilingual lexicon). The endogenous morphological analyzer was not activated.

Word alignment on about 250 English-French parallel documents of the Acquis Communautaire: en-fr word alignments

Word alignment on about 250 English-Dutch parallel documents of the Acquis Communautaire: en-nl word alignments

Word alignment on about 500 English-Dutch parallel documents of the Acquis Communautaire: en-nl word alignments

Mixed alignment on about 500 English-Estonian parallel documents of the Acquis Communautaire: en-et mixed alignments

Word alignment on about 1000 English-Estonian parallel documents of the Acquis Communautaire: en-et word alignments

Mixed alignment on about 500 English-Czech parallel documents of the Acquis Communautaire: en-cs mixed alignments

Mixed alignment on about 1000 English-Czech parallel documents of the Acquis Communautaire: en-cs mixed alignments

[homepage]