Linguee, the company behind the dictionary website of the same name, now operates under the name DeepL and has brought a new translation engine to market. Based on deep Convolutional Networks, DeepL is expected to translate between languages more effectively than Google Translate, using a selection of the latest translation methods available.
DeepL (formerly Linguee) launched on Monday, when its online translation engine went live. The free website translates between seven languages in real time, putting it in direct competition with Google Translate, although the latter currently handles 103 languages.
Linguee previously operated as a search engine for translations, which gave the renamed company the training data it needed for its translation service. In a blind test, users chose DeepL's translations as the best three times in a row over those of Google, Microsoft and Facebook, a notable result given that these are its direct competitors. In a direct comparison of the translators, DeepL's output is on the same level as the others and sometimes better, but it also makes similar mistakes.
Convolutional Networks instead of RNN
Unlike Google Translate, the DeepL engine is not based on Recurrent Neural Networks (RNN). Instead, it uses Convolutional Networks, an architecture better known from image recognition. The benefit: Convolutional Networks process all words in parallel, and highly optimized libraries already exist for the calculations.
The structure of DeepL’s network has not been made public, but we can work out how it probably operates from a paper by Facebook and a paper by Google’s research division, both of which achieved good results with translation methods based on Convolutional Networks.
Facebook’s research group combines convolution, Gated Linear Units (GLU) and Attention. DeepL uses a similar technique for its own engine.
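As a rough illustration of how convolution and a Gated Linear Unit fit together, here is a minimal sketch in plain Python. It is not DeepL's or Facebook's actual code: the function name `conv_glu_layer`, the use of a single scalar feature per word, and the kernel values are all assumptions made to keep the example small. One kernel produces candidate values, a second kernel produces gates, and the sigmoid of the gate decides how much of each value passes through. A residual connection adds the input back onto the output.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conv_glu_layer(seq, value_kernel, gate_kernel):
    """One 1-D convolution layer with a GLU and a residual connection.

    seq          : list of floats, one (hypothetical) scalar feature per word
    value_kernel : weights that compute the candidate values
    gate_kernel  : weights that compute the gates (same odd width)
    Returns a list with the same length as seq.
    """
    k = len(value_kernel)
    if k != len(gate_kernel) or k % 2 == 0:
        raise ValueError("kernels must share the same odd width")
    pad = k // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad  # zero padding at both ends
    out = []
    # Each position depends only on its own window, so all positions
    # could be computed in parallel; the loop is just for readability.
    for i in range(len(seq)):
        window = padded[i:i + k]
        value = sum(w * x for w, x in zip(value_kernel, window))
        gate = sum(w * x for w, x in zip(gate_kernel, window))
        out.append(seq[i] + value * sigmoid(gate))  # residual + gated value
    return out

# Identity value kernel and all-zero gate kernel: every gate is sigmoid(0) = 0.5
print(conv_glu_layer([1.0, 2.0, 3.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]))
# -> [1.5, 3.0, 4.5]
```

In a real network each word is a vector rather than a single number and the kernels are learned, but the gating idea is the same.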
In spoken and written language, the meaning of a word depends on the context of the sentence. Recurrent Neural Networks read the entire sentence from front to back (and often also from back to front) and store the information in their internal state so that each word can be translated in the right context.
Convolutional Networks emulate this behavior with many stacked layers of context and residual connections between them. The result is very similar, with one difference: a Convolutional Network can only take a fixed number of words into account, whereas an RNN processes sentences of any length in the same way. In practice, however, this restriction is less decisive than it might seem: the speed advantage of parallel computation, which Convolutional Networks make possible, matters more.
In practice, Convolutional Networks only outperform RNNs thanks to an attention mechanism that acts on multiple levels simultaneously. Google Translate also uses an attention mechanism, but only at the interface between the encoder and decoder networks, to ensure that the meaning of every word in the original sentence is carried over into the translation. A mechanism that works on several levels at once allows the network to capture certain aspects of word meaning in more detail. At least that is the researchers’ conclusion, based on the BLEU scores.
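The size of that fixed window can be worked out directly. Assuming plain stacked 1-D convolutions with stride 1 and no dilation, every extra layer widens the visible context by the kernel width minus one. The helper name `receptive_field` and the kernel width and layer counts below are illustrative assumptions, not DeepL's actual configuration.

```python
def receptive_field(num_layers: int, kernel_width: int) -> int:
    """Number of input words that one output position can see after
    stacking `num_layers` plain 1-D convolutions (stride 1, no dilation)."""
    if num_layers < 0 or kernel_width < 1:
        raise ValueError("need num_layers >= 0 and kernel_width >= 1")
    return 1 + num_layers * (kernel_width - 1)

for layers in (1, 5, 15):
    print(layers, "layers ->", receptive_field(layers, kernel_width=3), "words")
# 1 layers -> 3 words
# 5 layers -> 11 words
# 15 layers -> 31 words
```

With a kernel width of 3, fifteen layers already cover 31 words, which is longer than most sentences. This is one way to see why the fixed window matters less in practice than it first appears.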
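A single attention step can be sketched in a few lines of plain Python. This is a generic dot-product attention under simplifying assumptions, not the exact formulation used by any of the services mentioned: the function name `attend` and the toy vectors are hypothetical. The decoder state acts as a query, each source word contributes a key and a value, and a softmax over the query-key scores decides how strongly each source word feeds into the result.

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Dot-product attention for one decoder position.

    query  : list of floats (decoder state)
    keys   : one list of floats per source word
    values : one list of floats per source word
    Returns (weights, context), where context is the weighted sum of values.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context

# Toy example: the query points in the direction of the first source word.
weights, context = attend([1.0, 0.0],
                          [[10.0, 0.0], [0.0, 10.0]],
                          [[1.0, 0.0], [0.0, 1.0]])
print(weights, context)
```

In a multi-level setup of the kind described above, a step like this would run once per decoder layer, each with its own query, instead of only once between encoder and decoder.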
Beam Search
Neural networks rarely produce a sentence with perfect meaning directly. Instead, they generally produce a rough framework with interchangeable words that cover all possible meanings. DeepL therefore uses a beam search algorithm: each word in the sentence is chosen with an eye to which words should follow. In principle, the algorithm multiplies the word probabilities for every possible sentence, but it builds the sentences step by step and keeps only the most promising partial translations, judged by word structure and context, at each step. As a result, the system can choose a slightly different word than expected if the rest of the sentence flows better for it.
DeepL’s system is probably not completely new technology. But the combination of Convolutional Networks, Attention and Beam Search is certainly an effective mix of cutting-edge technology and the latest research results. Thanks to this, DeepL may currently have an edge over the latest version of Google Translate, at least in terms of efficiency. It is a pity that DeepL does not explain its system as generously as the competition does.
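The idea can be sketched with a toy example. The code below is a generic beam search, not DeepL's implementation: the function names, the end-of-sentence marker `</s>` and the probability table in `toy_model` are all invented for illustration. Log probabilities are added instead of multiplying raw probabilities, which is equivalent but numerically safer.

```python
import math

def beam_search(next_word_probs, beam_width, max_len, eos="</s>"):
    """Return the most probable word sequence found by beam search.

    next_word_probs : function mapping a list of words (the prefix) to a
                      dict {next_word: probability}
    beam_width      : number of partial sentences kept after each step
    """
    beams = [(0.0, [])]  # (log probability, words so far)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, words in beams:
            for word, p in next_word_probs(words).items():
                if p > 0.0:
                    candidates.append((score + math.log(p), words + [word]))
        # Keep only the best `beam_width` partial sentences.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, words in candidates[:beam_width]:
            if words[-1] == eos:
                finished.append((score, words))
            else:
                beams.append((score, words))
        if not beams:
            break
    pool = finished or beams
    return max(pool, key=lambda c: c[0])[1] if pool else []

def toy_model(prefix):
    # Hypothetical probabilities, chosen so that the best first word
    # does not lead to the best sentence.
    table = {
        (): {"a": 0.6, "the": 0.4},
        ("a",): {"bird": 0.4, "cat": 0.3, "dog": 0.3},
        ("the",): {"cat": 0.9, "dog": 0.1},
    }
    return table.get(tuple(prefix), {"</s>": 1.0})

print(beam_search(toy_model, beam_width=1, max_len=5))  # ['a', 'bird', '</s>']
print(beam_search(toy_model, beam_width=2, max_len=5))  # ['the', 'cat', '</s>']
```

With a beam width of 1 the search is greedy and commits to "a", ending with a sentence probability of 0.24. With a beam width of 2 it keeps "the" alive as a second option and finds the better sentence with probability 0.36, even though "the" was the less likely first word.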