A new benchmark for machine translation (MT) has emerged and, for once, it has very little to do with translation quality. Researchers in India used the CodeCarbon package to, yes, compare carbon dioxide (CO2) emissions from MT engines, measuring the environmental (un)friendliness of different language pairs.
A four-person team from the Manipal Institute of Technology – Mirza Yusuf, Praatibh Surana, Gauri Gupta, and Krithika Ramesh – published the paper “Curb Your Carbon Emissions: Benchmarking Carbon Emissions in Machine Translation” on the arXiv preprint platform on September 26, 2021.
The authors felt it was “imperative” to explore the carbon efficiency of MT even though, relatively speaking, MT is not a big climate offender and climate activists are unlikely to boycott MT anytime soon. According to the authors of the paper, language models “require a large amount of computing power and data to train, which consequently results in large carbon footprints.”
DeepL is one of the machine translation providers that has increased its access to computing power. The Germany-based company has established a data center in Iceland to help source its computing power, saying its supercomputer is one of the largest in the world.
The training and large-scale development of MT (as well as NLP models more broadly) “could have adverse environmental consequences”, whether their power consumption is carbon neutral or not. In short, the energy used to power MT engines “may directly or indirectly contribute to the effects of climate change,” the researchers said.
Carbon emissions per language pair
The researchers set out to evaluate six language pairs to assess the computing power required for training; that is, which pairs were more energy-intensive and, therefore, more carbon-emitting.
By assessing the differences in carbon emissions per language pair, the researchers hoped to open the door to a more environmentally friendly approach to machine translation training that takes into account the specific performance of a language pair.
The experiments focused on English, German and French and their six possible language combinations. They compared the performance of two models: a convolutional sequence-to-sequence learning model (ConvSeq) and a transformer-based model with attention mechanisms, and used a dataset containing around 30,000 samples for each language.
The researchers tracked the carbon emissions released during training using the CodeCarbon package and recorded BLEU scores as a basis for comparison.
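As a rough illustration of what such tracking measures, emissions can be estimated as the energy consumed multiplied by the carbon intensity of the electricity used. The sketch below is not the CodeCarbon API or the authors' code; the function name, the power draw, and the grid intensity figure are hypothetical values chosen purely for illustration.

```python
def estimate_emissions_kg(avg_power_watts: float,
                          hours: float,
                          grid_intensity_kg_per_kwh: float) -> float:
    """Estimate CO2-equivalent emissions (kg) for one training run.

    All inputs are assumptions supplied by the caller, not measured values.
    """
    # Energy in kilowatt-hours: watts -> kilowatts, times duration in hours.
    energy_kwh = avg_power_watts / 1000.0 * hours
    # Emissions scale linearly with energy and with grid carbon intensity.
    return energy_kwh * grid_intensity_kg_per_kwh


# Hypothetical run: a 250 W accelerator training for 4 hours on a grid
# emitting 0.5 kg CO2eq per kWh.
emissions = estimate_emissions_kg(250.0, 4.0, 0.5)
print(f"{emissions:.2f} kg CO2eq")
```

Under this simple model, a language pair that needs longer to train emits proportionally more, all else being equal.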
Environmentally (un)friendly MT
Not only did language pairs with German as the target have the lowest BLEU scores, they also took the longest to reach a BLEU threshold score of 25. The researchers said this second finding supports the hypothesis that translating into German may be more computationally demanding than translating into French or English.
In terms of training time required, the French>German, English>German and German>French language pairs took the longest to train and were therefore the most carbon-intensive pairs. The French>German language pair was “the most computationally expensive” in both models.
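The idea of comparing pairs by how long they take to reach a BLEU threshold can be sketched as follows. This is an illustrative helper, not the authors' evaluation code; the function name and the per-epoch scores are made up, and only the threshold of 25 comes from the article.

```python
from typing import Optional, Sequence


def epochs_to_threshold(bleu_per_epoch: Sequence[float],
                        threshold: float = 25.0) -> Optional[int]:
    """Return the 1-based epoch at which BLEU first reaches the threshold.

    Returns None if the threshold is never reached.
    """
    for epoch, score in enumerate(bleu_per_epoch, start=1):
        if score >= threshold:
            return epoch
    return None


# Made-up training curves for two hypothetical language pairs.
fast_pair = [12.0, 21.5, 26.3, 28.0]
slow_pair = [8.0, 14.2, 19.9, 23.1, 25.4]

print(epochs_to_threshold(fast_pair))
print(epochs_to_threshold(slow_pair))
```

A pair that crosses the threshold later has trained for longer and, under otherwise equal conditions, consumed more energy.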
In contrast, English>French, German>English, and French>English, which each involved English as a source or target language, took less time to train and were the least carbon-intensive.
Interestingly, the German dataset was the most lexically diverse of the three, measured as vocabulary size relative to the number of tokens. This “probably demonstrates that lexical diversity is directly proportional to training time to achieve an adequate level of performance,” the researchers noted.
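One common way to express vocabulary relative to token count is the type-token ratio: distinct words divided by total words. The sketch below assumes that measure and a naive whitespace tokenizer; the paper's exact tokenization and metric may differ, and the function name is hypothetical.

```python
def type_token_ratio(text: str) -> float:
    """Return distinct tokens divided by total tokens (0.0 for empty text).

    Uses lowercasing and whitespace splitting, which is an assumption made
    for illustration rather than the tokenization used in the paper.
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# "the" appears twice, so there are 4 distinct tokens out of 5.
print(type_token_ratio("the cat saw the dog"))
```

A higher ratio means the model encounters more distinct word forms per token of training data.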
Comparing the two systems, the Transformer models were found to be significantly less carbon-intensive than the ConvSeq models, which the researchers attributed to the former having comparatively fewer parameters. The Transformer models also achieved higher BLEU scores.
The researchers concluded that there is a disparity between language pairs in terms of carbon emissions and that “language pairs involving English perform better than those that do not”. However, “many studies remain to be done to identify exactly what is causing the differences in emissions,” they said.
In addition to proposing ways “to reduce the carbon emissions released when training and deploying machine translation systems that are extensively trained on large datasets,” the researchers said future research could also be extended to low-resource languages and to languages that do not use the Latin script.