I believe an integrated translator could be a great step forward in reducing the language barrier for community members with different language backgrounds, especially for those who are not fluent enough in English to take part in the international section of the forum. However, it would only make sense if it
is easy to handle (one click)
delivers useful translations
does not slow down performance
in other words: a practical real-world solution.
I have to admit that I do not have any experience with such plugins. So far I have used DeepL whenever I occasionally need a translation: I copy the text on screen, paste it into the DeepL window and get an instant translation (as long as the language is available). This works well, and the translations are better than any others I have seen so far, but working through the posts of an extensive topic and then preparing a reply in the same way is a time-consuming job … in other words: not really a practical real-world solution.
It wasn’t clear from the wording of the proposal whether this plugin is an active to-do item or just a “maybe”. Can you confirm it is a proper to-do on the roadmap for this Discourse site?
The issue I see with adding the plugin far down the line is that the ability to translate posts will affect how people naturally organize topics and discussions. And the way topics are organized at the outset is the way that will stick forever (some more thoughts on this here).
This is the real question - from an operations point of view I am sure there are no objections to installing the plugin except that we somehow need to estimate what usage is likely to be so that we can work out the likely cost and seek a budget for it.
Thanks for sharing this @cquest! I signed up for the forum and tested it out, and it works great! PS: openfoodfacts is a super cool project!
There’s a really nice post on the Discourse forums on how to estimate translation costs, using a few SQL queries to estimate the number of users, the average length of posts, etc.
Sure we are at the beginning stages of the forums here so it’s hard to estimate with little data, but perhaps something similar can be run on the previous forum databases we have access to.
At least for DeepL, you can sign up for the free plan, which doesn’t charge at all even once the free 500k characters are used up (it just denies any further requests once the monthly limit is reached). I see this as a way to test out the plugin with no risk, so we can have a more nuanced discussion around it.
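To make the free-plan idea a bit more concrete, here is a rough sketch of how one could check whether a month of posts would stay under the 500k character limit mentioned above. The post counts and lengths are made up, and `translated_share` (the fraction of posts anyone actually asks to translate) is purely an assumption; real numbers would have to come from the forum database.

```python
FREE_MONTHLY_CHARS = 500_000  # free-plan limit quoted in the post above


def monthly_translation_chars(post_lengths, translated_share=1.0):
    """Estimate the characters sent for translation in one month.

    post_lengths: character counts of the posts written in that month.
    translated_share: assumed fraction of posts that get translated
                      (hypothetical parameter, 1.0 = every post once).
    """
    return round(sum(post_lengths) * translated_share)


def fits_free_plan(post_lengths, translated_share=1.0):
    """True if the estimated volume stays within the free monthly limit."""
    chars = monthly_translation_chars(post_lengths, translated_share)
    return chars <= FREE_MONTHLY_CHARS


# Hypothetical month: 1,200 posts of 600 characters each.
sample = [600] * 1200
print(monthly_translation_chars(sample))
print(fits_free_plan(sample))
print(fits_free_plan(sample, translated_share=0.5))
```

With these invented numbers the month would exceed the free limit if every post were translated, but fit if only half of them were. This also assumes each post is translated at most once per month.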
What would you need to be willing to implement this @TomH?
Using the old forum posts, we can try to evaluate whether the cost would be too high and unsustainable for the OSMF, but at $20 per MB of text (that’s quite a lot of text) I think it should not be a problem.
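As a back-of-the-envelope illustration of that price, the arithmetic could look like the sketch below. It assumes 1 MB of text is roughly one million characters, which is only an approximation, and the monthly volumes are hypothetical, not measured from any forum.

```python
# "$20 per MB" from the post above, assuming 1 MB ~ 1,000,000 characters.
PRICE_PER_MILLION_CHARS_USD = 20.0


def monthly_cost_usd(chars_per_month, price_per_million=PRICE_PER_MILLION_CHARS_USD):
    """Cost in USD for translating the given number of characters."""
    return chars_per_month / 1_000_000 * price_per_million


# Hypothetical monthly volumes, not taken from any real forum.
for chars in (500_000, 2_000_000, 10_000_000):
    print(f"{chars:>12,} chars/month -> ${monthly_cost_usd(chars):,.2f}")
```

Plugging in real character counts from the old forum posts would replace the invented volumes here.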
Regarding the DeepL fork, I do wonder why it exists at all, because the Discourse translator already supports adding different providers, so why is it not simply a pull request on the main branch? So maybe some unsaintly hackery is involved in that fork?
Here is the comparison between the two branches: Turns out there is no hackery, and I still wonder why this is not simply a PR.
I have to say: I’m quite impressed by the quality of the automated translation! After a quick read, it seems the two slang words “Bock” and “Schalter” (in this particular context) are the only issues…
But that shows a basic problem as well: some “meta semantics” expressed via slang or common local sayings might be lost in the translation process.
I wouldn’t worry about the inability to translate particular slang words. I mean, if you visit a remote corner of your own country, where you are a native speaker, you might also have a hard time understanding everything people are telling you in their regional dialect. I wouldn’t expect an auto-translator to do better than I do in my native language.
As the goal is to remove the language barrier and enable cross-lingual discussions (probably mostly on tagging and handling real-world stuff in OSM), I would go for the best translation quality so we don’t have automated misunderstandings.
It is better if a language is not supported - that’s clear and easily understood information - than if many more are supported but there is frequent mistranslation and miscommunication.
I guess this warrants more testing of, and reading up on, translator quality. From what has been posted here so far, it looks like we already have a very good candidate in DeepL.