We Need The Discourse Translation Plugin

Hi @mDav and thanks for the request.

This is already part of the current languages and location proposal.

We want to eventually experiment with it, probably once we have other big topics sorted out (especially the transition from Help OSM and the old forums).

Thanks for the pointer @nukeador

It wasn’t clear from the proposal language if this plugin was an active to-do item or just a “maybe”. Can you confirm it is a proper to-do in the roadmap for this Discourse site?

The issue I see with adding the plugin far down the line is that the ability to translate posts will affect how people naturally organize topics and discussions. And the way topics get organized at the outset is the way that will stick forever (some more thoughts on this here).

Considering it is relatively trivial to install plugins, would you consider implementing this before the transition?

The proposal is still in the discussion phase; no decision has been taken yet.

See my reply on the other topic:

I’ve installed this plugin on https://forum.openfoodfacts.org/ using the Google Translate API (not my choice).

It works pretty well:

  • the globe icon appears when a post is not written in your selected interface language
  • the translations are stored to reduce the calls to the translation API
  • the translation appears only when you click on the globe; it is not shown automatically, even if a stored translation is already available
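
The "stored translations" behaviour described above can be pictured with a minimal sketch. This is not the plugin's actual code: `TranslationCache`, `fake_translate` and the `(post_id, target_lang)` key are hypothetical names chosen for illustration, and the real plugin may store translations differently.

```python
from typing import Callable, Dict, Tuple


class TranslationCache:
    """Keeps one translation per (post_id, target_lang) so each post
    is sent to the translation API at most once per language."""

    def __init__(self, translate: Callable[[str, str], str]) -> None:
        # Hypothetical provider call: (text, target_lang) -> translated text.
        self._translate = translate
        self._store: Dict[Tuple[int, str], str] = {}
        self.api_calls = 0

    def get(self, post_id: int, text: str, target_lang: str) -> str:
        key = (post_id, target_lang)
        if key not in self._store:
            # Only a cache miss costs an API call.
            self.api_calls += 1
            self._store[key] = self._translate(text, target_lang)
        return self._store[key]


def fake_translate(text: str, target_lang: str) -> str:
    # Stand-in for a real translation API; it only tags the text.
    return f"[{target_lang}] {text}"


cache = TranslationCache(fake_translate)
cache.get(1, "Bonjour", "en")
cache.get(1, "Bonjour", "en")  # served from the store, no second API call
print(cache.api_calls)
```

With a store like this, the cost depends on how many distinct posts get translated, not on how many readers click the globe.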

A fork of this plugin supporting DeepL exists: GitHub - literatecomputing/discourse-translator
DeepL provides very good quality, but supports fewer languages (24) than competitors.

All translation APIs have some cost: around $20/MB for Google or DeepL (500K characters free per month).

This is the real question - from an operations point of view I am sure there are no objections to installing the plugin except that we somehow need to estimate what usage is likely to be so that we can work out the likely cost and seek a budget for it.

Thanks for sharing this @cquest! I signed up for the forum and tested it out, and it works great! PS: openfoodfacts is a super cool project!

There’s a really nice post on the Discourse forums on how to estimate translation costs, using a few SQL queries to estimate the number of users, the average length of posts, etc.

Sure, we are at the beginning stages of the forums here, so it’s hard to estimate with little data, but perhaps something similar can be run on the previous forum databases we have access to.
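
As a rough illustration of what such an estimate could look like once post counts and average lengths are known, here is a minimal sketch. The price ($20 per MB, taken as roughly one million characters) and the 500K free characters come from the figures quoted in this thread; the function name, the input numbers and the idea that each post is translated only once are assumptions for illustration, not measured data.

```python
def estimate_monthly_cost(
    posts_per_month: int,
    avg_chars_per_post: float,
    translated_fraction: float,
    price_per_million_chars: float = 20.0,  # figure quoted in this thread
    free_chars_per_month: int = 500_000,    # figure quoted in this thread
) -> float:
    """Estimate the monthly bill in dollars, assuming each translated
    post is sent to the API once (translations are stored afterwards)."""
    chars = posts_per_month * avg_chars_per_post * translated_fraction
    billable = max(0.0, chars - free_chars_per_month)
    return billable * price_per_million_chars / 1_000_000


# Hypothetical inputs: 10,000 posts a month, 600 characters each,
# a quarter of them translated at least once.
print(estimate_monthly_cost(10_000, 600, 0.25))
```

The inputs would come from queries against the old forum databases; translating a post into several target languages would multiply the character count accordingly.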

At least for DeepL, they offer a free plan which doesn’t charge at all even once the free 500k characters are used up (it just denies any more requests once the monthly limit is reached). I see this as a way to test out the plugin with no risk, so we can have a more nuanced discussion around it.
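
To make the "denies requests once the limit is reached" behaviour concrete, here is a small sketch of a hard monthly cap. It only models the behaviour as described in this post; `MonthlyQuota` is a hypothetical name, and how DeepL actually counts characters or reports a refused request is not shown here.

```python
class MonthlyQuota:
    """Hard cap on characters per month: requests over the cap are
    refused instead of being billed."""

    def __init__(self, limit_chars: int = 500_000) -> None:
        self.limit_chars = limit_chars
        self.used_chars = 0

    def try_consume(self, text: str) -> bool:
        """Record the usage and return True if the text still fits,
        otherwise return False and leave the counter unchanged."""
        needed = len(text)
        if self.used_chars + needed > self.limit_chars:
            return False
        self.used_chars += needed
        return True

    def reset(self) -> None:
        # Called at the start of each billing month.
        self.used_chars = 0


quota = MonthlyQuota(limit_chars=10)
print(quota.try_consume("hello"), quota.try_consume("world"), quota.try_consume("!"))
```

With a cap like this the worst case during a trial is that translations stop working until the next month, which is why it carries no financial risk.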

What would you need to be willing to implement this @TomH?

It’s not really up to me as I’m not the lead on this project. I was just commenting with my OWG hat on as to what we would be concerned about.

Using the old forum posts, we can try to evaluate whether the cost would be really high and unsustainable for the OSMF, but at $20/MB of text (that’s quite a lot of text) I think it should not be a problem.

Regarding the DeepL fork, I do wonder why it exists at all because the discourse translator already supports adding different providers, so why is it not simply a Pull Request on the main branch? So maybe in that fork, some unsaintly hackery is involved?

Here is the comparison between the two branches: Turns out there is no hackery, and I do wonder why this is not simply a PR.

I’ve created a test PR at the fork as a pretext :smirk: to see if it is still maintained and, more importantly, to ask why it hasn’t been merged upstream yet (as the issue tracker is disabled here too):

Here is a test on my migration test instance where I installed the plugin:

I have to say: I’m quite impressed by the quality of the automated translation! After a short reading, it seems the two slang words “Bock” and “Schalter” (in this particular context) are the only issues…
But that shows a basic problem as well: some “meta semantics” expressed via slang or common local sayings might be lost in the translation process.

Which translation engine is this? It’s really good!

I looked into the different translation services a bit and put it into another thread:

I’ve installed the DeepL modified plugin and DeepL translations are really very good as far as I’ve seen with the languages I understand (French, English, Spanish).

The goal is not to test translation quality by itself, but its integration into Discourse UX.

On another Discourse I’m administering, I’ve set up Google Translate.

I wouldn’t worry about the inability to translate particular slang words. I mean, if you visit a remote corner of your own country, where you are a native speaker, you might also have a hard time understanding everything people are telling you in their regional dialect. I wouldn’t expect an auto-translator to do better than myself in my native language. :slight_smile:

As the goal is to remove the language barrier and enable cross-lingual discussions (probably mostly on tagging and handling real-world stuff in OSM), I would go for the best translation quality so we don’t have automated misunderstandings.

It is better for a language to be unsupported, which is a clear and easily understood piece of information, than for many more languages to be supported with frequent mistranslation and miscommunication.

I guess this warrants more testing/looking up of translator quality. From what was posted here so far, it looks like we already have a very good candidate with DeepL.

I’ve not looked in depth at how the plugin works, but instead of selecting a single translation API, it should be possible to use multiple APIs:

  • DeepL first for available languages
  • a second one for other languages
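
A minimal sketch of that routing idea follows. It is not how the plugin works today (that has not been checked here): `ProviderRouter`, the two stand-in translators and the language set are all hypothetical and only illustrate "preferred provider where supported, fallback otherwise".

```python
from typing import Callable, Set

# Hypothetical provider signature: (text, target_lang) -> translated text.
Translator = Callable[[str, str], str]


class ProviderRouter:
    """Uses the preferred provider when it supports both languages,
    and falls back to a second provider otherwise."""

    def __init__(
        self,
        preferred: Translator,
        preferred_langs: Set[str],
        fallback: Translator,
    ) -> None:
        self.preferred = preferred
        self.preferred_langs = preferred_langs
        self.fallback = fallback

    def translate(self, text: str, source_lang: str, target_lang: str) -> str:
        supported = (
            source_lang in self.preferred_langs
            and target_lang in self.preferred_langs
        )
        provider = self.preferred if supported else self.fallback
        return provider(text, target_lang)


def preferred_stub(text: str, target_lang: str) -> str:
    # Stand-in for the higher-quality provider with fewer languages.
    return f"preferred:{target_lang}:{text}"


def fallback_stub(text: str, target_lang: str) -> str:
    # Stand-in for the provider with broader language coverage.
    return f"fallback:{target_lang}:{text}"


# Illustrative language set only, not any provider's real list.
router = ProviderRouter(preferred_stub, {"en", "fr", "de", "es"}, fallback_stub)
print(router.translate("Bonjour", "fr", "en"))
print(router.translate("Sawubona", "zu", "en"))
```

The routing itself is cheap; the open question is whether the plugin can be configured with two providers at once, or whether that would need a patch.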

You’re right. Good point!

:+1:

:+1:

Yeah, but that’s also the most expensive! Could be a real scaling issue…!

Exactly! Better to have no translation at all than unintelligible strings of words created by a high-quantity, poor-quality tool.

@cquest

Bro, you are doing a great and valuable job here … I have not seen any post from you so far that did not make sense … thanks for that, mate!

During Friday’s meeting there was agreement, and testing will be run to evaluate the best way to implement this plugin.