Migrating content from old forums

OK, I think I’ve found the issue. When BBCode lists are converted to HTML <ol><li>Item 1</li>...</ol> syntax, markdown inside the list items is not parsed. If the Markdown 1. Item 1 style is used instead, everything renders fine.
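To illustrate the difference, here is a minimal sketch of emitting Markdown list markers instead of HTML list tags. It is not the importer's actual code: the function name `bbcode_list_to_markdown` is hypothetical, and it assumes flat (non-nested) lists written as [list] or [list=1] with [*] item markers.

```python
import re

# Assumption: only flat [list] / [list=1] blocks; nested lists are not handled.
LIST_RE = re.compile(r"\[list(=1)?\](.*?)\[/list\]", re.DOTALL | re.IGNORECASE)


def bbcode_list_to_markdown(text: str) -> str:
    """Convert flat BBCode lists to Markdown list syntax (hypothetical helper)."""

    def convert(match: re.Match) -> str:
        ordered = match.group(1) is not None
        # Items are introduced by [*]; drop anything before the first marker.
        items = [item.strip() for item in match.group(2).split("[*]")[1:]]
        lines = []
        for number, item in enumerate(items, start=1):
            marker = f"{number}." if ordered else "-"
            lines.append(f"{marker} {item}")
        return "\n".join(lines)

    return LIST_RE.sub(convert, text)


print(bbcode_list_to_markdown("[list=1][*]**Item 1**[*]Item 2[/list]"))
```

Because the items stay plain Markdown lines, inline markup such as **Item 1** inside them is still parsed.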

This new code is likely faulty.


Note we are testing topic migrations on the import test site. We have migrated topics in the imported Sweden, Brazil, Germany and Russian categories to the existing respective community sections.

So far all good. We are using the method described here (CLI): Bulk move many topics from one category to another - #34 by tshenry - support - Discourse Meta.


Wait, there is more :slight_smile: Mappers who looked into the imported forum have reported multiple issues:

  1. old → new: an entire block was made bold, likely because of the "---" sequence in the text. Another case: old, new. “Equals” signs do that as well: old, new.
  2. Same post, look for the “брр как всё запутано” phrase: it is gray in the original, but the color has been lost.
  3. old → new: asterisks in tags are mistaken for markup.
  4. old → new: http links to the old forum were not updated. Same for https.
  5. old → new: multi-level quoting is broken: text was stashed into the last quote.

It looks like 1 and 3 come from applying markdown markup to old posts, which obviously should not be done.


Erm - at what point do we think the migration process is “good enough” for what by definition are old messages? :slight_smile:


I agree that we don’t have to make the import ideal. But I think at least issues 3 and 5 are very important: asterisks are common in tagging discussions, and multi-level quotations are important. 4 was announced as finished by Harry (?) and I’m puzzled at why that didn’t work.


I think “good enough” means the migrated message keeps reasonable formatting and remains just as understandable as the original.

  1. Weird parsing error, not caused by the importer. I will look into a workaround, but it is unlikely to be fixed.
  2. Colour-based markup, [color=gray] in the BBCode source, is not supported by Discourse. Unlikely to be fixed.
  3. Difficult. The importer input is [building=*] and its output is [building=*], but Discourse is swallowing the *. Likely not “good enough”, but I am unsure how I’d approach this.
  4. This is already handled by the permalink redirect code. Take the old URL + parameters and use them on the test forum URL, e.g.: https://forum-import-test.openstreetmap.org/viewtopic.php?pid=145548#p145548
  5. Ouch. Not good enough: the message is changed and its meaning could be changed too. It is due to an overeager regex and is likely very difficult to fix. Here is a pseudocode version: gist:88c45733449078f8fa061838f0eb899a · GitHub
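For item 5, one alternative to a single greedy regex is to tokenize the quote tags and track nesting depth. The sketch below is only an illustration of that idea, not the importer's code or the linked gist. The function name `quotes_to_markdown` is hypothetical, the attribution in [quote=Name] is simply dropped, and adjacent sibling quotes at the same depth end up in one blockquote.

```python
import re

# One capturing group, so re.split() alternates: text, tag, text, tag, ...
QUOTE_TAG_RE = re.compile(r"(\[quote(?:=[^\]]*)?\]|\[/quote\])", re.IGNORECASE)


def quotes_to_markdown(text: str) -> str:
    """Turn nested [quote] blocks into Markdown blockquotes (hypothetical helper)."""
    depth = 0
    blocks = []  # (nesting depth, text chunk)
    for index, token in enumerate(QUOTE_TAG_RE.split(text)):
        if index % 2 == 1:  # odd positions are the quote tags themselves
            if token.lower() == "[/quote]":
                depth = max(depth - 1, 0)
            else:
                depth += 1
        elif token.strip():
            blocks.append((depth, token.strip()))

    lines = []
    previous_depth = None
    for block_depth, chunk in blocks:
        if previous_depth is not None:
            # A separator line at the shallower depth stops Markdown's lazy
            # continuation from gluing the next chunk onto the deeper quote.
            lines.append(">" * min(previous_depth, block_depth))
        prefix = ">" * block_depth + " " if block_depth else ""
        lines.extend(prefix + line for line in chunk.splitlines())
        previous_depth = block_depth
    return "\n".join(lines)


print(quotes_to_markdown("[quote=A][quote=B]inner[/quote]middle[/quote]reply"))
```

Each chunk keeps the depth at which it appeared, so text between an inner and an outer closing tag stays in the outer quote rather than being pushed into the last one.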

You can use escaping (backslash) for symbols used in markdown:

“[building=] [building=]” vs “[building=*] [building=*]”
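As a minimal sketch of the escaping idea, the snippet below backslash-escapes a few Markdown special characters. The helper name `escape_markdown` and the choice of characters are assumptions, and it should only be applied to text already known to be literal, since it cannot tell a literal * from other uses such as the BBCode [*] list marker.

```python
import re

# Assumption: these four characters are the ones worth escaping here.
MARKDOWN_SPECIALS = re.compile(r"([\\*_`])")


def escape_markdown(text: str) -> str:
    """Prefix Markdown special characters with a backslash (hypothetical helper)."""
    return MARKDOWN_SPECIALS.sub(r"\\\1", text)


print(escape_markdown("[building=*] [building=*]"))
```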


OK, now do that as a regex that doesn’t conflict with any other bbcode :stuck_out_tongue:

regex

“Now you have two problems…” :thinking:


I have just reset the test import site again. I am now running a new test import with a fix for the above.

The test import run should finish in around 20 hours.

  1. Now Fixed https://forum-import-test.openstreetmap.org/t/josm/42662/1179

Reset again… Another test import now running with feature complete permalink (old forum links) code.


Topic Permalinks (old forum links) code confirmed working.

Reset again… Likely final changes: fixes for list items (list item inner formatting, and a fix for some <ul> incorrectly being converted to <ol>).

Import should complete in around 12 hours.


Fixed: https://forum-import-test.openstreetmap.org/t/topic/48223

Thanks to @Harry_Wood


I think we’re nearly there. The final(?) test import run is done: https://forum-import-test.openstreetmap.org/

I have also enabled logins (use the “Login with OSM” button on the login page), so you can log in and check your imported message history.

The FluxBB → Markdown (forum.osm.org) content conversion is looking very good now.

Thank you to @TomH and @Harry_Wood for help fixing importer issues.

I now need to work out the steps for the real import. The import will likely mean the community site will be offline / read-only for a bit. I will properly schedule and announce the outage with at least a few days’ notice.


Thank you Grant and Harry and Tom for fixing the prominent issues! I’ve posted the update to the Russian forum.

On a side note, do we keep the imported threads separate, or merge them with the already existing regional forums?


After the import, the suitable categories (e.g. regional forums) will be merged into existing community.osm.org categories.


I have tried, but it doesn’t look like anything is recoverable. The old forum used to be hosted by an individual in the OpenStreetMap NL community. Many years ago that individual walked away from OSM without handing anything over. As part of the OSM.org operations team, I was later able to get a dump of the database by brute force and re-host the site, but the unicode data loss likely happened then (bad dump?) or may have happened before then.


@Firefishy @TomH @Harry_Wood and everybody else who may be involved: Thank you VERY much for the immense amount of work!

The login works smoothly, and I think the posting history is preserved well. All the posts I checked looked very fine.

Will links to the old forum be changed to point to the new url, or will the old link be preserved, and then redirected to the new address?
We will need a redirect anyway for links from “outside”, but having the links inside the new community going directly to the new address will probably be a smoother user experience.


All old forum links will redirect to the imported content.

The redirect works as follows:

  1. https://forum.openstreetmap.org/(.*) → https://community.openstreetmap.org/$1 (note the regex .* and $1)
  2. community.openstreetmap.org will redirect to the correct content (importer populates the discourse permalink data)
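The first rule above can be sketched in Python as follows. This only illustrates the capture-and-substitute pattern and is not the actual web server configuration. The function name `rewrite_old_forum_url` is hypothetical, and Python refers to the captured group as group(1) where the rule above writes $1.

```python
import re

# Capture everything after the old host, as in the rule's (.*).
OLD_FORUM_RE = re.compile(r"^https://forum\.openstreetmap\.org/(.*)$")


def rewrite_old_forum_url(url: str, new_base: str = "https://community.openstreetmap.org/") -> str:
    """Re-root an old forum URL onto the new host; other URLs pass through unchanged."""
    return OLD_FORUM_RE.sub(lambda match: new_base + match.group(1), url)


print(rewrite_old_forum_url("https://forum.openstreetmap.org/viewtopic.php?pid=871339"))
```

The path and query string are carried over untouched, which is what lets the permalink data on the new site resolve them to the imported content.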

You can test it yourself. Find any URL on the old forum, e.g.:

  1. https://forum.openstreetmap.org/viewtopic.php?pid=871339
  2. Take the url path + parameter /viewtopic.php?pid=871339
  3. Prefix it with https://forum-import-test.openstreetmap.org eg: https://forum-import-test.openstreetmap.org/viewtopic.php?pid=871339
  4. That URL will redirect to the imported content.
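The manual steps above can also be scripted. This is a small sketch using only the Python standard library; the helper name `to_test_site_url` is made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

TEST_HOST = "forum-import-test.openstreetmap.org"


def to_test_site_url(old_url: str) -> str:
    """Keep the path, query and fragment of an old forum URL, but swap in the test host."""
    parts = urlsplit(old_url)
    return urlunsplit(("https", TEST_HOST, parts.path, parts.query, parts.fragment))


print(to_test_site_url("https://forum.openstreetmap.org/viewtopic.php?pid=871339"))
```

Note that a fragment such as #p145548 is never sent to the server by browsers, so the redirect itself can only depend on the path and query string.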