Should OSMF offer free raster tiles for end users (not for OSM map QA)?

I would absolutely describe “continuing to serve tiles containing known slurs against an individual many days after that has been removed from the OSM data” as “an operational problem”.

5 Likes

(apologies for slightly offtopic diversion but)

The various feeds are public, so you’re more than welcome to give that a go as some sort of “proof of concept”. An example diff is here (that just happened to be the latest one as I wrote this). You’re welcome to try and define a bunch of things that you can know from a diff and train even a simple Bayesian model on that as an input, and also look at “potentially negative community interactions” (blocks, unanswered comments, future reverts) as result data.
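
A minimal sketch of what such a proof of concept might look like, assuming each changeset from a diff has already been reduced to a set of yes/no features and a label saying whether it later drew a negative reaction (for example, a revert). The feature names (`new_account`, `name_tag_changed`, `many_deletions`) and the six toy samples are invented purely for illustration and are not real OSM data; the model is a plain Bernoulli naive Bayes with Laplace smoothing.

```python
import math
from collections import defaultdict


def train(samples):
    """Count classes and features. samples: list of (set_of_feature_names, label_bool)."""
    class_counts = defaultdict(int)
    feature_counts = {True: defaultdict(int), False: defaultdict(int)}
    vocabulary = set()
    for features, label in samples:
        class_counts[label] += 1
        for name in features:
            feature_counts[label][name] += 1
            vocabulary.add(name)
    return class_counts, feature_counts, vocabulary


def log_odds_bad(model, features):
    """Return log P(bad | features) - log P(ok | features); positive means 'looks bad'."""
    class_counts, feature_counts, vocabulary = model
    total = sum(class_counts.values())
    score = {}
    for label in (True, False):
        # Laplace-smoothed class prior.
        logp = math.log((class_counts[label] + 1) / (total + 2))
        for name in vocabulary:
            # Laplace-smoothed probability that this feature is present in this class.
            p = (feature_counts[label][name] + 1) / (class_counts[label] + 2)
            logp += math.log(p if name in features else 1 - p)
        score[label] = logp
    return score[True] - score[False]


# Invented toy samples (hypothetical features, hypothetical labels).
TOY_SAMPLES = [
    ({"new_account", "name_tag_changed"}, True),
    ({"new_account", "name_tag_changed", "many_deletions"}, True),
    ({"name_tag_changed"}, False),
    ({"many_deletions"}, False),
    (set(), False),
    ({"new_account"}, False),
]

model = train(TOY_SAMPLES)
print(log_odds_bad(model, {"new_account", "name_tag_changed"}))
print(log_odds_bad(model, set()))
```

The hard part is not the model but deciding which features can actually be derived from a diff and collecting honest result data; the sketch only shows how the two would plug together.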

3 Likes

This is what a lot of downstream processors of OSM data diffs do, and that is why you don’t see the vandalism in third-party maps or applications built on OSM data by the processors that do this.

But you cannot do this for the Standard layer published via tile.openstreetmap.org, as this one is built for mapper feedback - tile.osm.org is a tool for fast feedback for the OSM mapping community and is doing a tremendous job at this.

The problem is that the same layer is handed out massively as well to third parties to display maps at third party sites or apps. And for this it is the wrong tool and through this wrong use of the tool it is a massive amplifier for any vandalism actors.

1 Like

And this is - besides the fact that we as a community should do anything we can to support community members that are targeted by those slurs - even bigger than one might think (esp. with plans of moving the OSMF into the EU, though the legal setting is probably similar in the UK). It would also imply legal obligations for a publisher of user-generated content (and practically, OSM is exactly this) to immediately stop the publication of harmful content (hate speech = the vandalism slurs), as such content may in its scope fall under criminal law in regard to defamation and false criminal accusations, and as such must not be published anymore by a UGC publisher as soon as the publisher becomes aware of it. (For further reading: Digital Services Act - Wikipedia - OSM might be happy atm to not have received a vlops letter yet - as did Wikipedia - but with the growth at tile.osm.org usage this might happen in the future.)

Which also implies, as @SimonPoole rightly already stated elsewhere, that the kinds of vandalism acts we have seen in April and May should not only be reverted but redacted as well.

It might also be worthwhile to think about stopping tile delivery via tile.osm.org to third parties for as long as those kinds of slurs are in the CDN caches, and on such occasions e.g. only serve the tiles to those working on reverting and redacting the data, or only at www.osm.org or JOSM etc. with a warning attached. And then resume the delivery of tiles via tile.osm.org to third-party websites/apps etc. when the rendering backends are not overwhelmed anymore and all CDN caches are “clean” again (this would also be in line with the tile usage policy that practically says: “no guarantee on delivery” and “delivery may stop anytime”).

Isn’t OpenStreetMap primarily about the data? There are numerous styles available within the ecosystem. Why not showcase those instead and limit tile.osm.org to mapping editing software as “an important feedback mechanism”? While this won’t eliminate vandalism, it would free up substantial OSMF resources to address more critical issues.

2 Likes

Mappers also receive feedback by looking at the map outside of an editor. I can’t be the only one who habitually force-reloads the main map after saving my changeset, to get that instant gratification. On the other hand, I’ve rarely if ever needed to load the rendered tiles in an editor – that’s for aerial imagery and other external sources.

8 Likes

OpenStreetMap may well be about the data as far as many here are concerned. However, has anyone asked those who ‘just’ use it as a map, or unknowingly as a route planner? I do wonder if, for many users, the OpenStreetMap raster map IS OpenStreetMap. Has there been much market research?

4 Likes

From the other topic, but I think it better belongs here.

I do not think that is another idea; in both cases I see a need for a curated/clean/release stream of diffs and planet files etc. that third parties can use to draw tiles. Or does somebody think that this is not OSM’s responsibility and that providing such a curated/clean/release stream of diffs and planet files should also be done by third parties?

I also see a need for an unfiltered “dev”/“qa” stream for mappers to see what they are doing and to detect vandalism that ideally is not in the curated/clean/release stream.

Fine to me if OSM does not provide tiles for that curated/clean/release stream but something will be needed so they do not use the unfiltered “dev”/“qa” stream instead as they are doing now.

Instead of two different streams, wouldn’t it be sufficient to have an API endpoint that says: diff 990 is known to (= has been reported by DWG to) contain a bad case of vandalism, this wasn’t reverted until diff 998, so don’t render tiles based on a database that hasn’t had diff 998 applied yet?

Of course this only helps data consumers that delay replication.
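
A minimal sketch of how a data consumer might use such an endpoint, assuming it returns a list of `(first_bad_seq, fixed_seq)` pairs of replication sequence numbers. The response shape and both function names are hypothetical; no such endpoint exists as far as this thread says.

```python
def safe_to_render(applied_seq, flagged_ranges):
    """True if a database at replication sequence applied_seq is outside every flagged range.

    flagged_ranges: list of (first_bad_seq, fixed_seq) pairs (hypothetical format).
    A database is unsafe if it has first_bad_seq applied but not yet fixed_seq.
    """
    return not any(bad <= applied_seq < fixed for bad, fixed in flagged_ranges)


def next_safe_sequence(applied_seq, flagged_ranges):
    """Return the lowest sequence >= applied_seq that is outside all flagged ranges."""
    seq = applied_seq
    moved = True
    while moved:
        moved = False
        for bad, fixed in flagged_ranges:
            if bad <= seq < fixed:
                # Jump to the diff that contains the revert, then re-check all ranges.
                seq = fixed
                moved = True
    return seq


flagged = [(990, 998)]
print(safe_to_render(989, flagged))
print(safe_to_render(993, flagged))
print(next_safe_sequence(993, flagged))
```

A consumer that delays replication could hold rendering at the last safe sequence, or skip ahead to `next_safe_sequence(...)` once that diff is available.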

2 Likes

I think the standard tile layer is a huge part of what makes OSM successful; it is ‘free’ advertising on a lot of sites.
I do think it is useful to have separate tile servers for mappers (any application where you can log in using OSM) and background map users.

For the separate layer:

  • If there is a DWG edit on this tile between 30 and 60 minutes ago, use all changesets up to the most recent DWG edit on this tile (not necessarily in that window)
  • Otherwise all changesets up to an hour ago.

If the DWG fixes all vandalism within half an hour, it should not appear in the render-on-request tiles.
This is because the DWG edit will shift into the 30-60 minute window before the vandalism edit goes outside the 60 minute window. Therefore either both the DWG changeset and the vandalism changeset are applied, or neither is.
The 30 minute window is there to give DWG the option to revert with multiple changesets.
Otherwise we might get vandalism - partial revert = partial vandalism.

I don’t know how technically feasible this is, as some changesets need to be applied based on which tile is rendered.
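
The cutoff rule from the two bullets above could be sketched roughly as follows, assuming the renderer knows the timestamps of DWG edits touching a given tile (all in the past). The function name and the input shape are hypothetical, and the sketch says nothing about how to apply changesets per tile, which is the open feasibility question.

```python
from datetime import datetime, timedelta


def render_cutoff(now, dwg_edit_times):
    """Return the timestamp up to which changesets are applied when rendering one tile.

    dwg_edit_times: timestamps of DWG edits on this tile (hypothetical input).
    """
    window_start = now - timedelta(minutes=60)
    window_end = now - timedelta(minutes=30)
    dwg_in_window = any(window_start <= t <= window_end for t in dwg_edit_times)
    if dwg_in_window:
        # Use everything up to the most recent DWG edit, even one newer than 30 minutes.
        return max(dwg_edit_times)
    # Otherwise apply all changesets up to an hour ago.
    return window_start


now = datetime(2024, 1, 1, 12, 0)
vandalism = datetime(2024, 1, 1, 11, 5)
revert = datetime(2024, 1, 1, 11, 25)  # DWG fixes it within half an hour

cutoff = render_cutoff(now, [revert])
print(cutoff, vandalism <= cutoff, revert <= cutoff)
```

With these example times, ten minutes earlier (11:50) the revert is not yet in the window, so the cutoff is 10:50 and neither edit is rendered; at 12:00 both are.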

How technically plausible this is** depends on the granularity you want. If you want monthly and don’t want OSM buildings, then Facebook has already got you covered (at least for now). Anyone with half an ear to the forums and the global block list could probably pick a time weekly, or even daily, and say “OSM data is mostly clear at this time”. What is much harder to do is to provide a feed without any vandalism in it.

** and I’m writing this without saying whether this is a good idea or not.

1 Like

Please be careful as the planet.osm.pbf published by Daylight is not pure QA-controlled OSM data anymore. It once was but it got additions that will have the data differ in comparison to pure OSM data (and yes, I’m talking about the planet.osm.pbf and not the additions like ml-buildings, fb-roads, etc.). And depending on what you want you might have to do a QA again on the daylight planet.osm.pbf in regard to their additions.

A diary entry explaining what the actual differences are now would be really useful. I suspect a number that said “X% of Overture data is OSM” would also be informative.

When I last looked at it in detail (some months ago now) I was surprised at the lack of difference (beyond suppressing certain types of data altogether).

2 Likes

E.g. they introduced the preferred names directly into the planet.osm.pbf - while this is said to be human-controlled and has a feedback loop via daylightmap->esri. This loop would take way longer than e.g. correcting a name in osm.

But if you look into the blocked admission of the mt/omt vector tile layer into osm.org, where one of the blocking complaints last year was the preference of wikidata names over osm names in omt (omt has since evolved to have a config to change that preference in recent versions), you do not have that option while using the daylightmap planet.osm.pbf, as those additions are baked in.

1 Like

I am a mapper and for me a “dev”/“qa” stream is all that is needed. What I am thinking about is what OSM should provide to the outside world, and I would like to see the discussion focussing on that.

I do not think an outside organization should provide that curated/clean/release stream of diffs and planet files etc., because I see that as a prime deliverable of OSM. Furthermore, I think that inside OSM there is the best information available to create such a stream. It would be good to hear what others think of this.

For the implementation part, I do not know enough about what is reverted over time, but the delay should be long enough that, realistically, all major vandalism has been reverted, and it should be achievable with the least amount of effort.

1 Like

There’s some discussion of these preferred names in this talk:

From what I understand, the names don’t come from Wikidata per se but rather from the judgment of an internal review or localization team. The specific example they gave is that someone (yours truly) changed the Vietnamese name of San Francisco to the name that Overseas Vietnamese regard as proper but that younger people in Vietnam would never have heard of. (The wiki has more than you ever wanted to know about Vietnamese exonyms.)

Meta overrode the name to prefer the alt_name over the name, but they still expose both. Even though I stand by my tagging choice, I concede that they made the right call from a product perspective, as their user base in Vietnam dwarfs the overseas community.

1 Like

No, the Daylightmap names do not come from wikidata.

(The openmaptiles and e.g. also the openmaptiles-planetiler process once had wikidata preferred over osm as default but that is now configurable.)

I was rather looking into this process here that may have been the process they finally decided on for daylightmap localized names: Additional Translations in Daylight

And if you remember my question about city names in Arizona on the osm us slack language channel about Vietnamese: here is the current daylightmap answer for a Vietnamese (latin) example:

vietnamese-latin-example

And you will find many more examples in their ~39000 localized names that will result in a “huh, wait, what is that?” moment … all those are currently baked into the daylightmap planet.osm.pbf, where the feedback process (->daylightmap->esri->esri language experts->esri->daylightmap) seems rather cumbersome, complicated and long-running, esp. if you compare the process with just changing an osm tag :wink: (ok yes, I understand that QA is cumbersome).

So in general I think that for these kinds of problems the osm community is actually a better QA than a small team at daylight that for sure cannot handle all those languages (but in many other areas does a great job in QA). For that reason, I’m not convinced that this should be baked into the daylightmap planet.osm.pbf (and it will get way worse with overture).

4 Likes

Haha, wowwwww. For those who don’t speak the language: “Chi Chà là” refers to the plant genus Phoenix, whereas Phoenix is named after the legendary animal. “Núi mặt bàn” is a literal translation of “mesa”. Mesa is named for the geographical feature, but no native speaker would translate that literally. I didn’t realize from reading their announcement that they were just feeding individual names into a machine translator, devoid of context about location or etymology.

They aren’t the only ones taking this approach. Esri, another Overture member, has translated both Reading, Pennsylvania, and its English namesake as “I’m reading”:

At one point, the Daylight team was talking about setting up MapRoulette challenges to encourage local mappers to add names in languages they don’t know directly to OSM. Thankfully I haven’t seen that happen yet.

11 Likes

That is why I once called this “AI fantasizing”. It is not, but it still looks like a machine doing this (look at all the Mesa “translations”) - even if it is outsourced to a third party that may use machine translation not QA checked by humans… as maybe no expertise for some languages is available to do a QA check.

Edit (in regard to your Reading example): esri is involved in the daylightmap localized names project, so that is probably the same source…

And from their announcements I do have the feeling that with Overture, some of the currently separate files (like ml buildings) will be integrated in the main dataset without an easy chance of separating those but we will see how that develops.

1 Like

  1. make it harder for third parties to use OSMF resources (in line with the original spirit of the tile usage policy)

When organizations like the US Postal Service are hotlinking tiles hosted by OSMF, I think there’s something to be done here. I’m sure the sysadmins have a list of the most common referrer domains to the tiles? Just sort by number of hits descending and go down the list, investigate if it’s within “original spirit of tile usage policy”, and if not smack down the ban hammer.

Or are we already doing something similar?
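
The "sort by number of hits descending" step could be sketched like this, assuming the referrer URLs have already been extracted from the tile server logs into a plain list. The function name is hypothetical, the URLs below are placeholders, and nothing here reflects how the OSMF sysadmins actually process their logs.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_referrers(referrer_urls, limit=10):
    """Count hits per referrer domain and return the most common ones first."""
    domains = Counter()
    for url in referrer_urls:
        host = urlsplit(url).hostname  # None for empty or malformed referrers
        if host:
            domains[host] += 1
    return domains.most_common(limit)


# Placeholder referrers, not real log data.
sample = [
    "https://example.org/store-locator",
    "https://example.org/contact",
    "https://example.com/",
    "",
]
for domain, hits in top_referrers(sample):
    print(domain, hits)
```

The resulting list only gives the order in which to investigate; whether a given site is within the spirit of the tile usage policy remains a human judgment.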

1 Like