E.g. they introduced the preferred names directly into the planet.osm.pbf. While this is said to be human-controlled and to have a feedback loop (daylightmap -> Esri), that loop would take far longer than simply correcting a name in OSM.
But look at the blocked admission of the MT/OMT vector tile layer into osm.org: one of the blocking complaints last year was OMT's preference for Wikidata names over OSM names (recent OMT versions have a config option to change that preference). When using the Daylight planet.osm.pbf you do not have that option, because those additions are baked in.
I am a mapper, and for me a “dev”/“qa” stream is all that is needed. What I am thinking about is what OSM should provide to the outside world, and I would like to see the discussion focus on that.
I do not think an outside organization should provide that curated/clean/release stream of diffs and planet files etc., because I see that as a prime deliverable of OSM. Furthermore, I think the best information to create such a stream is available inside OSM. It would be good to hear what others think of this.
For the implementation part, I do not know enough about what gets reverted over time, but the delay should be long enough that realistically all major vandalism has already been reverted, with the least amount of effort.
There’s some discussion of these preferred names in this talk:
From what I understand, the names don’t come from Wikidata per se but rather from the judgment of an internal review or localization team. The specific example they gave is that someone (yours truly) changed the Vietnamese name of San Francisco to the name that Overseas Vietnamese regard as proper but that younger people in Vietnam would never have heard of. (The wiki has more than you ever wanted to know about Vietnamese exonyms.)
Meta overrode the name to prefer the alt_name over the name, but they still expose both. Even though I stand by my tagging choice, I concede that they made the right call from a product perspective, as their user base in Vietnam dwarfs the overseas community.
No, the Daylight names do not come from Wikidata.
(The OpenMapTiles process, and e.g. also the openmaptiles-planetiler process, once preferred Wikidata names over OSM names by default, but that is now configurable.)
I was looking rather at this process, which may be the one they finally settled on for the Daylight localized names: Additional Translations in Daylight
And if you remember my question on the OSM US Slack language channel about Vietnamese names for city names in Arizona: here is the current Daylight answer for a Vietnamese (Latin-script) example:
And you will find many more examples among their ~39000 localized names that will trigger a “huh, wait, what is that?” moment. All of those are currently baked into the Daylight planet.osm.pbf, where the feedback process (-> daylightmap -> Esri -> Esri language experts -> Esri -> daylightmap) seems rather cumbersome, complicated, and long-running, especially compared with just changing an OSM tag (ok, yes, I understand that QA is cumbersome).
So in general I think the OSM community is actually better QA for this kind of problem than a small team at Daylight, which surely cannot cover all those languages (though in many other areas they do a great job at QA). For that reason, I’m not convinced this should be baked into the Daylight planet.osm.pbf (and it will get far worse with Overture).
Haha, wowwwww. For those who don’t speak the language: “Chi Chà là” refers to the plant genus Phoenix, whereas Phoenix is named after the legendary animal. “Núi mặt bàn” is a literal translation of “mesa”. Mesa is named for the geographical feature, but no native speaker would translate that literally. I didn’t realize from reading their announcement that they were just feeding individual names into a machine translator, devoid of context about location or etymology.
They aren’t the only ones taking this approach. Esri, another Overture member, has translated both Reading, Pennsylvania, and its English namesake as “I’m reading”:
At one point, the Daylight team was talking about setting up MapRoulette challenges to encourage local mappers to add names in languages they don’t know directly to OSM. Thankfully I haven’t seen that happen yet.
That is why I once called this “AI fantasizing”. It is not AI, but it still looks like a machine doing this (look at all the Mesa “translations”), even if it is outsourced to a third party that may use machine translation without human QA checks… perhaps because no expertise is available for some languages to do a QA check.
Edit (in regard to your Reading example): esri is involved in the daylightmap localized names project, so that is probably the same source…
And from their announcements I have the feeling that with Overture, some of the currently separate files (like ML buildings) will be integrated into the main dataset without an easy way to separate them, but we will see how that develops.
make it harder for third parties to use OSMF resources (in line with the original spirit of the tile usage policy)
When organizations like the US Postal Service are hotlinking tiles hosted by OSMF, I think there’s something to be done here. I’m sure the sysadmins have a list of the most common referrer domains hitting the tiles? Just sort by number of hits descending, go down the list, investigate whether each use is within the “original spirit of the tile usage policy”, and if not, bring down the ban hammer.
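The “sort referrers by hits” triage could be done with a few lines over the access logs. This is a minimal sketch assuming combined-log-format lines; the real CDN logs, field layout, and file names may well differ:

```python
# Sketch: tally tile requests by HTTP Referer domain from web server
# access logs (combined log format assumed; the actual log format used
# by the OSMF tile CDN is not known to me and may differ).
import re
from collections import Counter
from urllib.parse import urlparse

# In combined log format the Referer and User-Agent are the last two
# quoted fields on the line.
LINE_RE = re.compile(r'"(?P<referer>[^"]*)"\s+"[^"]*"\s*$')

def top_referrer_domains(lines, n=20):
    """Return the n most common Referer domains with their hit counts."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        referer = m.group("referer")
        if referer == "-":  # direct hit, no referrer sent
            continue
        domain = urlparse(referer).netloc
        if domain:
            counts[domain] += 1
    return counts.most_common(n)

# Illustrative log lines (hypothetical):
sample = [
    '1.2.3.4 - - [01/Jan/2024:00:00:00 +0000] "GET /12/1/2.png HTTP/1.1" 200 512 "https://example.com/map" "Mozilla/5.0"',
    '1.2.3.5 - - [01/Jan/2024:00:00:01 +0000] "GET /12/1/3.png HTTP/1.1" 200 512 "https://example.com/map" "Mozilla/5.0"',
    '1.2.3.6 - - [01/Jan/2024:00:00:02 +0000] "GET /12/1/4.png HTTP/1.1" 200 512 "-" "curl/8.0"',
]
print(top_referrer_domains(sample))  # [('example.com', 2)]
```

The judgement call of whether a given domain follows the policy of course still has to be made by a human going down that list.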
The standard tile layer is for mappers. Quoting from the policy:
We are in principle happy for our map tiles to be used by external users for creative and unexpected uses – in contrast to most web mapping providers, which insist that you use only their supplied API.
We will ask users not following the policy to switch or block them. We have had no cases where a user following the policy has caused any technical problems.
The OWG revised the tile usage policy a couple of months ago to make it clear what is and is not required, and what unacceptable use is.
What resources would it free up? According to the budget request the annual cost of the layer is about 3 800 EUR expense and 6 500 depreciation. I’d have to check the asset register to be sure, but I think next year’s cost is going to be 4 800 EUR total. A policy of limiting to OSM.org would be easy, but limiting to OSM.org and mapping editing software would require development.
Several people have proposed implementations that would stop some edits from going to tiles. I’m not going to go through why they wouldn’t work one-by-one, but I haven’t seen any that are technically practical. We have a certain architecture forced on us by using a raster CartoCSS/Mapnik style. To consider an idea I would need to know which components need modifying, and any changes to how they communicate with each other. The components are
osm2pgsql-replication
osm2pgsql
tile expiry scripts
renderd
mod_tile/apache
CDN configuration
There are parts of mod_tile that are over 15 years old and haven’t really been touched since then. One of those caused the caching problems we had recently.
The state of the raster tile architecture is why the OSMF is funding work on vector tiles. It won’t magically fix all the problems, but by being designed around updates from the start, it will fix some. I’m doing additional anti-vandalism changes based on what I observed over the past month but am not making details available to the public right now.
Ultimately, the vandalism problem needs to be solved prior to the diffs going to consumers. It will require an approach with multiple defenses. Figuring out what to do where requires thought and knowledge of the OSM ecosystem.
The human resources of the DWG? I’ve personally dealt with about 80 tickets as a result of the current bout of vandalism, and that’s just me.
That said, I suspect that “limiting tile.osm.org to mapping (and) editing software” is a step far beyond what most OSMers would want to do - someone who hosts a local website of their town should be encouraged to use OSM tiles there. It’s the large commercial companies using OSM tiles “because they’re free” (but who are otherwise following the tile usage policy) that I have more of an issue with.
Please don’t do this. I mainly use OSM as a base map outside of an editor. E.g. I have it as the base map on geocaching.com (via a browser extension). And that’s how I started contributing: I saw issues that I wanted to fix.
But on the other hand, I think having “stable” tiles for non-mappers is a good idea. OSM will be less attractive to vandals.
At the risk of saying something very wrong here, would it help and/or be practical to have minutely updated tiles for the community’s internal usage (in the editors etc) and tiles with fewer updates (hourly, daily or even weekly) for external usage (website embeds etc)?
Simply updating the database less often doesn’t help. E.g. if you update only every Monday, vandals would simply make their edits at 23:59 on Sunday, so their changes would certainly be in the weekly map and stay there for a week. You need people judging whether Monday’s weekly diff looks good, and once they finish on Tuesday, it can be applied to the render database.
It would require a second render database, plus rendering and delivery of tiles.
Perhaps it would be an option to create a database snapshot at time x, wait a week and only publish snapshot x if no data damage is discovered during this time.
Yes, that’s what I wrote. Though of course it’s hard to judge this in general. For example, for rendering OSM Carto, nobody would notice if all the hiking-route relations got messed up. But other maps might not want to apply such a diff.
That needs quite a bit of storage, but it is not that complicated. Some kind of automated feed (from the DWG?) about the sanitary state of the data would ideally complete the setup.
I agree with this. To add some quick comments (my personal opinion; unsure about others):
From a mapper’s perspective (someone who already knows OpenStreetMap): anywhere you see a base map that you know is from OpenStreetMap (it’s obvious with OSM Carto, but maybe not with other styles) and find something missing, this makes even dormant mappers go edit OSM again.
From a “new” mapper’s perspective: quite often they start editing OpenStreetMap because of lacking (or wrong/outdated) data.
For “mappers” plus a “local website” showing a small region (small enough to perceive problems, so more like a city): if something goes bad (or if there is a sudden increase in data, often from a newer mapper), they actively go check after some delay.