Should OSMF offer free raster tiles for end users (not for OSM map QA)?

This question is obviously prompted by the discussion here. Despite the tile usage policy, and despite the OSM Carto layer being designed primarily as “an important feedback mechanism for mappers”, many third parties use that tile layer too. The DWG** gets complaints about any problems a day or so after the direct ones from osm.org users, as they have to go through a third-party helpdesk before coming to us.

There are a few possible responses to the problem, which include things such as:

  1. make it harder for vandals to vandalise
  2. make it harder for “obvious vandalism” to pollute the CDN
  3. make it harder for third parties to use OSMF resources (in line with the original spirit of the tile usage policy)
  4. deploy a separate non-QA layer that third parties can use, which isn’t updated as frequently as OSM Carto currently is.

We can’t deliver (1) with forum posts; so we’re waiting for actual code there before proceeding. There’s some progress on (2) behind the scenes. I have some sympathy with (3), in that I see lots of third-party companies using OSMF infrastructure “because it’s free” and unfortunately I think that’ll continue while it is (or at least, appears to be - in practice there is no such thing as a free lunch).
This leaves (4) - do people think that that is a good, or even a practical, idea? I’m not convinced, but calls for “map tiles I can use” (in whatever project I’m working on) have been made in the past.

** I’m a member of the DWG, but I’m writing this in an entirely personal capacity.

12 Likes

I am not opposed to 4 tbh. In practice, the OSMF tiles are widely used far beyond their initial purpose: in the Greek section of the forum we endeavoured to document some use cases of OSM data in Greek local contexts, and OSMF tiles dominate by a large margin.

What this would entail in monetary costs is a valid argument, but it once again pretends that this use is not already happening under the current policy anyway. It’s not like this new layer will be easier to implement and therefore see its use skyrocket. If anything, having third parties use a less frequently updated tileset would put less strain on OSMF resources, or am I mistaken?

First of all the tile usage policy would need a big red banner explaining that the tiles provided by the OSMF through tile.openstreetmap.org are prone to vandalism attacks and may contain graphic language of the worst kind imaginable or also other forms of vandalism that render them totally unusable for any business or institutional use (think of the destroyed Tel Aviv last winter). The same applies for other free “offerings” by the OSMF, e.g. Nominatim geocoding and the like.

The problem right now is also that the tiles by the OSMF are the easiest way imaginable for any bad actor to get a message out to millions of people (through third parties that use the tiles from tile.openstreetmap.org) with no effort whatsoever (much easier than e.g. hacking the Truth Social account of an orange haired guy, but still reaching the same size of audience on a first level). The recent vandalism (and the one from yesterday again with Retardville etc.) clearly shows that the “viewability” of OWG produced tiles should better be restricted to osm.org and some whitelisted third-party tools for mapping (JOSM etc.), where mappers use them to see what the current state of the data is.

Sometimes the handling of this still has the feeling of 2009, when OSM was small and had nearly no publicity. But in the post-2018 world OSM data is the biggest source of geographic data on our planet (viewed mostly through third-party applications or providers like Facebook, Mapbox, Esri, etc.), so instead of spending resources on providing free infrastructure with the tile offering, OSMF should rather focus on the core again - providing and protecting OSM data and supporting mappers.

The free tile offering - even if sponsored by Fastly - is actually resulting in at least three problems:

a) a heavier load on rendering servers, as tile access has been growing like crazy since the CDN moved from donated (and overloaded) servers to Fastly. You can see this problem in the OWG’s spending on new tile render servers over the last couple of years.

b) a huge problem if Fastly pulled out. Currently Fastly is sponsoring OSMF with their CDN service at a list price in the range of 120k to 200k USD per month. This sponsoring is far bigger than any overall budget the OSMF has. As the old donated CDN servers got decommissioned, OSMF would currently not even be able to serve tiles to www.osm.org and QA or mapping tools if Fastly pulled out.

c) as @SomeoneElse stated, the heavy usage of tiles from OSMF also results in a large number of support requests to a) the OWG and b) the DWG, both of which rely almost entirely on volunteers. With the new state of the world since 2022 and the resulting large-scale vandalism attacks against (or into) OSM data, the DWG (and to some extent the OWG) gets more and more complaints from third parties that should not end up in their hands, which overstretches any volunteer’s patience. Remember, practically everyone here is a volunteer (except the SRE), and if the DWG, especially @SomeoneElse and @woodpeck and all the others, together with some more community members, had not fought endlessly against large-scale vandalism attacks in 2023, OSM would not exist anymore today. (Especially with the attack in Israel last Nov/Dec, which needed weeks of hard work to untangle.)

So yes, No. 3) of @SomeoneElse’s recommendations is really worth considering. Currently, OSMF acts as if Wikipedia were to say: “hey, yes, you can download our data (this is what Wikipedia and OSM do) - and where should we send the free 80+ servers you would need to host the Wikipedia data?” (this is what OSMF currently does).

4 Likes

I really like the idea of splitting off a separate non-QA layer for public use. I think it has a ton of potential benefits for the project and organization.

  • It reduces some of the conflict in getting items rendered on carto because it’s not trying to dual purpose as a public tile layer and a mapping feedback layer
  • It could help significantly with vandalism, as you noted, especially if the current tiles eventually made their way to being fully mapper feedback and could be left only for logged in users or something - though maybe that would never be possible due to the wide deployment.
  • A new public layer could be deployed with an API key requirement or something that could enable better outreach to users and, whether formal or informal, some kind of payment to help with upkeep from very large users. Some might disagree with whether payment should ever be involved, but I think an API key would still be an important feature just to ensure people provide a point of contact for use of the tiles.

But I also know the hard part of all of this is - who will build and maintain it. Last time a similar discussion came up, when asked for people who wanted to build and maintain a tile layer, nobody stepped up. Still, I wanted to express my support for your idea.

13 Likes

If the vandals figure out the timing of the next update, they can time their actions to cause damage just before the update occurs, resulting in the damage being visible for a longer period. So, I’m not really sure that the effort and cost of maintaining a separate non-QA layer will actually solve the problem.

5 Likes

Would it be technically possible to prevent a changeset from being applied to the database if an algorithm detects it as suspicious? This could also apply to changesets where mappers have requested a review.

Sure, good changesets reviewed late might lead to merge/conflict data issues, but if the algorithm is effective, it would significantly reduce the efforts needed to revert problematic changes that become publicly visible.

2 Likes

This (4) solution could even use something like the (soon to be defunct) Meta Daylight. In other words, data that is not released instantaneously (delayed by around 1 month) and maybe with some QA applied. Nothing new, nothing invented. I just see many pros:

1 - osm.org Carto can still be used to help mappers only (as the initial goal)
2 - we could still provide tiles to small usage, showcasing OSM to the world
3 - in this new tileset, we could even (can you imagine!) put an OSM logo in the lower corner, so no one could hide OSM attribution (it’s so easy to do that in Leaflet!)
4 - we can do that in a way to not break anything in the millions of Leaflets deployed around the world.
5 - using Daylight (or now Overture, heh!) to create these new tiles, there’s no need to invent anything new. No new efforts in building and maintaining.
6 - long shot: we could even create a new tileset, simple, clean, “beautiful”, generic, in the future, for the public (not osm.org).

1 Like

I think the goal should be to provide a clean/release stream of diffs and planet files etc.

This stream should be used by people and organizations who are not mappers and who render their own tiles. It makes sense to also provide a tile layer for this, and I think this is what you should see when you visit www.openstreetmap.org

The current stream of diffs and planet files should be renamed to a “dev”/“qa” stream.

3 Likes

It would be cool to have a good guide on generating / self-hosting raster tiles. Option 3 sounds good, assuming the transition is smooth and well documented. That also applies to Overpass

What would be needed beyond what’s documented in the switch2osm guides for what you’re looking for? That also includes the ability to apply updates at a time of your choosing - allowing you to do a “periodic release” at your convenience.

This guide looks good, thanks for the link! A couple of notes:

  • It mentions mod_tile, Mapnik and osm2pgsql without really explaining their purpose and interplay. I’m not familiar with any of those components, but I assume it shouldn’t be hard to learn how to maintain them. Am I right to assume that Apache and PostgreSQL are hard dependencies?

  • After reading this page, I clicked on a Debian 12 guide, and it looks rather long. I don’t like Docker and I prefer native installation, but it looks like there are too many moving parts, nuances and hard deps to try to replicate it on any of my machines (most of them are running Arch, so I assume the difference would be significant). I’ll try the Docker way and post the results.

  • I still have no idea on how much disk space is needed in order to serve the whole world. The PBF file is under 100 GB, but that’s before the import. I saw 1 TB SSD + 24 GB of RAM requirement being mentioned indirectly, but it didn’t give me an impression that it’s an absolute minimum requirement. Will it work with 16 GB of RAM and 500 GB SSD? VPS aren’t cheap, the project I’m working on is open source and it’s on a tight budget, so I’m just trying to assess if it’s even a feasible option.

  • My mental picture of a tile server was very different. I expected it to be a simple set of small, equally sized raster images, organized in a well-known hierarchy with no extra software on top, except for a static web server. Is it possible to pre-generate all the tiles and make it fully static? I would gladly sacrifice a few zoom levels in order to simplify this setup. I’m also thinking on self-hosting or colocation as a way to cut costs in case the storage requirements are significant.
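
For context on the “well-known hierarchy” mentioned above: raster tiles are commonly addressed as `zoom/x/y.png` in the Web Mercator “slippy map” scheme, which is also what a static web server would need to expose. The following is a minimal sketch of how a coordinate maps to such a path. The function name `tile_path` and the `.png` extension are assumptions chosen for illustration, not part of any particular tile server.

```python
import math


def tile_path(lat_deg: float, lon_deg: float, zoom: int) -> str:
    """Return the relative zoom/x/y.png path of the tile containing a point.

    Uses the usual slippy-map (Web Mercator) tile numbering, where each
    zoom level splits the world into 2**zoom columns and 2**zoom rows.
    """
    n = 2 ** zoom
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay in range.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return f"{zoom}/{x}/{y}.png"


if __name__ == "__main__":
    # Central London at zoom 10.
    print(tile_path(51.5, -0.12, 10))
```

A fully static setup would then just be a directory tree matching these paths, with the open question being how many files that tree has to hold.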

1 Like

You are going quite substantially off topic, if you want to contribute to the switch2osm repo, that would be the best place to raise issues.

If you simply want a non-updatable set of tiles, there are many ways of providing that, but depending on the maximum zoom level you want to have tiles at, assuming raster tiles, you are going to need quite some space, so dynamically rendering high zoom tiles is what is typically done if you want the whole globe.

See OSM - Tile Calculator

4 Likes

I still have no idea on how much disk space is needed in order to
serve the whole world. The PBF file is under 100 GB, but that’s
before the import. I saw 1 TB SSD + 24 GB of RAM requirement being
mentioned indirectly, but it didn’t give me an impression that it’s
an absolute minimum requirement. Will it work with 16 GB of RAM and
500 GB SSD?

No, not world-wide. Even 1 TB is on the low end if you want regular
updates. And for RAM I don’t think it’ll work with less than 64 GB but
you could of course try.

My mental picture of a tile server was very different. I expected it
to be a simple set of small, equally sized raster images, organized
in a well-known hierarchy with no extra software on top, except for
a static web server. Is it possible to pre-generate all the tiles
and make it fully static?

Yes, it is easy if you can live with z0-12, doable for z0-13 and painful
for z0-14. See
https://tools.geofabrik.de/calc/#type=geofabrik_standard&bbox=-179,-85,179,85
for an idea of how many files and how much disk space you’re looking at
when hosting static tiles.
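
To give a rough feel for why each extra zoom level hurts, here is a small sketch that counts the tiles in a full world pyramid. Each zoom level has four times as many tiles as the previous one. The average tile size used below is purely a placeholder assumption (real tiles vary a lot, and empty ocean tiles are tiny), so the calculator linked above remains the better guide for disk space.

```python
# Hypothetical average size of one raster tile, for illustration only.
ASSUMED_AVG_TILE_BYTES = 10_000


def tiles_at_zoom(zoom: int) -> int:
    """Number of tiles covering the whole world at one zoom level."""
    return 4 ** zoom


def total_tiles(max_zoom: int) -> int:
    """Number of tiles in a full pyramid from zoom 0 up to max_zoom."""
    return sum(tiles_at_zoom(z) for z in range(max_zoom + 1))


def estimated_gb(max_zoom: int, avg_bytes: int = ASSUMED_AVG_TILE_BYTES) -> float:
    """Very rough disk estimate in GB under the assumed average tile size."""
    return total_tiles(max_zoom) * avg_bytes / 1e9


if __name__ == "__main__":
    for max_zoom in (12, 13, 14):
        print(f"z0-{max_zoom}: {total_tiles(max_zoom):,} tiles, "
              f"~{estimated_gb(max_zoom):,.0f} GB at the assumed tile size")
```

The counts are about 22 million files for z0-12, 89 million for z0-13 and 358 million for z0-14, so the number of files alone becomes a practical problem well before z18.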

Consider that you’ll need some path to update these though, which will
likely then involve the whole Postgres setup that you tried to avoid.

I would gladly sacrifice a few zoom levels
in order to simplify this setup. I’m also thinking on self-hosting
or colocation as a way to cut costs in case the storage requirements
are significant.

You might want to look into vector tiles where you can generate tiles
for the whole world within a few hours (with e.g. “tilemaker”, no
database or auxiliary software required); tile rendering then happens on
the client side which means a little more complexity in the client.

4 Likes

That would be ideal, vector tiles are certainly the future, but I didn’t find any usable open source native Android libs last time I checked. Thanks for the links, I’ll try to pick the most suitable option

It’s directly related to my comment on option 3, which is suggested in the original post. I think it’s a good option (perhaps in combination with others), especially if it’s possible to reduce the friction, complexity and resource requirements of such a transition

Clearly there is a trade-off here :slight_smile:

The idea is that you get most things via your operating system’s update mechanism (something people should be doing anyway).

To add to @woodpeck’s answer, before deciding that you want a map of everywhere, it’s reasonable to ask “how much detail do you want on that map” and (in the case of raster tiles) what “native zoom levels do you need to support”? If you want a background for an app I’m sure that you won’t want tiles showing every tree, stile, playground swing etc.

(and yes - the detail around this would be better in another thread - but asking these questions at all is a huge step further than some of the DWG’s complainants have managed)

2 Likes

So the problem is that currently some of the diffs are “bad” so that if you update at the wrong moment, or run replication with a delay, you’ll still get the vandalism, correct?

What if a data consumer could learn which of the diffs are “bad” so if they’re using a delay, they can avoid applying the change without also immediately applying the revert?

That would make it possible to produce a set of clean but delayed tiles.

I am no expert on this topic so I may be missing something here!

2 Likes

With the recent advances in vector tiles, it’s now possible for anyone to run their own global tile server for very very little money as noted above. Because of that, it’s no longer critical that OSMF provides tiles for the purpose of promoting access to OSM data. I don’t think we should just yank the rug out from people that aren’t expecting it, but if there’s a strong reason for ending or sharply curtailing the service, it would be a completely reasonable decision to work towards ending it.

2 Likes

Yes, if you use a delayed stream you just get the problems delayed.

Using a snapshot that is “2 weeks” old also does not work as is:

So some functionality is needed to check for significant reverts: if a snapshot contains a changeset that is later undone by one of these significant reverts, skip that snapshot.
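
As a rough sketch of that idea: assume a consumer somehow has a list of significant reverts, each recording when the bad changeset was applied and when it was reverted. No such feed is named in this thread, so the `RevertedChangeset` record and every other name below are hypothetical. A snapshot is then treated as “dirty” if it was taken while any such changeset was live, and the consumer picks the newest clean snapshot that is at least a chosen age old.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class RevertedChangeset:
    """Hypothetical record of a changeset that was later reverted."""
    changeset_id: int
    applied_at: datetime
    reverted_at: datetime


def is_clean(snapshot_time: datetime,
             reverted: Iterable[RevertedChangeset]) -> bool:
    """A snapshot is clean if no reverted changeset was live when it was taken."""
    return not any(c.applied_at <= snapshot_time < c.reverted_at
                   for c in reverted)


def pick_snapshot(snapshot_times: Iterable[datetime],
                  reverted: Iterable[RevertedChangeset],
                  now: datetime,
                  min_age: timedelta) -> Optional[datetime]:
    """Return the newest clean snapshot that is at least min_age old, if any."""
    reverted = list(reverted)
    cutoff = now - min_age
    candidates = [t for t in snapshot_times
                  if t <= cutoff and is_clean(t, reverted)]
    return max(candidates, default=None)
```

This only helps for vandalism that has already been noticed and reverted by the time the snapshot is chosen, which is why the minimum age matters.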

1 Like

(apologies for getting into the weeds here, but)

You’ll get the vandalism, shortly followed (seconds later) by the removal of that vandalism. You don’t have to render tiles during that process.

That’s not really going to be an option, since diffs are time-based, and any one diff that contains vandalism will contain good stuff too.

My impression before I got into mapping was that OSM was an ugly map, because of the standard tile scheme

As its purpose is QA, that’s no surprise – but additionally it’s not rendered in HiDPI mode, so text is blurry

So either the OSMF should go down a route to offer delayed, higher quality tilesets/vector maps, or stop the current service

There is definitely value in offering a free map service for people with little technical knowledge, so if more resources are required, it could be worth considering closer cooperation with Wikipedia, which also benefits greatly from the free map data?