Intermittent 404 errors on tile server with renderd master and slave

Hello,

For my firm, I need to set up an OSM environment. I have 4 servers running Debian 12: postgresql, tiles-master, tiles-slave, and webserver. The webserver only runs Apache, serving a Leaflet map.

My concern is about the tile servers. When I heavily use the map via Leaflet, I get 404 errors for some tiles. If I reload the browser, the missing tiles appear immediately.

On tiles-master, there’s Apache, mod_tile with renderd, and Mapnik. The configuration of renderd is:

[renderd]
stats_file=/run/renderd/renderd.stats
socketname=/run/renderd/renderd.sock
num_threads=6
tile_dir=/var/cache/renderd/tiles

[renderd1]
stats_file=/run/renderd/renderd1.stats
socketname=/run/renderd/renderd1.sock
num_threads=6
tile_dir=/var/cache/renderd/tiles1
iphostname=tiles-slave
ipport=7654

[mapnik]
plugins_dir=/usr/lib/mapnik/3.1/input
;font_dir=/usr/share/fonts/truetype
font_dir=/opt/src/openstreetmap-carto/fonts
font_dir_recurse=true

; ADD YOUR LAYERS:
[default]
URI=/
XML=/opt/src/openstreetmap-carto/mapnik.xml
HOST=tiles.myserver.com
TILESIZE=256
MAXZOOM=20

On tiles-slave, there’s renderd and Mapnik. The configuration of renderd is:

[renderd1]
stats_file=/run/renderd/renderd1.stats
socketname=/run/renderd/renderd1.sock
num_threads=6
tile_dir=/var/cache/renderd/tiles
iphostname=tiles-slave
ipport=7654

[mapnik]
plugins_dir=/usr/lib/mapnik/3.1/input
font_dir=/opt/src/openstreetmap-carto/fonts
font_dir_recurse=true

; ADD YOUR LAYERS:
[default]
URI=/
XML=/opt/src/openstreetmap-carto/mapnik.xml
HOST=tiles.myserver.com
TILESIZE=256
MAXZOOM=20

According to the logs:

  • tiles-master generates tiles and transmits requests to tiles-slave

  • tiles-slave receives the requests and correctly generates tiles in the cache:

Got incoming request with protocol version 3
avril 09 19:24:03 tiles-slave renderd[535]: Got command Render fd(12) xml(default), z(15), x(20637), y(12426), mime(image/png), options()
avril 09 19:24:03 tiles-slave renderd[535]: START TILE default 15 20632-20639 12424-12431, age 0.00 days
avril 09 19:24:03 tiles-slave renderd[535]: Rendering projected coordinates 15 20632 12424 -> 5195271.938490|4833266.172531 5205055.878110|4843050.112151 to a 8 x 8 tile
avril 09 19:24:04 tiles-slave renderd[535]: DONE TILE default 15 20632-20639 12424-12431 in 1.102 seconds
avril 09 19:24:04 tiles-slave renderd[535]: Creating and writing a metatile to /var/cache/renderd/tiles/default/15/0/83/0/152/136.meta
avril 09 19:24:04 tiles-slave renderd[535]: Sending message Done to 12
avril 09 19:24:04 tiles-slave renderd[535]: Sending render cmd(3 default 15/20637/12426) with protocol version 3 to fd 12

  • tiles-master generates several errors of this type:

Sending render cmd(3 default 15/20640/12428) with protocol version 2 to fd 11
avril 09 19:33:19 tiles-master renderd[2298]: Data is available now on 1 fds
avril 09 19:33:19 tiles-master renderd[2298]: Failed to read cmd on fd 11
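
To read these log lines, it helps to know that renderd renders a whole block of tiles (a metatile) per request, which is why a request for 15/20637/12426 shows up as "START TILE default 15 20632-20639 12424-12431". Here is a minimal Python sketch of that rounding. It assumes the 8 x 8 metatile size shown in the slave log, the function name is only illustrative, and it ignores the clipping that happens at very low zoom levels.

```python
METATILE = 8  # assumed metatile width, taken from the "8 x 8 tile" log line


def metatile_range(x, y, metatile=METATILE):
    """Return ((x_min, x_max), (y_min, y_max)) of the metatile containing tile x/y."""
    mask = metatile - 1
    x0 = x & ~mask  # round down to a multiple of the metatile size
    y0 = y & ~mask
    return (x0, x0 + mask), (y0, y0 + mask)


if __name__ == "__main__":
    # Request seen on tiles-slave
    print(metatile_range(20637, 12426))  # ((20632, 20639), (12424, 12431))
    # Request that failed on tiles-master
    print(metatile_range(20640, 12428))  # ((20640, 20647), (12424, 12431))
```

So the request that failed on tiles-master (15/20640/12428) belongs to the metatile immediately to the east of the one the slave log shows being rendered. They are not the same block.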

What I don’t understand is how mod_tile on tiles-master accesses the files generated on the disk of tiles-slave. There’s very little documentation on using two tile servers, one in “master” mode and one (or several) in “slave” mode.

Thank you for your help. Maybe my choice of architecture is not optimal.

What are ModTileRequestTimeout and ModTileMissingRequestTimeout set to? Have a look at the end of “Manually building a tile server (Debian 12)” on Switch2OSM. I’d expect that (a) you’ll want to pre-render low zoom tiles to avoid 404s and (b) you’ll get 404s if there really isn’t a tile yet. Tweaking those Apache settings tells the client to wait a bit longer.

Thanks for the reply. I’ve modified these parameters like this:

ModTileRequestTimeout 30
ModTileMissingRequestTimeout 30

I followed the Switch2OSM tutorial you mentioned, but it’s designed for a single tile server, not a configuration with one master and one slave. Some tiles are pre-rendered as explained in the tutorial, but 404 errors occur when I quickly browse the map. If I reload, the previously missing tiles appear correctly. I suspect that tiles from the slave renderd cannot be accessed or served by mod_tile. Perhaps Apache sends a 404 before the tile is ready.

It will send a 404 if it can’t render the tile before the timeout (assuming there isn’t an old dirty tile to send), but it will still carry on rendering the tile, so the next time you ask you will get it if rendering has finished by then.

That’s what I thought indeed, seeing as they appeared immediately upon the browser’s reload. What I understand less and can’t trace for certain is how the tiles generated by the slave server “arrive” at the master server. I’m having trouble making the connection between the logs and the files present in the cache.
Thank you for your help anyway.
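
To connect a request in the logs with a file in the cache, the metatile path can be worked out from z/x/y. Below is a minimal Python sketch of the directory hashing as I understand it from mod_tile's file storage. It reproduces the path in the slave log above (15/0/83/0/152/136.meta), but the function name is hypothetical and the 8 x 8 metatile size is an assumption, so treat the result as a hint rather than a guarantee for other setups.

```python
import posixpath

METATILE = 8  # assumed metatile width, as in the "8 x 8 tile" log line


def metatile_path(tile_dir, layer, z, x, y, metatile=METATILE):
    """Build the .meta path that should hold tile z/x/y (illustrative helper)."""
    mask = metatile - 1
    x &= ~mask  # snap to the top-left tile of the metatile
    y &= ~mask
    parts = []
    for _ in range(5):
        # Each directory level mixes 4 bits of x with 4 bits of y.
        parts.append(((x & 0x0F) << 4) | (y & 0x0F))
        x >>= 4
        y >>= 4
    parts.reverse()  # most significant bits come first in the path
    return posixpath.join(tile_dir, layer, str(z), *map(str, parts)) + ".meta"


if __name__ == "__main__":
    print(metatile_path("/var/cache/renderd/tiles", "default", 15, 20637, 12426))
```

Running the same function with the tile_dir configured on tiles-master and checking whether the resulting file exists there (for example with os.path.exists) would show whether a metatile logged by tiles-slave is actually visible from the master. Note that in the configurations posted above, the [renderd1] section on tiles-master uses tile_dir=/var/cache/renderd/tiles1, while the slave log writes under /var/cache/renderd/tiles.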