Overpass area creation fails - Dispatcher_Client::request_read_and_idx::protocol_error

Hi all, I just set up a new overpass instance that is running fine - I can issue queries via my own instance of Overpass Turbo, minutely updates are working etc.

However I can’t get rules_loop.sh to create areas although I have two dispatchers running, using --osm-base and --areas respectively. The issue that keeps popping up is

runtime error: open64: /osm3s_osm_base Dispatcher_Client::request_read_and_idx::protocol_error

Looking at transactions.log, this is the query that fails (I’m using the unchanged rules from Overpass version 0.7.62.2):

2024-07-30 21:04:19 [1064317] requesting <?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<osm-script timeout=\"86400\" element-limit=\"4294967296\">\n\n<union>\n  <query type=\"relation\">\n    <has-kv k=\"type\" v=\"multipolygon\"/>\n    <has-kv k=\"name\"/>\n  </query>\n  <query type=\"relation\">\n    <has-kv k=\"type\" v=\"boundary\"/>\n    <has-kv k=\"name\"/>\n  </query>\n  <query type=\"relation\">\n    <has-kv k=\"admin_level\"/>\n    <has-kv k=\"name\"/>\n  </query>\n  <query type=\"relation\">\n    <has-kv k=\"postal_code\"/>\n  </query>\n  <query type=\"relation\">\n    <has-kv k=\"addr:postcode\"/>\n  </query>\n</union>\n<foreach into=\"pivot\">\n  <union>\n    <recurse type=\"relation-way\" from=\"pivot\"/>\n    <recurse type=\"way-node\"/>\n  </union>\n  <make-area pivot=\"pivot\" return-area=\"no\"/>\n</foreach>\n\n</osm-script>\n\n
2024-07-30 21:04:52 [1064317] Dispatcher_Client::request_read_and_idx::protocol_error /osm3s_osm_base 0 Success

I was wondering why the areas query tries to access the osm_base dispatcher … but maybe that’s by design and the areas dispatcher only comes into play once the area files are created.

Anyway, any ideas how to solve the issue?

Thanks, Ulf

Bumping this … maybe @drolbr has an idea how I might be able to fix the issue?

One more thing: I’m using Ubuntu 24 on arm64.

Bumping this … maybe @drolbr

Thank you.

Could you please have a look into database.log? That file should
contain a line that contains request_read_and_idx of process 1064317
where the 1064317 is the proc id seen in the excerpt of transactions.log.

Process 1064317 does indeed want to obtain a read permission from the
osm_base dispatcher, to read the relations to make areas from.
The ::protocol_error there means that, from the point of view of process
1064317, the dispatcher process for osm_base never answered. So it would be
interesting to know whether the dispatcher process for osm_base ever
attempted to answer process 1064317 or not.

Thanks Roland for your swift reply!

These are the corresponding lines from database.log:

2024-07-30 21:04:19 [1064317] request_read_and_idx() start
2024-07-30 21:04:52 [1060771] hangup of process 1064317.

Looks like it’s rather 1064317 that “hung up” …
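For anyone repeating this lookup, here is a minimal Python sketch that pulls out every log line either written by a given process or mentioning it. It assumes the line layout seen in the excerpts above (timestamp, process id in square brackets, message); the function name `lines_about_process` is made up for illustration and is not part of Overpass.

```python
import re

# Layout assumed from the log excerpts in this thread:
# "YYYY-MM-DD HH:MM:SS [pid] message"
LOG_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(\d+)\] (.*)$")


def lines_about_process(lines, pid):
    """Return (timestamp, writer_pid, message) for every line that was
    written by `pid` or that mentions `pid` in its message."""
    wanted = str(pid)
    # Word boundaries so that e.g. 106431 does not match 1064317.
    mention = re.compile(r"\b" + re.escape(wanted) + r"\b")
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.rstrip("\n"))
        if not m:
            continue  # skip lines that do not follow the assumed layout
        timestamp, writer, message = m.groups()
        if writer == wanted or mention.search(message):
            hits.append((timestamp, int(writer), message))
    return hits


# The two database.log lines quoted above.
sample = [
    "2024-07-30 21:04:19 [1064317] request_read_and_idx() start",
    "2024-07-30 21:04:52 [1060771] hangup of process 1064317.",
]
hits = lines_about_process(sample, 1064317)
for hit in hits:
    print(hit)
```

To run it against a real file, pass an open file handle instead of `sample`, e.g. `lines_about_process(open("database.log"), 1064317)`, with the path adjusted to wherever your database directory lives.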

@drolbr any idea what I might check? Here’s some more context from the logs:

==> transactions.log <==
2024-08-05 15:21:10 [1997330] requesting <?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<osm-script timeout=\"86400\" element-limit=\"4294967296\">\n\n<union>\n  <query type=\"relation\">\n    <has-kv k=\"type\" v=\"multipolygon\"/>\n    <has-kv k=\"name\"/>\n  </query>\n  <query type=\"relation\">\n    <has-kv k=\"type\" v=\"boundary\"/>\n    <has-kv k=\"name\"/>\n  </query>\n  <query type=\"relation\">\n    <has-kv k=\"admin_level\"/>\n    <has-kv k=\"name\"/>\n  </query>\n  <query type=\"relation\">\n    <has-kv k=\"postal_code\"/>\n  </query>\n  <query type=\"relation\">\n    <has-kv k=\"addr:postcode\"/>\n  </query>\n</union>\n<foreach into=\"pivot\">\n  <union>\n    <recurse type=\"relation-way\" from=\"pivot\"/>\n    <recurse type=\"way-node\"/>\n  </union>\n  <make-area pivot=\"pivot\" return-area=\"no\"/>\n</foreach>\n\n</osm-script>\n\n

==> database.log <==
2024-08-05 15:21:10 [1997330] request_read_and_idx() start

==> transactions.log <==
2024-08-05 15:21:43 [1997330] Dispatcher_Client::request_read_and_idx::protocol_error /osm3s_osm_base 0 Success

==> database.log <==
2024-08-05 15:21:43 [1060771] waited idle for 7 cycles.
2024-08-05 15:21:43 [1060771] hangup of process 1997330.

Thanks a bunch, Ulf

I’m sorry for the late answer. I expect that the point release v0.7.62.3 will address the “protocol error” as well.

As of August, I simply had not been able to reproduce the problem, so it went onto a list of mysteries to keep in mind in case related things turned up elsewhere. In the meantime, the area creation on the public instances has every now and then stopped and needed a manual restart without apparent reason. It turned out that both things might have a common root cause: requests that

  • carry localhost as the requesting client’s address and
  • do not get the necessary resources to run

are turned away with “protocol error” instead of “timeout” (or “query rejected”). After reviewing the use cases for requests on localhost, it makes sense to let such requests bypass the resource control instead.

Some special cases exist where this is undesirable: there is a public instance where a proxy hides the IP address from the Overpass component, so that the Overpass component cannot evenly distribute resources over clients. But in that case the bypass for localhost can still be disabled, making localhost requests at least subject to the deadlock protection. The error message is also improved, but a better error message alone does not justify the hassle of an update.
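To make the described change concrete, here is a toy model in Python. It is only a sketch of the decision as explained in this post, not the dispatcher's actual code: the names `Request`, `admit` and `localhost_bypass`, the way localhost is recognised, and the returned strings are all assumptions made for illustration. The deadlock protection is not modelled.

```python
from dataclasses import dataclass

# Hypothetical set of addresses treated as "localhost" in this toy model.
LOCAL_ADDRESSES = {"127.0.0.1", "::1", "localhost"}


@dataclass
class Request:
    client_address: str        # address the request appears to come from
    resources_available: bool  # whether resource control would grant it a slot


def admit(request, localhost_bypass=True):
    """Toy model of the behaviour described above.

    With the bypass enabled, localhost requests skip resource control.
    With it disabled, they are treated like any other client.
    """
    is_local = request.client_address in LOCAL_ADDRESSES
    if is_local and localhost_bypass:
        return "run"  # bypasses resource control entirely
    if request.resources_available:
        return "run"
    return "rejected"  # turned away for lack of resources


# A localhost request (such as the area creation) that gets no resources:
area_job = Request("127.0.0.1", resources_available=False)
print(admit(area_job))                          # bypass enabled
print(admit(area_job, localhost_bypass=False))  # bypass disabled, e.g. behind a proxy
```

The second call corresponds to the proxy case: when every client looks like localhost, disabling the bypass puts those requests back under resource control.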