Power outage corrupted Overpass .idx files

An unscheduled power outage just took out my Overpass server. I thought I might have gotten lucky, but no. When I try to restart it, I’m getting “Unsupported index file format version” errors for all the .idx files.

Has anyone ever managed to recover from this state without downloading a fresh database copy?

I have encountered this and I’ve never solved it. I always re-download the database and deal with a temporary outage for my use case. I would love it if overpass were resilient to this issue.

I’ve never had exactly that issue. But I’ve often had weird errors like that. Like @ZeLonewolf, I just reimport the Overpass database and forget about it.

“Hit it with a big hammer”

Hi Kai,

If you still have them, could you please send me the *.idx files and the
*.idx.shadow files, and/or run

for i in "$DB_DIR"/*.idx "$DB_DIR"/*.idx.shadow; do echo "$(basename "$i") $(hexdump -C -n 1 <"$i")"; done

and send me the result (this one is much smaller and could in principle
even be pasted here, as opposed to the larger files themselves)?

In principle, the update process is designed such that at any given
moment in time either the *.idx or the *.idx.shadow files are a valid
index, and I would be highly interested in seeing a state where this is
no longer the case, so that I can harden against it.
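
For anyone who has to rebuild immediately, here is a minimal sketch of how the index files could be set aside first so they can still be sent later. The function name save_index_files and the archive path are hypothetical and not part of Overpass; the sketch only assumes the index files sit directly in the database directory.

```sh
# Hypothetical helper: archive all *.idx and *.idx.shadow files from a
# database directory into a gzip-compressed tarball.
# Usage: save_index_files DB_DIR /absolute/path/to/archive.tar.gz
save_index_files() {
    db_dir=$1
    out=$2   # should be an absolute path, because tar runs inside db_dir
    (
        cd "$db_dir" || exit 1
        # Collect only the files that actually exist; an unmatched glob
        # stays literal and is skipped by the -f test.
        set --
        for f in *.idx *.idx.shadow; do
            if [ -f "$f" ]; then
                set -- "$@" "$f"
            fi
        done
        if [ "$#" -eq 0 ]; then
            echo "no index files found in $db_dir" >&2
            exit 1
        fi
        tar -czf "$out" "$@"
    )
}
```

Something like save_index_files "$DB_DIR" /tmp/overpass-idx.tar.gz before the re-import would keep the evidence around without blocking the rebuild.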

Thanks for the offer to help!

Unfortunately, I no longer have the files. But I will keep the offer in mind for next time. We’re on a 100-year-old power circuit and although the local utility tries to keep it up, we do have unscheduled outages a couple of times a year.

So, I’m sure we’ll have another opportunity to debug this situation.

I am currently receiving the following error on overpass queries to my private server:

Error: runtime error: open64: 0 Success /osm3s_osm_base
Dispatcher_Client::request_read_and_idx::duplicate_query

Appreciate any tips to troubleshoot. I’ve preserved the .idx files (168M of them) for now, but I’m going to need to rebuild again to get operational.

I’m sorry for the late answer. The database files are fine, but this is
a new restriction policy intended for the public instances. Please turn
it off with a parameter --duplicate-queries=yes to the dispatcher.

Some background: there have been multiple occasions where people have
sent the exact same request from dozens of IP addresses from a public
cloud multiple times per minute or second.

As the public instances distribute resources per IP address, this can
substantially clog the public instances. Interrelating IP addresses from
the same C-net is not a solution, because in many cases they are used
independently and many cloud providers use multiple disjoint IP address
blocks.

For the time being, duplicate queries are also still accepted on the
public instances, as there is currently enough capacity.

According to the error output, the correct switch is actually:

--allow-duplicate-queries=(yes|no): Set whether the dispatcher shall block duplicate queries.
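
As a rough sketch, a launch script for a private instance might start the base dispatcher with that switch as shown here. EXEC_DIR and DB_DIR are placeholders for your own install and database paths, and the wrapper function start_base_dispatcher is a hypothetical name, not something shipped with Overpass.

```sh
# Hypothetical wrapper: start the base dispatcher with duplicate queries
# allowed. Assumes EXEC_DIR and DB_DIR are set by the calling script.
# Any extra arguments (for example --meta) are passed through unchanged.
start_base_dispatcher() {
    "$EXEC_DIR/bin/dispatcher" --osm-base --db-dir="$DB_DIR" \
        --allow-duplicate-queries=yes "$@"
}

# Typical use from a launch script, running in the background:
#   EXEC_DIR=/path/to/overpass
#   DB_DIR=/path/to/db
#   start_base_dispatcher --meta &
```

The flag only matters on the dispatcher side; queries themselves do not need to change.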

We had another power outage last night. This time I was able to recover and get Overpass back up and running. So far it looks like it’s in good shape.

At first I was having some trouble because the names of the lock files had changed and they weren’t being deleted properly. So I had to edit my scripts and in the process, I took the time to make them more robust.

The launch.sh script now checks whether each of the processes is running before starting it. If any process dies immediately after launch, the script shuts everything back down. And launch.sh only deletes the shared memory or lock files when the associated dispatcher process isn’t running.
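
The "only clean up when the dispatcher is not running" rule could look roughly like this. This is a sketch under assumptions, not the actual launch.sh: the helper names is_running and remove_if_stale are made up for illustration, matching by process name with pgrep is just one possible check, and the file paths in the usage comment are placeholders because the real lock and shared memory file names vary between Overpass versions.

```sh
# Hypothetical helper: succeed if a process with exactly this name exists.
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Hypothetical helper: delete the given files only if the owning process
# is NOT running; otherwise leave them alone and return non-zero.
# Usage: remove_if_stale PROCESS_NAME FILE...
remove_if_stale() {
    proc=$1
    shift
    if is_running "$proc"; then
        echo "$proc is still running; not touching its files" >&2
        return 1
    fi
    rm -f "$@"
}

# Example (placeholder paths, adjust to what your version creates):
#   remove_if_stale dispatcher "$DB_DIR/some_lock_file" /dev/shm/some_shm_file
```

Guarding the deletion this way avoids pulling shared memory out from under a dispatcher that survived or was already restarted.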

The shutdown.sh script is also more robust. I’ve cleaned up some of the scripting to better handle some edge cases.

I had posted the scripts in my diary, but I’ll be moving them to a wiki page sometime soon.

It would be great if you could turn all of this into a github project for administering overpass. It would help a lot with all the ad hoc scripting everyone is doing to keep overpass running!

We already have one GitHub project for Overpass. If @drolbr would be interested in having some of this as contributions or combining it with existing scripts, maybe that would be a good place for it?

I’m highly enthusiastic to collect all the scripts that fix real life
problems.

For the time being, I’d prefer to have them as separate scripts in pull
requests, so that all of them exist in parallel, and then reconcile them
once we understand the details. Otherwise, we risk having some
improvements in apply_to_osc.sh, some in fetch_osc_and_apply.sh, and
no script containing all of them.

@drolbr I’m using the docker image v0.7.61.8 and the docker healthcheck seems to run into this error with the duplicate query, so the container stays “unhealthy” and may get restarted by the docker daemon. I didn’t find any option to allow the “--duplicate-queries” through an ENV var; maybe that one got lost? Is there a workaround?

UPDATE 17.12.23: I created a pull request, “configure "allow-duplicate-queries" through env var / fix healthcheck script” (Pull Request #116 in wiktorn/Overpass-API on GitHub).