Or maybe, if a search fails with a 503 error, I should try another query that is known to work and check whether Nominatim is down altogether?
Or avoid any searches with Japanese/Chinese characters? (Are other scripts also affected?)
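The first idea above (retrying with a query that is known to work) could be sketched roughly as follows. This is only an illustration: `diagnose_503`, the `fetch_status` callback, the probe query "Berlin" and the returned labels are all made-up names, not part of Nominatim. The callback is expected to send one search and return the HTTP status code, and it should itself respect the usage policy's rate limit.

```python
from typing import Callable


def diagnose_503(fetch_status: Callable[[str], int],
                 probe_query: str = "Berlin") -> str:
    """After a search failed with 503, send a probe query that normally works.

    fetch_status is a caller-supplied function (hypothetical) that performs
    one search and returns the HTTP status code.
    """
    try:
        status = fetch_status(probe_query)
    except OSError:
        # Network-level failure: the service could not be reached at all.
        return "service_unreachable"
    if status == 200:
        # The probe worked, so the original query itself is the likely culprit.
        return "original_query_is_the_problem"
    # The probe failed too: blocked, overloaded, or down.
    return "service_wide_problem"
```

Passing the fetch function in keeps the decision logic separate from the HTTP client, so it can be tried out without sending any requests.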
A 503 error rarely has anything to do with servers being overloaded. You usually get a 504 when that happens.
Nginx sends back a 503 when you are temporarily blocked for going over the rate limit of 1 req/s. And the Nominatim software itself sends back a 503 when an SQL request to the database takes too long. You can distinguish the two by looking at the body: nginx usually sends an elaborate HTML message, while the internal process returns a simple text with the message ‘Query took too long to process.’ (unless you requested debug output, in which case the debug output is still available in the body).
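A client could tell the two cases apart along these lines. This is a minimal sketch based only on the description above: the function name and the returned labels are invented for illustration, the HTML check is a rough heuristic because the nginx body is only "usually" HTML, and the debug-output case is not handled.

```python
def classify_503(status: int, body: str) -> str:
    """Guess the cause of a 503 from the response body (heuristic)."""
    if status != 503:
        return "not_503"
    if "Query took too long to process" in body:
        # Plain-text message sent by the Nominatim process itself.
        return "query_timeout"
    if "<html" in body.lower():
        # Assumption: an HTML error page means nginx answered (rate limit block).
        return "rate_limited"
    return "unknown_503"
```

For a `query_timeout` it is probably not worth retrying the same query, whereas a `rate_limited` result suggests slowing down and trying again later.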
I agree that 503 might not be the best error code for the latter case because it is permanent more often than not. (It might sometimes happen because another query has put a lock on the database but that is really rare.) So what is a better error code? A simple 500? Or maybe 507 - Insufficient Storage?
As for Chinese script, that is a known issue with logographic scripts without word spaces. Commas will certainly help. Also, don’t even try addresses that include unit or floor numbers. Nominatim cannot resolve those. Once you remove the 1&2楼 part of the query, you get a nothing-found response pretty quickly (which, to be honest, smells like a bug in Nominatim).
507 Insufficient Storage (per the “507 Insufficient Storage - HTTP | MDN” page) is said to describe temporary problems. The 400 series, though, describes client problems, and this is a server-side inefficiency.
Ah true, the case where the request has been in the waiting queue for too long. That is a 503. So it might not be completely trivial to change the error code for the other cases. I’ll have a look.
Japanese addresses should be working okayish. We’ve added a split algorithm for them as part of GSoC 2023. In any case, Hiragana and Katakana are not an issue. It’s just with long sequences of Kanji/Han script. Korean might be another candidate for trouble but I haven’t had a chance yet to learn more about that.
In the meantime I’ve found the root cause for your failing query. The word statistics on the server are outdated after more than a year of applying updates. Updating them now.
Not sure if 507 tells a normal user the right thing. They might consider a “storage” problem their own fault: their hard disk or RAM. In my understanding, it might mean something like too many requests in that corner of the network, which counts as storage as well: too many payloads coming towards a server. Perhaps the user doesn’t want a detailed message and just needs to know that they have to try again later, without a number.
These errors should be intercepted by data consumers, which should show clearer ones to their users.
But ideally the API would give the data consumer clearer feedback about what went wrong, so that overall server problems can be distinguished from a specific query being problematic.
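On the consumer side, intercepting the errors might look something like this. It is only a sketch: `user_message` and all message texts are made up for illustration, and the status-code meanings are taken from the discussion above rather than from any official specification of the service.

```python
def user_message(status: int, body: str) -> str:
    """Translate a geocoder response into a message for end users."""
    if status == 200:
        return ""  # nothing to report
    if status == 504:
        # Per the thread: 504 is what usually signals an overloaded server.
        return "The search service is busy. Please try again later."
    if status == 503 and "Query took too long to process" in body:
        # Query-specific problem: retrying the same text is unlikely to help.
        return ("This search could not be processed. Try simplifying it, "
                "for example by adding commas or removing unit and floor numbers.")
    if status == 503:
        # Rate limit block, or the request waited in the queue for too long.
        return "The search service is temporarily unavailable. Please try again shortly."
    return f"The search service returned an unexpected error (HTTP {status})."
```

The point is that the end user sees whether to retry later or to change the query, without having to interpret a status number.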