Looks like a good idea to me! Does anyone know more than I do about parsing this specific data in Overpass?
This could be a starting point:
In keeping with the mood in this thread, you could add all the other quotation mark characters as alternatives to the ASCII double straight quotation mark.
Something like this, I think: nwr[opening_hours!~"[\"“”„]"]
(I don't have them all on my keyboard)
This doesn't exclude tags such as opening_hours=opens on fridays, of course.
I only just got around to the data. I've manually checked these 37+44 elements through MapRoulette and "fixed" the dashes (I also fixed the OH where I could, although I left many of the more complex cases untouched).
After checking all these elements, I can safely affirm that an automatic change would not negatively affect the data at all. Of course, as a first iteration, it was good to be more conservative, but in the future, an automatic edit can be used even for these more specific cases.
Now, the next step is to write the wiki page with the proposal. When I have some free time again I'll work on that and let you all know.
Initially I just want to work with the opening_hours tag, and I don't want to mess with the dashes in other tags (e.g. name). Since I was having trouble with regexp and text editors, I lazily came up with (AKA asked ChatGPT for) a very simple Python script:
I made some manual checks and it seems to work. It also covers all the dashes listed on a Wikipedia page mentioned here, so it's an improvement.
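As a rough sketch of the idea (not the actual script, and the dash list here is only my illustrative assumption, shorter than the Wikipedia list), replacing non-standard dashes in opening_hours only and leaving other tags such as name alone could look like this. The names `NONSTANDARD_DASHES`, `normalize_dashes` and `fix_tags` are made up for the example.

```python
# Illustrative subset of dash-like characters; an assumption, not the list
# used for the real edit.
NONSTANDARD_DASHES = {
    "\u2010": "hyphen",
    "\u2011": "non-breaking hyphen",
    "\u2012": "figure dash",
    "\u2013": "en dash",
    "\u2014": "em dash",
    "\u2015": "horizontal bar",
    "\u2212": "minus sign",
}

# Translation table mapping every listed dash to the ASCII hyphen-minus.
_DASH_TABLE = str.maketrans({ch: "-" for ch in NONSTANDARD_DASHES})


def normalize_dashes(value: str) -> str:
    """Replace each listed non-standard dash with an ASCII '-'."""
    return value.translate(_DASH_TABLE)


def fix_tags(tags: dict) -> dict:
    """Return a copy of the tags with only opening_hours changed."""
    fixed = dict(tags)
    if "opening_hours" in fixed:
        fixed["opening_hours"] = normalize_dashes(fixed["opening_hours"])
    return fixed
```

Restricting the change to one key is deliberate: a dash in a name may be intentional, while in opening_hours only the ASCII hyphen is expected as a range separator.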
I uploaded a changeset covering just Brazil, and you can find the wiki page here.
Anything else I should consider?
Out of curiosity, how many of the non-standard dashes are just regular time interval separators?
/\d\d:\d\d\s?NONSTANDARD\s?\d\d:\d\d/
(or equivalently, weekday intervals)
If that's the vast majority of the cases, it's perhaps worth it to just deal with those "simple" cases mechanically, and review everything else.
Not sure if I understood properly, but using an online regex service and the latest output from Overpass, I couldn't find any match. Did you take a look as well?
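To put a number on that, something like the following could be run over the values exported from Overpass. It is only a sketch: the dash set, the weekday abbreviations and the function name `only_simple_dashes` are my assumptions, and the NONSTANDARD placeholder is expanded into a character class.

```python
import re

# Assumed set of non-standard dashes (U+2010..U+2015 and the minus sign).
DASHES = "\u2010\u2011\u2012\u2013\u2014\u2015\u2212"

# hh:mm <dash> hh:mm, with at most one whitespace on each side of the dash.
TIME_RANGE = re.compile(rf"\d\d:\d\d\s?[{DASHES}]\s?\d\d:\d\d")

# Weekday <dash> weekday, using the two-letter abbreviations of the spec.
DAY = r"(?:Mo|Tu|We|Th|Fr|Sa|Su)"
DAY_RANGE = re.compile(rf"{DAY}\s?[{DASHES}]\s?{DAY}")

ANY_DASH = re.compile(f"[{DASHES}]")


def only_simple_dashes(value: str) -> bool:
    """True if the value has non-standard dashes and all of them are
    plain time-interval or weekday-interval separators."""
    if ANY_DASH.search(value) is None:
        return False
    # Blank out the simple ranges; any dash left over is a "complex" case.
    remainder = TIME_RANGE.sub(" ", value)
    remainder = DAY_RANGE.sub(" ", remainder)
    return ANY_DASH.search(remainder) is None
```

Counting how many values return True versus False would show whether the "simple" cases really are the bulk.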
10 days since last message, so I performed the edits.
You can find all here (check latest edits from my user): Changesets by matheusgomesms-import | OpenStreetMap
It was a good exercise and an easy fix that I think will correct many POIs. Obviously there are many things still to be fixed, so a MapRoulette task could be used for those (local language knowledge required, though).
Also, what stood out was that in Japan some OH were in the wrong format because of the different character set used there. A more focused task could also be done there for this (updating number characters and colons, for example).
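For illustration only, and assuming the problem is full-width digits and colons (I have not listed which characters actually occur), such a focused fix could be as small as a translation table. `FULLWIDTH_TO_ASCII` and `normalize_fullwidth` are hypothetical names.

```python
# Map full-width digits (U+FF10..U+FF19) to ASCII digits.
FULLWIDTH_TO_ASCII = {0xFF10 + i: ord("0") + i for i in range(10)}
# Map the full-width colon (U+FF1A) to an ASCII colon.
FULLWIDTH_TO_ASCII[0xFF1A] = ord(":")


def normalize_fullwidth(value: str) -> str:
    """Replace full-width digits and colons with their ASCII equivalents."""
    return value.translate(FULLWIDTH_TO_ASCII)
```

A narrow table like this touches nothing else in the string, which seems safer for an automated edit than a blanket Unicode normalization.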
I intend to do this maybe every 6 months or once a year. Let's see in 6 months what the situation in OSM is.
Here are some numbers on opening_hours (plus collection_times etc.). The data should be quite recent, not sure if your change is included in the numbers, though.
- there is a total of 3502562 opening hours strings in OSM (not including opening hours strings within conditional strings)
- 96.10% are considered valid¹
- 2.99% are invalid but can usually be unambiguously parsed by a lenient parser
of the latter, here is a list (the number in front is the number of times that unique string appears in OSM):
¹ what exactly is considered valid differs a bit from parser to parser. E.g. the "reference implementation" parser does not understand everything that is in the spec, while understanding other constructs that are not in the spec. In this case, considered valid by my own osm-opening-hours parser
FWIW, StreetComplete considers opening hours that can be unambiguously parsed but are invalid according to the spec as immediately due for re-survey, i.e. an opening hours quest is created. (And completely invalid opening hours strings anyway.)
When the user then acknowledges that the displayed times are still correct or edits the opening hours, a valid opening_hours in canonical form is saved.
In general, the app asks if any opening hours are still correct once every year. This only works if either the shop hasn't been edited for at least one year or a check_date:opening_hours with a date that is more than one year old has been set. I.e. for most shops whose opening hours syntax you corrected just now, the app won't ask whether the opening hours are still correct for another year.
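Read literally, the rule above boils down to the following. This is a sketch of the description only, not StreetComplete's actual code; the function name, the parameters and the 365-day year are assumptions.

```python
from datetime import date, timedelta
from typing import Optional

ONE_YEAR = timedelta(days=365)  # simplification: ignores leap years


def is_due_for_recheck(last_edited: date,
                       check_date: Optional[date],
                       today: date) -> bool:
    """True if the 'are these opening hours still correct?' question is due."""
    # A check_date:opening_hours older than one year makes the element due.
    if check_date is not None and today - check_date > ONE_YEAR:
        return True
    # Otherwise the element must not have been edited for at least a year.
    return today - last_edited >= ONE_YEAR
```

With this reading, a mass edit that only bumps the last-edit date pushes the question back by a year for every element that has no old check date.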
6 months later, here I am again!
Interesting to see: according to Overpass there are now 526 nodes, 144 ways and 1 relation with non-standard dashes. I expected a smaller number in 6 months, but it is what it is.
Does anybody object to me doing another round of fixes? I believe the first one didn't break anything, quite the contrary.
Well, I do wonder how these dashes are inserted in the first place. An n-dash or m-dash does usually not exist at all on any normal keyboard, right? So, maybe it is a certain app or automated import that is causing this. If this is the case, it would be more meaningful to root out the source of that.
The obvious downside of a mass edit is that the last-edit date of the elements is also updated, causing those elements to appear as if they are up-to-date to software that evaluates that (like StreetComplete) even if they are not.
Also, there are many more opening hours strings that are invalid according to the spec but still unambiguous enough that a lenient parser can understand them. From the numbers above, this would be about 100,000 opening hours. So, your edit would just fix less than 1% of these invalid but unambiguous opening hours.
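For reference, the arithmetic behind those two figures, using the totals quoted earlier in the thread (it compares elements with strings, so treat it as an approximation):

```python
total_strings = 3_502_562       # opening hours strings in OSM
lenient_share = 0.0299          # invalid but parseable by a lenient parser
fixed_elements = 526 + 144 + 1  # nodes + ways + relations with non-standard dashes

# Roughly 100,000 strings are invalid but still unambiguous.
lenient_count = total_strings * lenient_share

# Share of those that the dash fix would touch.
share_fixed = fixed_elements / lenient_count

print(round(lenient_count), f"{share_fixed:.2%}")
```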
From a visual inspection, I believe the main source is someone copying/pasting from websites. Some mappers just do this, others fix the syntax but forget/can't see the different dashes (it is almost impossible to see this difference in iD, for example).
I also believe some languages use different chars, leading to this problem too.
I'm not trying to fix all ambiguous OH; the idea here is to quickly fix non-standard dashes (right now I'm taking more time to write this message than to perform the fix).
At a quick glance on Overpass, you can see some cases where the OH will be COMPLETELY fixed by this quick fix:
- Node: Amato's Auto Body (12254761722) | OpenStreetMap
- Way: MOTO-X NITERÓI (1323688757) | OpenStreetMap
- Node: Universidade Federal ABC (11948535806) | OpenStreetMap
- Way: ChangoMâs (825271785) | OpenStreetMap
Doesn't StreetComplete use check_date for that?
Other automated edits can be performed to fix other parts of Opening Hours (such as MO → Mo; Monday to Friday → Mo-Fr etc), but this fix is not intended to fix those cases.
Copy and paste from something on the internet would be my guess, but you're right that "actually asking people about the source" is the way to go.
Another source of these characters is that some operating systems helpfully convert typewriter punctuation to "smart" punctuation. iD explicitly disables the autocorrection feature in Safari on macOS and iPadOS, but both Go Map!! and Every Door insert smart punctuation by default on iOS. This mainly affects curly quotes, but you can also get an em dash by typing --, for example.
After this new round of discussion, I did the fix again:
https://www.openstreetmap.org/user/matheusgomesms-import/history#map=2/19.8/10.5
547 nodes, 142 ways, and 1 relation were edited. It took me 30 min because I was manually uploading smaller bboxes (some of them are not small, and I'm sure I'll receive some complaints). Otherwise it would have taken me about 2 minutes to edit all of them worldwide.
Let's see how it goes again in 6 months. Thank you all for the discussion!
You seem to have missed some.
There is also the question of what to do with opening_hours:covid19. Are they still a thing anywhere?
Have you tried following the advice of @SimonPoole? If some editors keep adding these, maybe it would be worth reporting to the maintainers of that editing software?