Phone number formatting and proposed automated edit

I have created a website to validate and fix phone numbers in OSM data. Sweden was recently added by @HrCalmar.

It has found 8085 numbers (34% of all values) which are invalid or not formatted ‘correctly’.

These can be easily seen and fixed from the website itself.

I have a bot edit running in a number of other countries to automatically fix the simple cases. I would like to propose enabling it in Sweden; however, a discussion is first needed on the target formatting. Many of the issues detected are caused by hyphens being used as a separator. This is not the standard format (per the wiki) globally or nationally, although some countries do prefer using hyphens.

Let me know what you think of the changes suggested on the website, and whether any adjustments need to be made. Additionally, please discuss and vote on the target format.

  • Spaces as separators
  • Hyphens as separators
  • Consider either all spaces or all hyphens as valid
  • Consider any combination of spaces and hyphens as valid
  • No separators in a number value at all

At first I voted for “spaces as separators”, but a second later my thought was: “this is a display format, which should be handled on the consumer end, not stored in the database”.

Any thoughts about that?

2 Likes

Ideally, numbers should be entered without any formatting in the database and any user-facing interface can decide how to display them rather than force a specific format.

3 Likes

Definitely a valid thought, hence the other options in the poll. I suppose it is good to have a convention. Since both hyphens and spaces are only really used as separators within a number, it doesn’t make much difference (unlike a forward slash, for example).

My opinion is that one of the principles of OSM is that it should be mapper-friendly. And it is much more difficult to look at something like +4670346949 and work out which digit was omitted, as opposed to looking at +46 70 34 69 49.

It is trivial for data consumers to reformat a number and display it how they like, and since a lot of the data in OSM has spaces and/or hyphens in phone values, data consumers need to be able to deal with that. The standard library (used as part of the backend here) for parsing and formatting numbers deals with common ways of entering numbers.

To answer the follow-up question of why do anything, it’s to then see the edge cases more easily, like duplicate numbers or the wrong separator being used between multiple numbers. The website suggests fixes for these, but they would not be done by the bot.
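As a rough illustration of the duplicate case, here is a minimal sketch. It is not the website's actual code; it assumes multiple numbers are separated by a semicolon (the usual OSM convention for multi-value tags), and the helper name `find_duplicates` is made up for this example.

```python
import re


def find_duplicates(value: str) -> list[str]:
    """Return numbers that appear more than once in a multi-value phone tag.

    Assumption: values are separated by ';'. Spaces and hyphens are ignored
    when comparing, so differently formatted copies count as duplicates.
    """
    seen: set[str] = set()
    dupes: list[str] = []
    for part in value.split(";"):
        key = re.sub(r"[\s\-]", "", part)  # compare without separators
        if not key:
            continue  # skip empty entries such as a trailing ';'
        if key in seen and key not in dupes:
            dupes.append(key)
        seen.add(key)
    return dupes


print(find_duplicates("+46 60 14 71 70; +4660147170"))
```

Once the separators are normalised, the same number written two different ways becomes easy to spot.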

1 Like

I think that the numbers should be formatted so the consumer/presentation layer doesn’t have to implement rules for formatting. This is especially valid for applications serving multiple countries.

If an application wants its own formatting, it can easily remove the existing formatting and apply its own.

5 Likes

All phone numbers I have added to OSM, I have added with the +46 prefix, but without any separators (spaces or dashes). This is how phone numbers should be stored in any database, and is in fact how they are stored in the database for the contacts list on your phone. It is exactly because formatting has been added to the database that edits like this are even needed. I have probably seen four or five competing ways to store phone numbers here in Sweden in OSM.

Ideally, OSM would also store phone numbers without formatting, but if we cannot convince data consumers (and OSM editors) to format the numbers themselves, which we realistically cannot do unless the acceptable ways to store phone numbers for that tag is changed, I guess all spaces as separators as shown in “Föreslagen åtgärd” in the first picture is the best solution. That is how my phone formats numbers in its UI at least, so it is kind of a standard I guess.

Yes, and it’s bad. I find it incredibly jarring to read e.g. 060 (as in the example) with +46. My phone does that. I’d much more prefer to have it presented as written by the business.

However we choose to format phone numbers, I don’t think we can avoid having the country prefix in there. OSM is an international database after all, and many users of OSM use it when traveling in different countries, and they need their app to call the number in the right country if they choose to call it. In fact, the wiki does not permit any other way to store phone numbers than with an initial country prefix (e.g. +46). And one cannot infer the country prefix from the country boundary the POI is in. That was really only possible back when landlines were the norm; today you can buy a SIM card with any country prefix.

With that said, ideally, what is in the database is not what you will read. The app you use is supposed to present the data in a reasonable way, just like how opening hours aren’t presented like “Mo-Fr 9:30-16:00; Sa 11:00-15:00” to the end user, even if that is what is stored in the database. Instead, e.g. Organic Maps will say “Opens in 23 minutes” or “Closed today”. And even editors like StreetComplete give a proper UI to edit it, since typing it manually is error-prone. I don’t see why phone numbers are any different.

3 Likes

It looks like this is all of the feedback and voting we are getting for now.

I wouldn’t like to proceed without consensus and I see the arguments for both sides.

I find this argument compelling, combined with the fact that it would be unfeasible to change editors and data consumers to store and expect no formatting for numbers.

At least OsmAnd and Organic Maps display phone numbers exactly as they are in the database, at present.

Additionally, note that phone values would only be changed by the bot if they were ‘invalid’ in some way, such as missing a country code or having other symbols there. No changes are made if the only difference is addition or removal of spaces. So if you map a number as +4660147170 then it would not be touched by the bot.
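To make that rule concrete, here is a minimal sketch. It is not the bot's actual code; the function names and the suggested value `+46 60 14 71 70` are assumptions made up for illustration.

```python
import re


def differs_only_by_spaces(original: str, suggested: str) -> bool:
    """True when the two values are identical once whitespace is removed."""
    def strip(s: str) -> str:
        return re.sub(r"\s+", "", s)
    return strip(original) == strip(suggested)


def bot_should_edit(original: str, suggested: str) -> bool:
    # Hypothetical rule: leave the value alone if spacing is the only difference.
    return not differs_only_by_spaces(original, suggested)


# A valid number mapped without spaces is left untouched.
print(bot_should_edit("+4660147170", "+46 60 14 71 70"))   # False
# A number missing its country code and using a hyphen would be changed.
print(bot_should_edit("060-14 71 70", "+46 60 14 71 70"))  # True
```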

I have to say that it’s probably easier for consumers of phone number data to STRIP spaces than to find a rule to add them back in, since everyone does it differently, and Swedish numbers don’t all have the same length :) I also agree that it’s easier for the user to edit OSM with spaces.

My position is that standardisation is good, even if it’s just a little bit.

5 Likes

As a programmer myself, I feel like this isn’t a matter of opinion, but a matter of fact. It should not be stored with any spaces. Country prefix is fine of course.

I know this is a community decision and I obviously don’t actually get to make the decision, but really, it does not make sense to have the spaces in the DB. And like @Nadja15 said:

opening hours aren’t presented like “Mo-Fr 9:30-16:00; Sa 11:00-15:00” to the end user, even if that is what is stored in the database

A DB is just not meant to contain formatted data…

2 Likes

I completely understand where you’re coming from, but that isn’t where we are with phone numbers at the moment:

  • Many/most apps do display them exactly as stored to users
  • Many mappers enter numbers with spaces or other formatting
  • Most editing software does not validate entry of phone numbers

Again, any numbers entered without spaces would not be touched by the bot, if they are valid. This will only touch values with an issue of some sort.

I’m all for data being data and formatting being something else, BUT I have not seen a single OpenStreetMap tool that formats phone numbers, so to me, removing the spaces in the data would make entering and using this information much worse. If a tool wants to apply its own formatting, it takes one extra line of code to remove the spaces first, in every language that I know of.
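For example, in Python that one line could look like the sketch below. The helper name `strip_separators` is made up here, and it removes hyphens as well as spaces, which is an extra assumption beyond spaces alone.

```python
import re


def strip_separators(value: str) -> str:
    # Remove whitespace and hyphens; the leading "+" and the digits are kept.
    return re.sub(r"[\s\-]", "", value)


print(strip_separators("+46 70 34 69 49"))  # +4670346949
```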

1 Like

It seems most OSM users are not programmers then :) phone | Keys | OpenStreetMap Taginfo

No fields in OSM are encoded as machine-readable data. It’s all just key values. Just because the key says phone doesn’t mean the value field is a numerical field.

Regardless, it’s not enforceable. Spaces or no spaces, I think standardising so that we don’t also have dashes or whatnot is of some value here. I’ve looked through many of the suggested edits, and all the suggestions I saw were sensible and helped standardisation. It also seems to me that all the invalid phone numbers are typos and various mishaps.

4 Likes

Although the vote is in favour of spaces as separators, it is only marginal. I would therefore like to know if those who voted for having no separators would be totally opposed to the bot edit proceeding and formatting with spaces, bearing in mind the above arguments for including spaces.

Bear in mind that this would only apply to numbers without a country code or with other symbols (hyphens, brackets etc.). So it would not change a number if it only meant adding or removing spaces. i.e. feel free to map numbers with a country code and without spaces and they would not be touched by the edit.

The current situation of doing nothing does not seem ideal as it pleases no-one.

Amen to that

Personally I’d prefer no separators, but applying proper formatting like you propose is preferable to doing nothing

3 Likes

Thank you, with no further feedback I therefore plan to enable the bot with formatting as proposed.

1 Like

The bot has started its work, leaving only the interesting cases left on the website.

321 invalid numbers, of which 161 have a suggested fix.

How can we best help get the remaining ones fixed too, the ones the bot cannot fix?