Proposal to replace documented syntax for telephone extensions

Last November, the phone=* syntax was unilaterally redefined to accommodate telephone extensions and more theoretical details. The changes are technically problematic in several respects. I propose to undo this change and welcome the community’s ideas on what to replace it with.

History

Since 2008, the documentation for phone=* has recommended formatting the value according to the International Telecommunication Union’s E.123 standard for human-readable phone numbers. The documentation also noted that separating groups of digits by spaces is consistent with the German DIN 5008 standard for German-language publications, and that separating groups of digits by hyphens is consistent with RFC 3966. (This was a bit of a non-sequitur, because E.123 tolerates hyphen separators according to national norms, while RFC 3966 is a standard for machine-readable tel: URIs.)

In the years since, quite a few data consumers have added support for both syntaxes, but there’s been a lot of confusion about how to tag PBX telephone extensions. Although the documentation never mentioned it, E.123 recommends a decidedly non-machine-readable syntax:

To show an extension number of a PABX without direct in-dialling, the nationally used word or abbreviation for “extension” should be written immediately after the telephone numbers and on the same line as the word “telephone”, followed by the extension number itself.
[…]
Example 2: Telephone international +22 607 123 4567 ext. 876

In 2009, contact:*=* was first documented with an example requiring the DIN 5008 syntax for telephone extensions:

+<country_code> <national_destination_code> <subscriber_number>-<direct_inward_dialing>

Unfortunately, this syntax conflicts with E.123’s tolerance for hyphen-separated digits.

Last November, in response to a question about whether it’s a good idea to tag a telephone extension, the following parameters were added to the phone=* documentation, ostensibly borrowed from the RFC 3966 standard:

| Parameter | Description | Example |
| --- | --- | --- |
| \;ext= | Extension | phone=+1 859-255-0270;ext=2 |
| \;isub= | ISDN subaddress | Irrelevant to OSM?[1] |
| \;phone-context= | Emergency number or other service code (e.g., 9-1-1 or 4-1-1 in North America) | None[2] |

As far as I know, this syntax was added to the documentation without a proposal or informal discussion beforehand.

As of writing, the \;ext= syntax has seen some uptake but not nearly as much as the alternatives:

| Syntax | Example | Prevalence |
| --- | --- | --- |
| DIN 5008 | +49 3831 2681-0 | 83,928 |
| E.123 | +43 1 71613 ext. 0 | 756 |
| Informal North American | +1-905-688-5550x3369 | 174 |
| Escaped semicolon | +34 96 352 54 78\;ext=4298 | 151 |

Problem

Of all the possible syntaxes we could’ve chosen for telephone extensions, \;ext= is the least intuitive and least interoperable syntax:

  • Unlike the rest of the phone=* syntax, it is designed for machine readability at the expense of human readability.

  • It conflicts with both the ext. syntax from E.123 and the - syntax from DIN 5008.

  • It cherry-picks a part of RFC 3966 that suffers from poor support in mobile operating systems – the very operating systems that matter for this key. Instead, the industry best practice is to use just a comma or semicolon, for example, tel:+15555555555,2.

  • The semicolon has long been documented as separating multiple phone numbers, so a backslash was added to escape the semicolon, even though the longstanding general syntax for multiple values uses ;; as the escape sequence. Thus this edit also introduces a novel escape syntax for the semicolon that is valid only in phone number keys but nowhere else.
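
To make the escaping conflict concrete, here is a minimal Python sketch. The function name is made up for illustration, and the splitter only approximates the general ;; convention mentioned above; it is not taken from any actual data consumer.

```python
import re


def split_values(value: str) -> list[str]:
    """Split a multi-value tag on ';', treating ';;' as an escaped literal ';'."""
    # Split only on a lone semicolon, i.e. one that is not part of a ';;' pair.
    parts = re.split(r"(?<!;);(?!;)", value)
    # Unescape ';;' back to a single ';' inside each value.
    return [part.replace(";;", ";").strip() for part in parts]


# Two phone numbers in one tag value, separated by a semicolon.
print(split_values("+1 859-255-0270;+34 96 352 54 78"))

# The backslash-escaped extension syntax, fed to the same splitter.
print(split_values("+1 859-255-0270\\;ext=2"))
```

A consumer that knows only the general convention would see the backslash-escaped value as two entries, "+1 859-255-0270\" and "ext=2", neither of which is a usable phone number.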

As far as I can tell, \;ext= is currently unsupported by OSM data consumers. The main OSM website is unable to detect the value as a phone number. Overpass turbo fails to remove the backslash, resulting in a malformed URI.

Both Organic Maps and OsmAnd remove the \;ext=, causing iOS to treat the extension as part of the subscriber number. In North America, the public telephone system will ignore these extra digits, so the user will have to enter the extension manually. On the other hand, in a country with variable-length subscriber numbers, the additional digits may connect the user to an altogether different line elsewhere in the country.
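
The difference between the two behaviors can be sketched roughly as follows. This is an illustration only: the function names are hypothetical, and the code is an assumption about the general effect, not the actual logic of Organic Maps, OsmAnd, or iOS.

```python
import re

MARKER = "\\;ext="  # the backslash-escaped marker as it appears in the tag value


def uri_dropping_marker(value: str) -> str:
    """Remove only the marker, so the extension digits fuse with the number."""
    digits = re.sub(r"[^\d+]", "", value.replace(MARKER, ""))
    return "tel:" + digits


def uri_with_pause(value: str) -> str:
    """Keep the extension separate, using a comma as a pause before it."""
    number, _, extension = value.partition(MARKER)
    uri = "tel:" + re.sub(r"[^\d+]", "", number)
    return uri + "," + extension if extension else uri


tagged = "+1 859-255-0270\\;ext=2"
print(uri_dropping_marker(tagged))  # tel:+185925502702
print(uri_with_pause(tagged))       # tel:+18592550270,2
```

In the first result the trailing 2 is indistinguishable from the subscriber number, which is the failure mode described above.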

Certainly, we could write off these issues as mere bugs, not our problem. But honestly I think these are symptoms of a poorly thought-out redefinition of the longstanding phone=* syntax.

Proposal

The documentation for phone=* should stop recommending the \;ext= syntax. These edits should be undone.

Since there’s clearly a need for telephone extensions in phone numbers, we should come up with something to replace \;ext=. Here are some options that come to mind:

  • Refer mappers to the E.123 notation, ext. or the local translation, expecting data consumers to equate any xyz. with ext. and extract the extension number.
  • Refer mappers to the E.123 notation, ext., but require a particular language such as British English rather than allowing it to be localized.
  • Allow an actual tel: URI for any phone number that can’t be represented by E.123 notation, such as phone=tel:+1-888-828-4798,2334.
  • Choose some other delimiter that doesn’t already have a special meaning, such as a comma (which would be consistent with tel: URIs) or x (a very common North American notation).
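
As a rough idea of what a data consumer would have to do under the ext., comma, or x options, here is a minimal Python sketch. The pattern and function name are assumptions made for illustration; it handles only the English "ext." spelling, not localized abbreviations, and it is no substitute for a real phone number library.

```python
import re
from typing import Optional

# Number part, then one of the candidate delimiters, then the extension digits.
EXT_PATTERN = re.compile(
    r"^(?P<number>\+?[\d\s\-()]+?)\s*(?:ext\.?|x|,)\s*(?P<ext>\d+)$",
    re.IGNORECASE,
)


def split_extension(value: str) -> tuple[str, Optional[str]]:
    """Return (number, extension); extension is None when no delimiter is found."""
    match = EXT_PATTERN.match(value.strip())
    if match:
        return match.group("number").strip(), match.group("ext")
    return value.strip(), None


print(split_extension("+43 1 71613 ext. 0"))
print(split_extension("+1-905-688-5550x3369"))
print(split_extension("+1-888-828-4798,2334"))
print(split_extension("+49 3831 2681-0"))  # DIN 5008 hyphen is left untouched
```

The last call shows why the hyphen is hard to support globally: nothing in the value distinguishes a DIN 5008 extension from an ordinary hyphen-separated digit group.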

Note that the hyphen specified by DIN 5008 would not be viable as a global recommendation, as it conflicts with the hyphens used throughout the North American Numbering Plan Area as part of E.123 notation. However, as a practical matter, I’m not currently proposing to deprecate DIN 5008 notation in German-speaking regions.

(FYI to those who have participated in related discussions or touched the relevant wiki documentation recently: @bkil @Kovoschiz @Mateusz_Konieczny @user_5589.)


  1. In theory, an ISDN subaddress would be analogous to an extension. However, I have never seen a shop, office, or other POI advertise an ISDN subaddress as its public point of contact.

  2. Both emergency:phone=* and emergency_telephone_code=* are typically set to the number verbatim, without any attempt to conform to a broader scheme.

A quick question: I am German and therefore not very familiar with the American telephone system. Do I understand correctly that, with these numbers, I call +22 607 123 4567 and then, once connected, have to say or enter the number 876 to reach the right line?

I don’t feel that way, because an extension system is generally unusual in Europe. Here, every telephone has an internal number, so the extension can be reached directly from outside. A company therefore doesn’t have one external telephone number, but instead books 10, 100, or 1000 telephone numbers, etc.
You have the main number, e.g. +49 391 54 XXXX (for example, the Magdeburg city administration), with the direct dial-in to the individual line included in the telephone number itself. If you dialed just +49 391 54, you would reach a completely different subscriber and not the city administration (if that number is assigned).

That’s how it works in the NANPA (most of North America). If you don’t pause for a couple seconds before entering the extension, then the public phone system will simply drop the rest of the numbers. This is called “overspell”, and it’s how some businesses get away with advertising phonewords that are longer than seven or ten digits.

I understand that the regions with this system seem to be mutually exclusive of the regions that use hyphens in written phone numbers. However, the problem is primarily that mappers and software developers have been misled over the years.

The wiki documentation calls the extra digits after the hyphen “direct inward dialing”, but DID is not exclusive to the situation you describe. A business in the U.S. can set up DID, but the caller is still required to pause before dialing the extension. From the caller’s perspective, it’s no different than a PBX that doesn’t use DID. So at the very least, that portion of the format needs to be described differently.

Anyhow, that’s tangential to my proposal to remove and replace the current guidance about \;ext=.

I agree that the change to the Wiki should be reverted, but coming from Germany, what you’re trying to achieve just isn’t a thing here, so I don’t have a strong opinion on it. That said, strictly speaking, the extension doesn’t seem to be an actual part of the phone number from what I understand, so maybe phone:ext=* could be used instead of adding things like commas or text to the phone field. Please correct me if I’m misunderstanding this.
The reason I am proposing this is that the phone number seems to work even without the extension, but you reach some sort of “central hub”, correct?

Possibly, but if the business sets up DID, then it might be the difference between calling a national hotline and a local landline that might have its own (hidden) ten-digit number.

Practically speaking, if an office’s sign indicates an extension, then we have to assume that the extension is required for contacting the office. By analogy, we set website=* to the full URL to the store location’s webpage, not the homepage of the store chain (brand).

We already use addr:*=* subkeys to make mailing addresses more structured, but I don’t think we should extend this approach to phone=*. For all these years, mappers and software developers have had every reason to believe that phone=* is self-contained. They would definitely be surprised if we redefine phone=* to be just part of a phone number.

Definitely, but data consumers could also be surprised if the allowed characters now included letters instead of only digits, + and -. If the extension was in phone:ext, old libraries would still parse the phone=*-part fine, and new ones would be able to infer the full number.
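
The subkey idea could look roughly like this on the consumer side. This is only a sketch: phone:ext is the key suggested in this reply, not an established tag, and the function name and the choice of “ext.” as the joining text are assumptions.

```python
from typing import Optional


def full_number(tags: dict[str, str]) -> Optional[str]:
    """Combine phone=* with the suggested phone:ext=* subkey, if present."""
    phone = tags.get("phone")
    if phone is None:
        return None
    extension = tags.get("phone:ext")
    # An older consumer would stop at `phone`; a newer one appends the extension.
    return f"{phone} ext. {extension}" if extension else phone


print(full_number({"phone": "+1 859-255-0270", "phone:ext": "2"}))
print(full_number({"phone": "+1 859-255-0270"}))
```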

But as I said: I’m not affected by this, just wanted to add some thoughts. The whole thing sounds a bit “hacky” to me :slight_smile:

Unrelated: “;;” can’t be used as universal escape sequence, because there are tags that may have a list of values that may be empty (e.g. destination tagging).

This seems like the best option to me.

That might be the best option, if we figure out that there are more special cases like this somewhere in the world.

It’s an awkward situation, to be sure: we insist on making phone=* human-readable to aid in data entry but then turn around and attempt to parse it as if it’s machine-readable. Ideally, we would standardize on tel: URIs for all phone numbers and rely on editors and data consumers to pretty-print the URIs using off-the-shelf libraries. There are off-the-shelf libraries for detecting and parsing phone numbers, too, but they reject this exotic \;ext=* syntax.

In case it’s any comfort, an extension is normally written as part of a NANP phone number, as either “1-555-555-5555 ext. 123” or “1-555-555-5555 x123”. Phone number libraries already handle these notations without any problem. I would illustrate this point with a photo of either notation on a shop sign, but extremely few shops would have an extension in their primary phone number. That’s a more common practice among obscure offices.

Unrelated: “;;” can’t be used as universal escape sequence, because there are tags that may have a list of values that may be empty (e.g. destination tagging).

you can use „;;;;“ for these :wink:

Here are some examples of popular software libraries for detecting, parsing, and formatting phone numbers:

These libraries are used very widely across the software landscape and would be very unlikely to add any special affordances for an OSM-specific format. Some of them have live demos that you can use to prototype potential syntaxes.