Proposal to replace documented syntax for telephone extensions

Limiting these statistics to the North American Numbering Plan Area, where PBX extensions are common in the real world, we can see that the informal syntax is the most common syntax, about twice as common as the standard E.123 format (in English, French, and Spanish):

| Syntax | Example | Prevalence |
|---|---|---|
| Informal North American | +1-905-688-5550x3369 | 183 |
| E.123 | +1-978-750-1900 ext. 4309 | 94 |
| Escaped semicolon | +1-802-464-1100\;ext=4697 | 66 |
| DIN 5008 | +1 410 235 8744-5 | 1 |

This syntax is spread across multiple metropolitan regions, suggesting organic usage. It’s prevalent across English- and French-speaking regions, though we have only one example from a Spanish-speaking region.

If we make the period in “ext.” optional and allow the plural form “exts.”, then the E.123 format rises to 150 occurrences, putting it in a close second, with a few occurrences in the Caribbean.
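As a rough illustration of that relaxed matching, here is a minimal Python sketch. The pattern and the `split_extension` helper are my own assumptions for illustration, not the query actually used to produce these counts, and it only handles a single extension after the token.

```python
import re

# Hypothetical pattern: a "+"-prefixed base number, whitespace, "ext" or
# "exts" with an optional period, then the extension digits.
RELAXED_E123_EXT = re.compile(
    r"^(?P<base>\+[\d\s-]+?)\s+exts?\.?\s*(?P<ext>\d+)$",
    re.IGNORECASE,
)


def split_extension(value):
    """Return (base, extension) if value uses a relaxed E.123 form, else None."""
    match = RELAXED_E123_EXT.match(value.strip())
    if match is None:
        return None
    return match.group("base"), match.group("ext")


print(split_extension("+1-978-750-1900 ext. 4309"))
print(split_extension("+1-978-750-1900 exts 4309"))
print(split_extension("+1-905-688-5550x3369"))
```

The first two values split into a base number and an extension; the third uses the informal x syntax, so this pattern deliberately does not match it.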

Although the E.123 format is standard globally, it’s too fragile and not as familiar. There are even more occurrences of this syntax if we allow for other typos. So we would have to be very fussy about how to format the “ext.” token in each language, only to turn around and tell data consumers they should accept any language’s abbreviation in that position.

I’m leaning toward documenting the unofficial x123 syntax as the preferred syntax for POIs within the NANPA, alongside the existing allowance for extra dashes in this same region. The guidance about DIN 5008 in German-speaking regions would remain unchanged. Are there any objections to this regionalized approach?

So you’re saying that if I tap a phone number with such an extension in, say, OsmAnd or CoMaps, they will handle it correctly (i.e. call only the base number and then ask me to dial the extension)? Or do something else that works?

It seems to me that Key:phone is not documented for either case (or at least a quick skim of the page did not find a “Phone Extension(s)” section). So it would seem to me:

  • actual existing popular usages should be documented on the wiki (regardless of whether they were “recommended” or not). According to the numbers, currently only DIN has more than a thousand uses (but E.123 is close).
  • popular apps at least should widely support those popular usages. If support for existing tags is missing in apps, then I would prefer recommending E.123 worldwide, as it is both human- and machine-readable (as opposed to both DIN and Informal North American, which are totally confusing to anyone not experienced with them).

For automated PBXs, the phone is supposed to pause for a second or two before dialing the extension. Dialing only the base number could lead somewhere so generic as to be unhelpful. Dialing the extension immediately after the base number will have the same effect, because the public system ignores extra digits. Otherwise, if the office still has a human operator, anything dialed after the pause will be ignored and the user will have to speak to them anyways.

OsmAnd and CoMaps both have custom logic for converting a phone=* value into a tel: URI (which is what iOS requires for placing a call). As far as I can tell, both would mishandle every possible syntax for specifying an extension. The good news is that it’s a simple fix: replace x with a comma (,) and exempt the comma from the step that strips non-digits.
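To make the fix concrete, here is a minimal Python sketch of the idea. It is not the actual OsmAnd or CoMaps code; `phone_to_tel_uri` is a hypothetical helper, and it assumes the dialer treats a comma as a pause, as described above.

```python
def phone_to_tel_uri(value):
    """Convert a phone=* value into a tel: URI, keeping an x extension as a pause."""
    # Assumption: "x" only ever appears as the extension separator.
    dial = value.strip().lower().replace("x", ",")
    # Keep ASCII digits and the pause comma; drop spaces, hyphens, and so on.
    kept = "".join(ch for ch in dial if ch in "0123456789,")
    # Preserve the leading "+" of an international number.
    prefix = "+" if dial.startswith("+") else ""
    return "tel:" + prefix + kept


print(phone_to_tel_uri("+1-905-688-5550x3369"))  # tel:+19056885550,3369
print(phone_to_tel_uri("+1 802 464 1100"))       # tel:+18024641100
```

A value without an extension passes through unchanged apart from the stripped separators, so the extra step should not affect ordinary numbers.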

For more robust phone number formatting and parsing, these and similar applications could migrate to any popular library, as long as we tag either x123 or ext. 123 (E.123) syntax. (Maybe DIN syntax also works for German numbers; I have no idea.)

The phone=* documentation currently recommends “the ITU-T E.123 and the DIN 5008 pattern”. Back when I started this discussion, it also recommended a nonstandard \;ext= notation inspired by RFC 3966 (but not actually compatible with it). I’ve since removed that recommendation.

For some reason, the contact:*=* documentation is more explicit, specifying:

phone and fax number format (DIN 5008): +<country_code> <national_destination_code> <subscriber_number>-<direct_inward_dialing>

Some mappers have interpreted this as specifying DIN’s DID notation for PBXs too. So we have German-speaking mappers familiar with this syntax occasionally tagging an unusable syntax in North America based on a global English wiki page. The global statistics I posted earlier have an important caveat: 83,928 looks like a huge number, but it technically isn’t possible to distinguish DIN’s DID notation from the hyphens that North Americans predominantly use (consistent with E.123). If we exclude hyphens in the NANPA from this figure, it drops down to just 5,353.
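The effect of that exclusion can be sketched in Python as follows. This is only an assumed approximation of the filtering, not the query behind the figures above: `count_hyphenated` is a hypothetical helper, it treats any value starting with +1 as being in the NANPA, and the German number in the sample list is invented for illustration.

```python
def count_hyphenated(values, exclude_nanp=False):
    """Count phone values containing a hyphen, optionally skipping +1 numbers."""
    count = 0
    for value in values:
        number = value.strip()
        if "-" not in number:
            continue
        # Country code 1 is shared by the whole NANP, so "+1" identifies it.
        if exclude_nanp and number.startswith("+1"):
            continue
        count += 1
    return count


samples = [
    "+1-905-688-5550x3369",  # North American hyphens, not DID notation
    "+1 410 235 8744-5",     # DIN-style DID suffix tagged in North America
    "+49 30 1234567-89",     # invented German example with a DID suffix
    "+1 802 464 1100",       # no hyphen at all
]
print(count_hyphenated(samples))                     # 3
print(count_hyphenated(samples, exclude_nanp=True))  # 1
```

Note that a hyphen alone cannot tell the first two samples apart, which is exactly the ambiguity: the filter can only drop the whole region.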

Given the questions about expected behavior in this thread, I’m still not sure whether a PBX extension is practically the same thing as the “direct inward dialing” that we document with DIN notation, so I am comfortable with recommending a different standard for a different technology.

For me, it’s really a tossup between E.123 and the x notation even in North America. I’m leaning toward the x notation based in part on very positive feedback in OSMUS Slack but can still be swayed toward E.123. The main consideration to me is how we’d ensure that mappers adhere to this very finicky format that’s almost an afterthought in the standard. The E.123 standard is explicitly for human readability, not machine-readability:

This Recommendation applies specifically to the printing of national and international telephone numbers, electronic mail addresses and Web addresses on letterheads, business cards, bills, etc. Regard has been given to the printing of existing telephone directories. The standard notation for printing telephone numbers, E-mail addresses and Web addresses helps to reduce difficulties and errors, since this address information must be entered exactly to be effective.

As OSM gains traction on mobile platforms, data consumers increasingly parse phone=* instead of just leaving it verbatim. This makes E.123 problematic overall and especially for its extension notation. However, the standard remains somewhat relevant, because phone=* started out as a human-readable key, and “ext.” and its many variants do have broad use in OSM. If we do standardize on E.123 for PBXs, editors and data consumers would definitely have to adopt a rigorous parsing library instead of rolling their own custom logic. This includes not only mobile applications but also osm-website and tag2link (used in JOSM, Overpass turbo, and others).

Far and away the worst choice here would be the DIN standard with a hyphen separator. Machine readability is limited because the hyphen is skipped over entirely, not adding the requisite pause when dealing with an automated PBX. Human readability is also compromised because the use of a hyphen separator directly conflicts with the commonplace practice in North America for separating area codes, local exchange codes and subscriber codes from each other (XXX-XXX-XXXX). (Which is why PBXs skip over the hyphens.)

It’s pretty much totally unfamiliar for anyone outside of—at the very least—Europe, if not only Germany.
