This discussion in OSM-US Slack’s #local-maine channel points to the USPS recommendation on stripping special characters from addresses. I’m currently working on an E911 address import for Vermont, and the state’s E911 dataset also excludes all special characters in the source data. The USPS also prefers all-caps and abbreviated St/Ave/Rd/etc., but in OSM we change case and expand abbreviations to improve completeness, legibility, and text-to-speech use-cases. Are apostrophes, hyphens, and periods any different?
While the USPS may not like special characters on mailing labels, many roads may rightfully contain these on their street signs: possessive apostrophes (e.g. Gordie's Path, Wissler's Lane), proper-name apostrophes (e.g. O'Brien Farm Road), proper-name initials (e.g. F. Scott Fitzgerald Square), hyphenated surnames in a street name (e.g. Robert Baden-Powell Avenue*), or hyphenated place names in the road name (e.g. Derby-Springfield Highway*). Stripping these special characters when forming addresses to USPS guidelines is trivial compared to figuring out when and where to add them in automatically.
What do you think? When should we add special characters to addr:street=*?
Always add if written language context would imply them, even if they are dropped from signage for compactness.
Only add them if they appear on street signage (or maybe in other local government data sources such as parcel data).
Never add them — adhere strictly to USPS addressing guidelines.
Something else?
* Made up name, but I’d bet that similar things exist.
I think it makes sense to import it without trying to add the special characters. If it’s good enough for USPS & Vermont GIS then it should be good enough for OSM. Names are often inconsistent between the various sources, and between the sources and the signs, especially where special characters are concerned. There’s a creek near me called Green, or Greens, depending on the source, but I’ve never seen it called Green’s.
We kinda already make our own rules though right? We enjoy the ‘better’ casing (ex: McCown) and name expansion (ex: “FOO RD” → “Foo Road”).
My preference is for including these things, as it feels like a natural extension of the choices we’ve already made about name representation in OSM. But I understand that’s a vibe-based argument, not a technical one.
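For anyone who hasn't seen what that casing and expansion step looks like in practice, here is a rough sketch. It is not taken from any real import script: the `SUFFIXES` table is a tiny illustrative sample, and `fix_case` and `expand_street_name` are hypothetical helper names. The "Mc" handling is deliberately naive and would need review against real data.

```python
# Illustrative, incomplete suffix table (an assumption for this sketch);
# a real import would need a much fuller list.
SUFFIXES = {"RD": "Road", "ST": "Street", "AVE": "Avenue", "LN": "Lane"}


def fix_case(word: str) -> str:
    """Convert an all-caps word to mixed case, with naive 'Mc' handling."""
    titled = word.capitalize()
    # MCCOWN -> Mccown -> McCown. This is a guess, not a rule: it would
    # also turn a hypothetical "MCKINLEY" into "McKinley", which is fine,
    # but any word that merely starts with "Mc" gets the same treatment.
    if titled.startswith("Mc") and len(titled) > 2:
        titled = "Mc" + titled[2:].capitalize()
    return titled


def expand_street_name(raw: str) -> str:
    """Expand a trailing suffix abbreviation and fix the case of the rest."""
    words = raw.split()
    last = len(words) - 1
    out = []
    for i, word in enumerate(words):
        key = word.upper().rstrip(".")
        # Only expand the final word, and only if something precedes it,
        # so that a leading "ST" (Saint) is not turned into "Street".
        if i == last and i > 0 and key in SUFFIXES:
            out.append(SUFFIXES[key])
        else:
            out.append(fix_case(word))
    return " ".join(out)


print(expand_street_name("FOO RD"))     # Foo Road
print(expand_street_name("MCCOWN ST"))  # McCown Street
```

Note that nothing in here can put an apostrophe or hyphen back: that information simply isn't in the all-caps source.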
I spent a considerable amount of time looking at address tagging in the United Kingdom*. Based on that, my take is that the value in addr:street should be the same as the name tag value on a nearby highway=* feature. So if you expand abbreviations for highway name tags, then do the same for addr:street tags.
As my experience is from a different country, feel free to take or ignore my view as you please.
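One way to apply that rule during an import might be to look up each stripped address street name against the names of nearby highways, ignoring punctuation, and take the highway's spelling when there is a match. The sketch below assumes you already have the list of nearby highway names from somewhere; `match_key` and `find_highway_name` are made-up helper names, not part of any existing tool.

```python
import re
from typing import Optional, Sequence


def match_key(name: str) -> str:
    """Build a comparison key that ignores case and the punctuation at issue."""
    # Drop apostrophes (straight and curly) and periods.
    name = re.sub(r"['\u2019.]", "", name)
    # Treat hyphens as spaces, then collapse runs of whitespace.
    name = name.replace("-", " ")
    return " ".join(name.split()).casefold()


def find_highway_name(addr_street: str,
                      highway_names: Sequence[str]) -> Optional[str]:
    """Return the nearby highway name that matches addr_street, if any."""
    # Prefer an exact match.
    if addr_street in highway_names:
        return addr_street
    # Otherwise fall back to a punctuation-insensitive comparison.
    key = match_key(addr_street)
    for candidate in highway_names:
        if match_key(candidate) == key:
            return candidate
    return None


nearby = ["Gordie's Path", "Main Street"]
print(find_highway_name("Gordies Path", nearby))  # Gordie's Path
print(find_highway_name("Elm Street", nearby))    # None
```

This only moves the question to the highway name tag, of course: somebody still has to decide whether the road itself gets the apostrophe.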
In my local county, Baltimore County, population 800k, a decision was made to strip all punctuation from official county address data and from all signage. The following rules were applied:
Any possessive apostrophe, e.g. Elliott’s Road, would simply drop the possessive apostrophe, e.g., Elliotts Road.
Any hyphen was replaced with a space.
Any other apostrophe was dropped and the word would be smashed together. There are only two instances of this: L’Hirondelle Rd (French) became LHirondelle Rd on signage and M’Lady Rd (think, ‘my lady’, medieval style) became MLady Rd.
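As a minimal sketch, the three rules above boil down to very little code. This is my own reading of the rules, not the county's actual procedure, and `strip_punctuation` is a hypothetical name. Both apostrophe rules end up doing the same thing (delete the apostrophe and close the gap), so they share one step.

```python
import re

# Straight and curly apostrophes, since source data may contain either.
APOSTROPHES = "'\u2019"


def strip_punctuation(name: str) -> str:
    """Apply the three stripping rules described above to a street name."""
    # Possessive and non-possessive apostrophes are both simply removed:
    # "Elliott's" -> "Elliotts", "L'Hirondelle" -> "LHirondelle".
    for ch in APOSTROPHES:
        name = name.replace(ch, "")
    # Hyphens become spaces.
    name = name.replace("-", " ")
    # Collapse any doubled whitespace left behind.
    return re.sub(r"\s+", " ", name).strip()


print(strip_punctuation("Elliott's Road"))   # Elliotts Road
print(strip_punctuation("L'Hirondelle Rd"))  # LHirondelle Rd
print(strip_punctuation("M'Lady Rd"))        # MLady Rd
```

Going the other way is the hard part: given only "Elliotts Road", there is no way to tell from the string whether the original was "Elliott's", "Elliotts", or "Elliotts'".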
Since OSM follows what the signs say, or what’s observed, I think it would be best to model these based on how the local jurisdictions decide to sign their streets. I believe FHWA guidance is to have mixed-case signs (Elliott Road, not ELLIOTT ROAD).
I wonder if it makes sense to treat decisions like this as typography/database-design decisions or to treat them as actual official renaming decisions of the “this road shall forever be known as …” type.