This discussion in OSM-US Slack’s #local-maine channel points to the USPS recommendation on stripping special characters from addresses. I’m currently working on an E911 address import for Vermont, and the state’s E911 dataset also excludes all special characters in the source data. The USPS also prefers all-caps and abbreviated St/Ave/Rd/etc., but in OSM we change case and expand abbreviations to improve completeness, legibility, and text-to-speech use cases. Are apostrophes, hyphens, and periods any different?
While the USPS may not like special characters on mailing labels, many roads may rightfully contain these on their street signs: possessive apostrophes (e.g. Gordie's Path, Wissler's Lane), proper-name apostrophes (e.g. O'Brien Farm Road), proper-name initials (e.g. F. Scott Fitzgerald Square), hyphenated surnames in a street name (e.g. Robert Baden-Powell Avenue*), or hyphenated place names in a road name (e.g. Derby-Springfield Highway*). Stripping these special characters when forming addresses to USPS guidelines is trivial compared to figuring out when and where to add them in automatically.
What do you think? When should we add special characters to addr:street=*?
Always add if written language context would imply them, even if they are dropped from signage for compactness.
Only add them if they appear on street signage (or maybe in other local government data sources such as parcel data).
Never add them — adhere strictly to USPS addressing guidelines.
Something else?
* Made up name, but I’d bet that similar things exist.
I think it makes sense to import it without trying to add the special characters. If it’s good enough for USPS & Vermont GIS then it should be good enough for OSM. Often the names are not consistent between the various sources and the signs anyway, especially where special characters are concerned. There’s a creek near me called Green, or Greens, depending on the source, but I’ve never seen it called Green’s.
We kinda already make our own rules though, right? We enjoy the ‘better’ casing (e.g. McCown) and name expansion (e.g. “FOO RD” → “Foo Road”).
My preference is for including these things, as it feels like a natural extension of the choices we’ve already made about name representation in OSM. But I understand that’s a vibe-based argument, not a technical one.
I spent a considerable amount of time looking at address tagging in the United Kingdom*. Based on that, my take is that the value in addr:street should be the same as the name tag value on a nearby highway=* feature. So if you expand abbreviations for highway name tags, then do the same for addr:street tags.
As my experience is from a different country, feel free to take or ignore my view as you please.
In my local county, Baltimore County, population 800k, a decision was made to strip all punctuation from official county address data and from all signage. The following rules were applied:
Any possessive apostrophe was simply dropped, e.g. Elliott’s Road became Elliotts Road.
Any hyphen was replaced with a space.
Any other apostrophe was dropped and the word would be smashed together. There are only two instances of this: L’Hirondelle Rd (French) became LHirondelle Rd on signage and M’Lady Rd (think, ‘my lady’, medieval style) became MLady Rd.
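For anyone who wants to compare a punctuated name against county-style data, the three rules above could be sketched roughly like this in Python. The function name `to_signed_form` is made up for illustration, and the sketch only covers apostrophes and hyphens because those are the only rules listed; other punctuation such as periods is left untouched here, which is an assumption.

```python
# Rough sketch of the Baltimore County rules described above.
# `to_signed_form` is a hypothetical helper name, not part of any OSM tool.

APOSTROPHES = ("'", "\u2019")  # straight and typographic apostrophes


def to_signed_form(name):
    """Return the name the way the county rules would render it."""
    # Possessive and non-possessive apostrophes are both simply dropped,
    # so "Elliott's" becomes "Elliotts" and "L'Hirondelle" becomes "LHirondelle".
    for mark in APOSTROPHES:
        name = name.replace(mark, "")
    # Hyphens are replaced with a space.
    name = name.replace("-", " ")
    # Collapse any doubled spaces left behind.
    return " ".join(name.split())


for example in ("Elliott\u2019s Road", "L'Hirondelle Rd", "Derby-Springfield Highway"):
    print(example, "->", to_signed_form(example))
```

Going in this direction is the easy part; nothing in the stripped output says where an apostrophe or hyphen used to be.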
Since OSM follows what the signs say, or what’s observed, I think it would be best to model these based on how the local jurisdictions decide to sign their streets. I believe FHWA guidance is to have mixed-case signs (Elliott Road, not ELLIOTT ROAD).
I wonder if it makes sense to treat decisions like this as typography/database-design decisions or to treat them as actual official renaming decisions of the “this road shall forever be known as …” type.
For various reasons besides punctuation and case, street names in addresses can differ from the street names that are usable for wayfinding. For example, the houses along this street use “Delhi Road” and “Delhi Pike” interchangeably in addresses, while the signs leave the situation quite ambiguous. Another common case involves a road that carries a numbered route; the standard address may read “US Highway 44 W” while the signs refer to “U.S. Rte. 44”, “U.S. 44”, or simply “Hwy. 44”.
I think addr:street and the street’s name should be accurate with respect to each feature. Whenever there’s a mismatch, I tag the street’s alt_name so that a geocoder will hopefully match the street and address anyways.
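To make the mismatch check concrete, here is a minimal Python sketch of how one might test an addr:street value against a street’s name and alt_name. It is not how any particular geocoder works; the function names, the "exact"/"loose"/"none" labels, and the punctuation-folding rule are all assumptions for illustration. Tags are plain dictionaries, and alt_name is treated as a semicolon-separated list.

```python
import re


def name_candidates(tags):
    """Collect the street's name plus any semicolon-separated alt_name values."""
    values = []
    for key in ("name", "alt_name"):
        raw = tags.get(key, "")
        values.extend(v.strip() for v in raw.split(";") if v.strip())
    return values


def fold(text):
    """Comparison key: drop apostrophes and periods, turn hyphens into spaces, ignore case."""
    text = re.sub(r"['\u2019.]", "", text)
    text = text.replace("-", " ")
    return " ".join(text.split()).casefold()


def match_quality(addr_street, street_tags):
    """Classify how well an addr:street value matches a street's tags."""
    candidates = name_candidates(street_tags)
    if addr_street in candidates:
        return "exact"   # identical to name or one of the alt_name values
    if fold(addr_street) in {fold(c) for c in candidates}:
        return "loose"   # differs only in punctuation or letter case
    return "none"        # a candidate for adding alt_name on the street


street = {"name": "Delhi Road", "alt_name": "Delhi Pike"}
print(match_quality("Delhi Pike", street))
print(match_quality("Zooks Nook", {"name": "Zook's Nook"}))
```

A "loose" result is the interesting one for this thread: the address and the street agree except for the special characters being debated.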
The MUTCD preference for mixed case is relatively new; we’re still in a national period of transition from uppercase.
Street name signs are an important factor but not necessarily the final word on the subject. For one thing, some jurisdictions drop the suffix from all their roads’ street name signs as a matter of policy, even if they acknowledge the suffix in writing and speech. (Maps do this too to squeeze more labels in tight spaces.)
The MUTCD also prohibits periods, apostrophes, and hyphens from guide signs, regardless of the road’s actual name. Thus, John F. Kennedy’s name is signposted as “John F Kennedy”. But whenever an abbreviation is warranted, retaining the period is useful to data consumers, particularly those that send the street name to a text-to-speech engine. This is more relevant to name than addr:street, but geocoders do benefit from exact matches between the two keys, and the USPS has very robust systems to ensure that periods in addresses do not delay the delivery of snail mail.
Somewhere there should be a definitive source for what the name of a road is, and whatever that source says is what goes in the name tag. I wouldn’t put much stock in USPS, or, in the case of Illinois, anything the 911 standards suggest.
If, for example, the subdivision plat says “Zook’s Nook” (that’s a real example!), then I would expect OSM to show name=Zook's Nook. I don’t think it needs to be more complicated or conditional than that.
I’ll admit, though, that finding the original source of a street’s name isn’t always straightforward. Try to dig a little deeper than parcel data, though. Our county’s assessment database strips special characters and is in all caps, so the displayed street names on parcels are not representative of the legally recorded names.
In the county where I worked, the county was indeed the authority. However, there was a regular mailing list through which any changes, updates, or additions were sent to a plethora of external maintainers of street name and addressing databases. Our list included:
State Highway Admin
Verizon
Power Company
USPS
Cable TV provider
Google Maps
Each of these had their own version and our mailing list was designed to keep them all on the same page.
I’d add in those quotation marks, if only to add to the ever-growing list of “Things software developers incorrectly believe they can assume about addresses and street names.”
No joke, this is one of the reasons signmakers used to employ quotation marks, to distinguish “I” from “1” and “N” from “north”. If the quotation marks only appear on certain signs, another interpretation would be not:name – just as long as it’s recorded somehow so we can chuckle at it in the future.
A more typical, modern reason for quotation marks is when a street is named after someone who goes by a nickname.