The addr:*=* subkeys generally correspond to distinct pieces of information that can be assembled into a full address. The main USPS address format has only one “slot” for the postal city, which is generally tagged as addr:city=*, regardless of what the postal city refers to. Nevertheless, some imports have introduced other address subkeys that data consumers may not expect as part of a U.S. address.
There are 6,326 occurrences of addr:hamlet=*, addr:town=*, addr:township=*, or addr:village=* in the United States. The vast majority of them are in Indianapolis and Cold Springs, California.
To the extent that these tags occur without addr:city=*, I suspect it would be safe to simply rename the key to addr:city=*. However, it would probably be a good idea to investigate each region in depth to make sure these tags weren’t derived from administrative boundaries, which could make them misleading as postal cities.
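For what it's worth, the simple rename case could look roughly like the sketch below. It assumes each feature's tags are available as a plain Python dict. `rename_to_city` is a hypothetical helper, not part of any editor or import tool, and it deliberately skips anything ambiguous so a human can review it.

```python
# Hypothetical helper working on a plain dict of OSM tags (an assumption,
# not a real editor API).
PLACE_SUBKEYS = ("addr:hamlet", "addr:town", "addr:township", "addr:village")


def rename_to_city(tags):
    """Return a copy of tags with a lone place subkey renamed to addr:city.

    The tags are returned unchanged when addr:city is already present, or
    when zero or several place subkeys are present (needs manual review).
    """
    present = [key for key in PLACE_SUBKEYS if key in tags]
    if "addr:city" in tags or len(present) != 1:
        return dict(tags)
    result = dict(tags)
    # Move the value from the unexpected subkey to addr:city.
    result["addr:city"] = result.pop(present[0])
    return result
```

This only covers the mechanical part. Whether the value is really a postal city, rather than something derived from an administrative boundary, still has to be checked per region.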
There are 439,573 occurrences of addr:county=* in the United States, concentrated in several states and particularly in several counties within those states.
At a glance, most of these addr:county=* occurrences amount to something that we would’ve tagged as is_in=* in the olden days. Would there be an appetite for removing them en masse?
Tidying up is good and keeps new folks from getting the wrong idea. I support the bulk edits as described (removal of addr:county, review and bulk move of the others).
You sort of mention this, but it’d be interesting to see if these are mostly dual-tagged with addr:city or have the other tag instead, and if it’s the former, whether the tags have the same value.
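A rough way to get those numbers, assuming the features have already been downloaded and each one's tags are a plain dict: bucket every feature by how its place subkeys relate to addr:city. `classify_place_tags` and the bucket names are made up for illustration, and the case-insensitive comparison is just one possible choice.

```python
from collections import Counter

PLACE_SUBKEYS = ("addr:hamlet", "addr:town", "addr:township", "addr:village")


def classify_place_tags(tags):
    """Bucket a feature by how its place subkeys relate to addr:city."""
    place_values = [tags[key] for key in PLACE_SUBKEYS if key in tags]
    if not place_values:
        return "no_place_subkey"
    city = tags.get("addr:city")
    if city is None:
        return "place_only"
    # Compare ignoring case and surrounding whitespace (an assumption).
    normalized_city = city.strip().casefold()
    if all(value.strip().casefold() == normalized_city for value in place_values):
        return "dual_same"
    return "dual_different"


def summarize(features):
    """Count how many tag dicts fall into each bucket."""
    return Counter(classify_place_tags(tags) for tags in features)
```

The "dual_different" bucket is probably the one worth looking at by hand first.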
Moving addr:hamlet, addr:town to addr:city makes sense to me.
But I think I would also check to see if there is a ZIP code tagged. For each ZIP code the USPS typically gives the preferred name and some alternate names. If the hamlet, village, or town name is one of the names the USPS associates with that ZIP code then I think moving the value to the addr:city tag would be pretty safe.
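Something like the following is what I have in mind. It is only a sketch: `names_by_zip` is a hypothetical lookup table that you would have to build yourself from USPS data, and `usps_accepts` is a made-up name. The ZIP code is read from addr:postcode and cut down to five digits so that ZIP+4 values still match.

```python
def usps_accepts(tags, place_key, names_by_zip):
    """Return True if the value of place_key is a name listed for the ZIP code.

    names_by_zip is assumed to map a 5-digit ZIP string to the preferred and
    alternate city names the USPS associates with it.
    """
    place = tags.get(place_key)
    zip5 = tags.get("addr:postcode", "").strip()[:5]
    if place is None or len(zip5) != 5 or not zip5.isdigit():
        return False  # No usable ZIP code or place name: leave for manual review.
    accepted = {name.strip().casefold() for name in names_by_zip.get(zip5, ())}
    return place.strip().casefold() in accepted


# Placeholder data for illustration only; not real USPS names.
example_names = {"00000": ["Exampleville", "Example Village"]}
example_tags = {"addr:postcode": "00000-1234", "addr:village": "Example Village"}
print(usps_accepts(example_tags, "addr:village", example_names))
```

Features that fail the check would not necessarily be wrong; they would just need a closer look before moving the value.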
I support this change. I ran an overpass query on Lowndes County, GA for the county tag and there’s only 2 results, both libraries.
These were recently added from the institute of museum and library services by @AriannaL so I suspect other libraries ingested by this import have the addr:county tag. Perhaps Arianna will have more thoughts on this.
Interesting, was there an Aldi import at some point? I have a gut feeling that most of these occurrences are from imports, where there might’ve been a thought to map each field of the external dataset to an OSM key, one for one.
I’m missing the full context of the is_in=* deprecation - what is the benefit of removing the addr:county=* tags? The accurate ones, I mean, not the typos of addr:country=*.
@watmildon and I have gone back and forth a bit over addr:country. I generally don’t see the harm in leaving them.
is_in=* was deprecated many years ago when it became redundant to the first generation of OSM-based geocoding software. Some subkeys like is_in:state=* and is_in:city=* remain on route relations, but only for disambiguating otherwise confusable routes. (In other words, those keys affirm the intended scope of the relation.)
addr:county=* could potentially mislead data consumers to think that the address can contain this information. Unlike with addr:country=*, inserting a county name into the address would likely cause mail delivery issues, unless the county happens to serve as a postal city (possibly in some rural areas). The USPS isn’t the only addressing authority, but county addressing authorities are unlikely to consider the county part of the address either. Similar interoperability issues could occur when using OSM address data with other systems.
I noticed the erroneous usage of these keys after iD added support for addr:town=*. That key is legitimate for some countries like South Korea where the address format has a spot for the town as opposed to some other place. It doesn’t appear in iD when editing a feature in the U.S., but some mappers using the raw tag editor might still encounter the key due to its global prevalence.
Between your explanations and some digging through USPS documentation (like this), I can only conclude that addr:county=* does indeed merit an automated cleanup, your plans seem solid, and your effort would be clearly beneficial and appreciated. Happy mapping!
I grew up 2 miles from the Wyoming border and have traveled extensively in the northern high plains, and I’ve never seen that anywhere. I mean, sure, putting “Crook County, WY 82721” (for example) might work if you get the zip code right but USPS will find the nearest city to use as the postal city, even if it’s 60 miles away.
Counties as postal cities definitely aren’t the norm, but I’m not confident about ruling them out completely, because there are nearly 80,000 addr:city=* County occurrences. Are all of them erroneous?
After looking at the map of that Overpass query (very slow!), a lot of these appear to have come from an erroneous McDonald’s import. Of the others I saw, none look right to me. Given the nature of how USPS assigns ZIP codes, a mechanical edit would be kinda hard here; I would feel more comfortable doing it manually.