Waukesha County, Wisconsin address points import

In this case, wouldn’t it be addr:place=Green Dragonfly Island rather than addr:street=Green Dragonfly Island?

From what I’ve seen here, it’s usually just smooshed together.

  1. I am not sure of the consequences of that approach for the data consumer. Will most consumers of address data in the US expect only addr:housenumber, addr:street, addr:city, addr:postcode, and addr:state? Maybe it is ok, or even preferred, idk.

  2. In “cleaned_addresses.osm” you had this as:

addr:street=Green Dragonfly Isl
(addr:street not addr:place)

  1. Either way, while the QC tools might flag this, it will be easy for someone doing the import to dismiss the flag, as there is no street, but that false positive masks the real issue of “Isl” not being expanded to “Island”.

  2. If you do want to use addr:place, you should search for addr:street tags that end in “Isl”, “Island”, and “Plaza” (and perhaps others).

  3. Incidentally, the USPS doesn’t seem to recognize the address in question. Perhaps it is just for E911 purposes (which is fine, just pointing that out).
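The search suggested in item 2 could be sketched roughly as follows. This is only an illustration: it assumes the file is plain OSM XML, the function name and suffix list are hypothetical, and the sample coordinates and house numbers are made up for the example.

```python
import xml.etree.ElementTree as ET

# Final words that suggest the value names a place rather than a street.
# The list is an assumption taken from the suggestion above; extend as needed.
PLACE_SUFFIXES = ("Isl", "Island", "Plaza")


def find_place_like_streets(osm_xml, suffixes=PLACE_SUFFIXES):
    """Return (element id, addr:street value) pairs whose last word is in suffixes."""
    root = ET.fromstring(osm_xml)
    hits = []
    for elem in root.iter():
        if elem.tag not in ("node", "way", "relation"):
            continue
        for tag in elem.findall("tag"):
            if tag.get("k") != "addr:street":
                continue
            value = tag.get("v", "")
            words = value.split()
            if words and words[-1] in suffixes:
                hits.append((elem.get("id"), value))
    return hits


# Made-up sample data in OSM XML form (ids, coordinates, and numbers are invented).
SAMPLE = """<osm version="0.6">
  <node id="-1" lat="43.0" lon="-88.2">
    <tag k="addr:housenumber" v="1"/>
    <tag k="addr:street" v="Green Dragonfly Isl"/>
  </node>
  <node id="-2" lat="43.0" lon="-88.2">
    <tag k="addr:housenumber" v="200"/>
    <tag k="addr:street" v="Main Street"/>
  </node>
</osm>"""

print(find_place_like_streets(SAMPLE))
```

Matching on the last whole word, rather than on a raw string suffix, avoids flagging unrelated names that merely end in the same letters.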

Anyway, from my point of view, with just a few important changes this import should be ready to go.

Unlikely. I think we should think of keys like addr:street and addr:city as mapping to different slots in a prototypical address, rather than thinking too hard about the semantics of the values. Some addressing systems have a slot that normally isn’t a street name, or that comes before or after a street name. addr:place might be more applicable and useful to addresses in Puerto Rico, which differ from mainland addresses, but I don’t have a good sense of how it’s being used there.

That is a good way to put it @Minh_Nguyen. A data consumer will probably expect addr:street rather than addr:place for an address in CONUS regardless of whether addr:street corresponds to an actual street.

I finally got around to fixing the issues which were pointed out. The “tracts” as well as cleaned_addresses.osm.gz have been updated. A few notes:

I opted for “US Highway” rather than “United States Highway” after seeing that the USPS lookup tool only accepts “US Highway”.

In this case it appears that OSM was in error (at least USPS seems to prefer it without “Court”). I left a note so someone can check what it says on the sign.

After investigation, it appears that both OSM and the dataset were wrong: the sign says “Campbell Trace” on Bing Streetside.

I saw a lot of these in my Milwaukee import, where a building on a corner would have two addresses on two different streets depending on where the entrance to each unit was located.

I investigated this as well. It turns out a leading space messed up the parsing. In any case, I added these 4 back manually.

I also uploaded the file with address points which were excluded based on the comments field (comments like “not sure if valid” etc.):

Thanks for looking into these.

So are you keeping both? If not, how are you deciding which one to keep? If you are keeping both, will you displace them a few meters in the direction of the street to which the address refers?

Good detective work! Perhaps also open a ticket with the parser software. A leading (or trailing) space shouldn’t break things.
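For illustration, here is a minimal sketch of whitespace-tolerant splitting. The thread doesn’t say which parser or input format was involved, so the `split_address` helper and the “number, then street” layout are assumptions, and the sample address is invented.

```python
def split_address(raw):
    """Split 'NUMBER STREET' into (housenumber, street), tolerating stray whitespace.

    Returns None when the input does not contain both parts.
    """
    # split() without a separator skips leading whitespace before the first field.
    parts = raw.split(maxsplit=1)
    if len(parts) != 2:
        return None
    number, street = parts
    # Collapse repeated inner spaces and drop trailing whitespace in the street part.
    return number, " ".join(street.split())


print(split_address(" 123  Main Street "))
```

Normalizing whitespace before splitting means a stray space in the source data changes nothing downstream instead of silently dropping the record.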

Sorry for not clarifying. I kept both, offsetting each slightly toward the road it is associated with so they are easier to work with.
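One way such an offset could be computed is sketched below. This is not necessarily how it was done for this import: the `offset_toward` function is hypothetical, it assumes you already know a point on the associated road, and it uses a flat-earth approximation that is only reasonable for moves of a few meters.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, good enough for tiny offsets


def offset_toward(lat, lon, target_lat, target_lon, meters=3.0):
    """Move (lat, lon) up to `meters` toward (target_lat, target_lon)."""
    lat_rad = math.radians(lat)
    # Local east/north distances to the target, in meters.
    dx = math.radians(target_lon - lon) * math.cos(lat_rad) * EARTH_RADIUS_M
    dy = math.radians(target_lat - lat) * EARTH_RADIUS_M
    dist = math.hypot(dx, dy)
    if dist == 0:
        return lat, lon
    step = min(meters, dist)  # never overshoot the target
    east = dx / dist * step
    north = dy / dist * step
    new_lat = lat + math.degrees(north / EARTH_RADIUS_M)
    new_lon = lon + math.degrees(east / (EARTH_RADIUS_M * math.cos(lat_rad)))
    return new_lat, new_lon


# Made-up coordinates: the target is due north of the address point.
print(offset_toward(43.0, -88.2, 43.001, -88.2))
```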

Agreed. There’s no need to aggressively expand abbreviations that no one would need to expand in a practical use case. (Another example is middle initials: we can keep naming the bridge “John F. Kennedy”.) I suspect TIGER expands it to “United States” because its base name field is case-insensitive; US would create ambiguity between “U.S.” and “Us”. The TIGER import papered over this nuance by title-casing the name fields.

The USPS aggressively eschews punctuation for similar databasey reasons. I personally prefer to add periods to abbreviations wherever they’re missing. OSM geocoding results can be embedded in prose, e.g., “The trace then heads down US Highway 14,” and unlike TIGER or the USPS, we represent street names as unstructured, freeform text where individual words become more ambiguous. Data consumers can more reliably strip out punctuation than add it in themselves. Whether to include periods in “U.S.” is a matter of personal taste, and the popular style guides contradict each other on this point.
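To illustrate why stripping is the easy direction, here is a minimal sketch of what a data consumer might do before comparing names. The `strip_periods` helper is hypothetical, not taken from any particular geocoder.

```python
def strip_periods(name):
    """Remove periods and normalize spacing, leaving letter case untouched."""
    return " ".join(name.replace(".", "").split())


print(strip_periods("U.S. Highway 14"))   # US Highway 14
print(strip_periods("John F. Kennedy"))   # John F Kennedy
```

Going the other way would require knowing that “US” is an abbreviation and “Us” is not, which plain text alone doesn’t tell you.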

Obviously this is the least of your concerns, but don’t feel pressured to strip out punctuation just because the USPS doesn’t use it.

I think I’ll keep it without the periods, partly out of personal preference and partly because current usage of addr:street=* seems to favor this formatting (see the addr:street key page on OpenStreetMap Taginfo). But it may make sense to open a separate topic, either here or on the US Slack, to see if there is any value in normalizing this to one formatting or the other.

I think that’s fair. This is an address tag, anyhow, not a name tag on a roadway. A geocoder ought to be able to match addresses to streets flexibly enough to account for this distinction. It’s the “United States” spelling that’s the real outlier.

I finally got started on the import. Feel free to contribute by taking one or more tracts (see the “Workflow” section of the wiki page for instructions, and put your name next to the tract you are working on).

Did a section this afternoon. Looking forward to doing more in the near future!

The import is now complete. Thanks for everyone’s help! :grin:

Wahoo! Great work. I look forward to whatever your next adventure is. Hopefully I’ll have some more time the next round!