Waukesha County, Wisconsin address points import

It’s a good idea (although it may be odd to see lots of building=yes + building:use=residential). I’ll take note of the potential for using this in the future, as at the moment I’d rather focus my attention on addresses.

I couldn’t find one. Would this help in any way that couldn’t be done by comparing with the surrounding road names? I could ask the data owner if it has enough of a benefit to us.

I guess a lot of this work has already been done, so it’s moot now. But maybe we missed something, so if they have documentation handy, it couldn’t hurt.

You are very welcome. Thanks for considering feedback. Of course, some are more than “formatting errors”.

That is great, but sometimes there is no street associated with the address. For example:

addr:city=Pewaukee
addr:housenumber=N30W27391
addr:postcode=53072
addr:state=WI
addr:street=Green Dragonfly Isl

Of course all require investigation as to local use.

I have looked further into this and indeed in the original source data there are two address points at the same location (or as near as I can tell). I suggest they both be left in the data, and upon import, perhaps separated by a couple of meters for cartographic purposes. It would be interesting to talk to local authorities (or do some field work) to determine why this is done. Perhaps these are duplexes and there really are two separate residences in the same building?
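
One way to find such coincident address points before the import is to group the points by rounded coordinates. This is only a rough sketch: it assumes each point is a dict with `lat` and `lon` keys, and the function name and the rounding precision are hypothetical choices, not part of the import scripts.

```python
from collections import defaultdict

def find_coincident(points, precision=6):
    """Return groups of points that share the same rounded coordinates."""
    groups = defaultdict(list)
    for point in points:
        # Six decimal places is roughly 0.1 m; an assumed tolerance.
        key = (round(point["lat"], precision), round(point["lon"], precision))
        groups[key].append(point)
    # Keep only locations that hold more than one address point.
    return [group for group in groups.values() if len(group) > 1]
```

Each returned group can then be reviewed by hand or nudged apart for cartographic purposes.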

I have investigated these:

In the first case, it seems the StreetNumber does not appear in the Full_Address, but it is available in its own field, so it could have been pulled from there:
image

In the second case the Full_Address is obviously invalid, yet there is a StreetNumber and a PostOffice that should have made it into “cleaned_addresses.osm”:
image

The third case is like the first, the Full_Address does not contain the StreetNumber, but all of the necessary information is present elsewhere:
image

In the fourth case there doesn’t seem to be anything odd about the source record, and yet in “cleaned_addresses.osm” there is only addr:state and addr:postcode:
image
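
The first and third cases suggest a simple fallback: prefer the dedicated StreetNumber field and only look at Full_Address when that field is empty. Below is a minimal sketch of that idea, assuming each source record is a dict keyed by the field names seen in the source data. The function name is hypothetical and this is not the conversion script actually used for the import.

```python
from typing import Optional

def housenumber_for(record: dict) -> Optional[str]:
    """Pick a house number, preferring the StreetNumber field."""
    number = (record.get("StreetNumber") or "").strip()
    if number:
        return number
    # Fall back to the first token of Full_Address, but only if it
    # contains a digit (otherwise it is probably part of a street name).
    full = (record.get("Full_Address") or "").strip()
    first = full.split(" ", 1)[0] if full else ""
    return first if any(ch.isdigit() for ch in first) else None
```

Records for which this returns `None` would still need a manual look.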

I think this is unlikely. If we look at a more typical address point and compare the StreetNumber to the Full_Address, the StreetNumber is always the grid identifier for the street, followed by the house number.

As I mentioned in another reply, there are two address points in the source data at the same location (as near as can be determined visually) here. I think we agree that, barring some insight from local authorities to the contrary, both should be represented in the final data to be imported - although that is an imperfect solution. Here is the source data for one of them (I have outlined the relevant parts in red):
image
My contention is that this should be tagged in OSM as:

addr:housenumber=W179N11766
addr:street=Medinah Court
(plus other relevant tags)

Here is the source data for the second one:
image
My contention is that this should be tagged in OSM as:

addr:housenumber=N117W17890
addr:street=Augusta Court
(plus other relevant tags)

This approach is similar to how the address to the northwest was handled by the importers. Here is the original source data for this address:
image
Here is how this address was tagged in the proposed import data:
image

In fact there are 74,657 addr:housenumber tags that have a similar format (JOSM query "addr:housenumber"~"^[NSEW][0-9]+[NSEW][0-9]+.*$") in the proposed import data out of a total of 163,290 addresses.

So my suggestion as to how the first two addresses should be tagged is consistent with the way the rest of the data in the proposed import is being handled.
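
The same kind of check can be run outside JOSM. Here is a small Python sketch, assuming the query is meant to match a compass letter plus digits, twice, at the start of the value. The sample list and the function name are made up for illustration.

```python
import re

# Grid-style house number: compass letter + digits, twice, anchored at the start.
GRID_HOUSENUMBER = re.compile(r"^[NSEW][0-9]+[NSEW][0-9]+.*$")

def is_grid_housenumber(value: str) -> bool:
    """True if the value looks like W179N11766 or N30W27391."""
    return GRID_HOUSENUMBER.match(value) is not None

samples = ["W179N11766", "N117W17890", "N30W27391", "1234", "W179 N11766"]
grid_style = [s for s in samples if is_grid_housenumber(s)]
```

Note that a value with a space between the two coordinates does not match this pattern, so it would have to be counted separately.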

Sorry, I was mistaken in thinking the import was generally mapping the Site_Number field to the addr:housenumber key rather than including the whole StreetNumber. This makes a lot more sense now. The USPS ZIP code lookup tool recognizes “W179N11766 MEDINAH CT” (though I can’t tell from aerial and street-level imagery whether these residences even get mail delivery). In terms of formatting, is it more correct to smoosh the two coordinates together, as the USPS does, or separate them by a space or dash, as seen in this article?

Both Medinah Court and Augusta Court are in a condominium development, which could explain the overlapping address points. (Ignore the links in this press release; the original development’s website has been repurposed for a similarly named development elsewhere.) In my neck of the woods in California, these are called “air parcels” and some GIS departments keep them in a layer separate from address points per se. I guess the ideal for OSM would be to scoot each address point toward the corresponding entrance, disregarding actual property rights, but obviously that isn’t something a bulk import could be expected to do in a first pass.

I don’t have a strong opinion (other than to be consistent within Wisconsin), but given the source data, the USPS data, and the way the proposer of this import has formatted the data, it seems like “smoosh” is the way to proceed.
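
If some values do turn up with a space or dash between the two coordinates, they could be normalized to the smooshed form with something like the sketch below. It assumes that only values which look like grid-style house numbers after cleanup should be changed; the function name is hypothetical.

```python
import re

GRID = re.compile(r"^[NSEW][0-9]+[NSEW][0-9]+$")

def smoosh_housenumber(value: str) -> str:
    """Join grid coordinates written with spaces or hyphens, USPS style."""
    candidate = re.sub(r"[\s-]+", "", value.strip()).upper()
    # Leave anything that is not a grid-style number untouched (e.g. "12-14").
    return candidate if GRID.match(candidate) else value
```

For example, `smoosh_housenumber("W179 N11766")` gives `"W179N11766"`, while an ordinary range such as `"12-14"` is returned unchanged.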

It also recognizes “N117W17890 AUGUSTA CT”, so best to include them both, and as you suggest, when possible, move them apart and toward what we assume to be their applicable entrance.

In this case would it be not addr:street=Green Dragonfly Island but rather addr:place=Green Dragonfly Island?

From what I’ve seen written here, it’s usually just smooshed together.

  1. I am not sure of the consequences to the data consumer of that approach. Will most consumers of address data in the US expect only addr:housenumber, addr:street, addr:city, addr:postcode, and addr:state? Maybe it is ok, or even preferred, idk.

  2. In “cleaned_addresses.osm” you had this as:

addr:city=Pewaukee
addr:housenumber=N30W27391
addr:postcode=53072
addr:state=WI
addr:street=Green Dragonfly Isl
(addr:street not addr:place)

  3. Either way, while the QC tools might flag this, it will be easy for someone doing the import to dismiss the flag, as there is no street, but that false positive masks the real issue of “Isl” not being expanded to “Island”.

  4. If you do want to use addr:place, you should search for addr:street tags that end in “Isl”, “Island”, and “Plaza” (and perhaps others).

  5. Incidentally, the USPS doesn’t seem to recognize the address in question; perhaps it is just for E911 purposes (which is fine, just pointing that out).
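
Both checks in the list above are easy to script. The following is a rough sketch only: the suffix table and the set of place-like endings are assumptions based on the examples mentioned here, and both function names are hypothetical.

```python
# Assumed abbreviation table; extend it after checking local usage.
SUFFIX_EXPANSIONS = {"Isl": "Island"}

# Endings that may indicate a place rather than a street.
PLACE_LIKE_ENDINGS = {"Isl", "Island", "Plaza"}

def expand_street_suffix(street: str) -> str:
    """Expand a known abbreviated last word, e.g. 'Isl' to 'Island'."""
    head, _, last = street.rpartition(" ")
    if head and last in SUFFIX_EXPANSIONS:
        return f"{head} {SUFFIX_EXPANSIONS[last]}"
    return street

def needs_place_review(street: str) -> bool:
    """Flag addr:street values whose last word suggests addr:place."""
    words = street.split()
    return bool(words) and words[-1] in PLACE_LIKE_ENDINGS
```

With this, `expand_street_suffix("Green Dragonfly Isl")` returns `"Green Dragonfly Island"`, and `needs_place_review` flags that value for a human decision on addr:street versus addr:place.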

Anyway, from my point of view, with just a few important changes this import should be ready to go.

Unlikely. I think we should think of keys like addr:street and addr:city as mapping to different slots in a prototypical address, rather than thinking too hard about the semantics of the values. Some addressing systems have a slot that normally isn’t a street name, or that comes before or after a street name. addr:place might be more applicable and useful to addresses in Puerto Rico, which differ from mainland addresses, but I don’t have a good sense of how it’s being used there.

That is a good way to put it @Minh_Nguyen. A data consumer will probably expect addr:street rather than addr:place for an address in CONUS regardless of whether addr:street corresponds to an actual street.

I finally got around to fixing the issues which were pointed out. The “tracts” as well as cleaned_addresses.osm.gz have been updated. A few notes:

I opted for “US Highway” rather than “United States Highway” after seeing that the USPS lookup tool only accepts “US Highway”.

In this case it appears that OSM was in error (at least USPS seems to prefer it without “Court”). I left a note so someone can see what it says on the sign.

After investigation, it appears that both OSM and the dataset were wrong; the sign says “Campbell Trace” on Bing Streetside.

I saw a lot of these in my Milwaukee import, where a building on a corner would have two addresses on two different streets depending on where the entrance to each unit was located.

I investigated this as well. It turns out a leading space messed up the parsing. In any case, I added these 4 back manually.

I also uploaded the file with address points which were excluded based on the comments field (comments like “not sure if valid” etc.):

Thanks for looking into these.

So are you keeping both? If not, how are you deciding which one to keep? If you are keeping both, will you displace them a few meters in the direction of the street to which the address refers?

Good detective work! Perhaps also open a ticket with the parser software. A leading (or trailing) space shouldn’t break things.
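
Until the parser handles this itself, one defensive option is to normalize whitespace before handing a field to it. A tiny sketch (the function name is hypothetical, and this says nothing about how the parser software works internally):

```python
def clean_field(value: str) -> str:
    """Trim leading/trailing whitespace and collapse internal runs of it."""
    return " ".join(value.split())
```

For example, `clean_field(" N30W27391  GREEN DRAGONFLY ISL ")` yields `"N30W27391 GREEN DRAGONFLY ISL"`.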

Sorry for not clarifying. I kept both, offsetting each slightly toward the road it is associated with so that they are easier to work with.
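
For anyone who wants to script that kind of offset, here is one possible sketch. It assumes a spherical earth and a local flat approximation, which should be fine for moves of a few meters; the function name, the default distance, and the idea of passing a target point on the road are all assumptions, not a description of how it was done for this import.

```python
import math

EARTH_RADIUS_M = 6_371_000.0

def offset_toward(lat, lon, target_lat, target_lon, meters=2.0):
    """Move (lat, lon) up to `meters` toward the target point."""
    lat_rad = math.radians(lat)
    # Difference to the target in local north/east meters.
    north = math.radians(target_lat - lat) * EARTH_RADIUS_M
    east = math.radians(target_lon - lon) * EARTH_RADIUS_M * math.cos(lat_rad)
    dist = math.hypot(north, east)
    if dist == 0:
        return lat, lon
    step = min(meters, dist)  # never overshoot the target
    d_north = north / dist * step
    d_east = east / dist * step
    new_lat = lat + math.degrees(d_north / EARTH_RADIUS_M)
    new_lon = lon + math.degrees(d_east / (EARTH_RADIUS_M * math.cos(lat_rad)))
    return new_lat, new_lon
```

Calling it once per address point, with the nearest point on that address's street as the target, moves two coincident points in different directions.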

Agreed. There’s no need to aggressively expand abbreviations that no one would need to expand in a practical use case. (Another example is middle initials: we can keep naming the bridge “John F. Kennedy”.) I suspect TIGER expands it to “United States” because its base name field is case-insensitive; US would create ambiguity between “U.S.” and “Us”. The TIGER import papered over this nuance by title-casing the name fields.

The USPS aggressively eschews punctuation for similar databasey reasons. I personally prefer to add periods to abbreviations wherever they’re missing. OSM geocoding results can be embedded in prose, e.g., “The trace then heads down US Highway 14,” and unlike TIGER or the USPS, we represent street names as unstructured, freeform text where individual words become more ambiguous. Data consumers can more reliably strip out punctuation than add it in themselves. Whether to include periods in “U.S.” is a matter of personal taste, and the popular style guides contradict each other on this point.

Obviously this is the least of your concerns, but don’t feel pressured to strip out punctuation just because the USPS doesn’t use it.

I think I’ll keep it without the periods, partly out of personal preference and partly because current usage of addr:street=* seems to favor this formatting: addr:street | Keys | OpenStreetMap Taginfo. But it may make sense to open a different topic either here or on the US Slack to see if there is any value in normalizing this to one formatting or another.

I think that’s fair. This is an address tag, anyhow, not a name tag on a roadway. A geocoder ought to be able to match addresses to streets flexibly enough to account for this distinction. It’s the “United States” spelling that’s the real outlier.

I finally got around to getting started on the import. Feel free to contribute by taking one or more tracts (see the “Workflow” section of the wiki page for instructions and put your name next to the tract you are working on).

Did a section this afternoon. Looking forward to doing more in the near future!