Rerun TIGER Name Expansion?

I recently stumbled upon a handful of road names still beginning with abbreviations like:

  • N. = North
  • E. = East
  • W. = West
  • S. = South

or ending with things like:

  • Ave. = Avenue
  • Ln. = Lane

Like these names:


It seems that in 2012, a bot was run to mass clean up some of this.

See the:

Any possibility something like this could be spun up again and run across the entire US to normalize these road names?


Note on # of usages: I did a simple search in Taginfo, and saw a few dozen left over in Pennsylvania / Florida.

Unsure how many are still there across the entire US, maybe a few hundred or a few thousand. (Most names I quickly saw only seemed to have about 1–4 hits.)

I don’t know about the bot, but I’ve been using some JOSM validator rules to do fairly large bulk cleanups of things like this. I have a repo with some rules I’ve written that I think are pretty robust at this point. Here’s the one for directional prefix expansion. There are a few others in that directory that are also worth including if you want to do some “bulk” cleanup.

Combining those with an overpass query like below should get you pretty far pretty quickly.

[out:json][timeout:25];
(  
  way[highway][name]({{bbox}});
);
// print results
out body;
>;
out skel qt;

For context, back when we did this, we were a much smaller community. We’re still spread pretty thin, but nowadays we have several validators that could be configured to flag likely abbreviations. Scripts like @watmildon’s can help streamline the process of fixing them, though I just do it manually when I see it. It’s usually just a trickle from newer mappers, not a big deal. Most data consumers reabbreviate the words, mitigating the impact of any temporary brevity.

One reason we’re less likely to rerun the script universally is that we’ve had to fix many false positives like North E Street, E-Business Way, and St. Clair Avenue and would have to fix them again.
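
Those false positives are easy to reproduce. Here is a minimal Python sketch, assuming a deliberately naive pair of rules (hypothetical, and not the logic the 2012 bot actually used), showing how a blanket "E → East" and "St → Street" substitution mangles exactly the names listed above:

```python
import re

# Hypothetical, deliberately naive rules. NOT the 2012 bot's actual logic.
NAIVE_RULES = [
    (re.compile(r"\bE\b\.?"), "East"),     # a standalone "E" or "E."
    (re.compile(r"\bSt\b\.?"), "Street"),  # a standalone "St" or "St."
]


def naive_expand(name: str) -> str:
    """Apply every rule everywhere in the name, with no context checks."""
    for pattern, replacement in NAIVE_RULES:
        name = pattern.sub(replacement, name)
    return name


for street in ["E. Main St.", "North E Street", "E-Business Way", "St. Clair Avenue"]:
    print(f"{street!r} -> {naive_expand(street)!r}")
```

Only the first name comes out right; the other three are changed into something wrong, which is why a human scanning the list before applying fixes matters.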

I am also happy to fix things for folks… just let me know rough bounds and the kinds of issues and I’m happy to chew through it. I have been cursed with the knowledge of how to do this relatively quickly at high accuracy so perhaps that is my burden.

Looks like a lot could be newer users also.

And then, clicking around, looks like they accidentally introduced entire clusters of “shortened names” around the area too.

For example:

but nearby, there’s also:

  • Old Somers Cv.
  • Pine Timber Pt.
  • Cedar Shake Ct.
  • Grotto Pl.
    • A tiny bit further E.

These were quickly found by mostly just going to:

typing in the shortened ones, sorting alphabetically, then seeing what was there and clicking on the Overpass links to take a closer look.

I suspected someone out there had a much more robust prefix/suffix search, so would be able to mass find/expand these things much faster than a manual one-by-one. :slight_smile:


Side Note: Looks like some of these may have accidentally leaked into addr:street too, probably from users using iD’s auto-suggestions of nearby streets. But that can be a separate fix. :slight_smile:

Good news! My validators also patch up addr:street!

At the top of each one is an overpass query that can help identify things it should be fixing. Here’s the really beastly one targeted at PA… it takes a while to run but coughs up a ton of things around the state that could be patched up.

The original TIGER name expansion made use of tags such as tiger:name_direction_prefix . All of those present have already been applied, and there should be no opportunity to confidently run a mass expansion.

Many of these are easy to spot and fix with manual inspection, but many others would end up with an incorrect directional if someone expanded them while working through a long series of manual fixes one by one.

@watmildon Is this based on your fantastic:

I was inspired by that post, thinking there was some sort of mass “Expansion Validator” that could just be run.

I know that there was a lot of work with (Python) tools trying to correct/expand/normalize Address data before imports…

So I thought something along those lines could be run on some of these streets with unexpanded prefixes/suffixes! :stuck_out_tongue:

I’ll definitely have to poke around that massive overpass query you put together. And boy oh boy, that mess looks even bigger than I expected…


Note: I did spot one minor issue while skimming the map.

The rule:

  • Cres.$

accidentally matches correct road names with fully-spelled out:

  • Crest

It also botches “Mall” for the same reason and I’m not sure why the extra dot is in there. Hmm, I’ll have to poke around and get that fixed. This query is generated by the C# in this repo.

It’s fortunately super quick to scan a long list of names, identify issues like this, and only have the fixup run on the ones that need expansion.

I should make sure we’re all on the same page about my validators and fixups: The goal of the validators is to save you lots of typing. I haven’t made any real effort to beat out common false positives or do anything really clever… which would be hard given the tools you have using this validator syntax.

But if you’re willing to scroll big lists of things, oh my does this save you a lot of typing.

The rule:

  • Cres.$

accidentally matches correct road names with fully-spelled out:

  • Crest

@watmildon I’ll do you one better, even: I know exactly what the issue is, and it’s an easy fix: all of those .s in the rules need to be \.. Oldest regex slip-up in the book :laughing: and easy to miss, because . matches any single character… including a literal . character.
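
For anyone who wants to see the slip in action, here is a small Python sketch. The street names are made up for illustration, and Python's `re` module stands in for the regex engine the validator uses (an assumption; the meaning of an unescaped `.` is the same in both):

```python
import re

# Made-up sample names for illustration only.
names = ["Maple Cres.", "Hill Crest", "Eagle Crest"]

unescaped = re.compile(r"Cres.$")  # "." matches ANY single character
escaped = re.compile(r"Cres\.$")   # "\." matches only a literal period

loose_hits = [n for n in names if unescaped.search(n)]
strict_hits = [n for n in names if escaped.search(n)]

print("unescaped:", loose_hits)
print("escaped:  ", strict_hits)
```

The unescaped pattern flags all three names because the "t" in "Crest" satisfies the `.`, while the escaped pattern flags only the name that really ends in "Cres.".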

Oh goodness. Thank you!! I stared at this for a while and didn’t even see it. Lol.

The validator generator has been updated here and the new validator file has been put on github here.

I noticed that one too.

According to USPS’s list:

The short form of “Mall” is “MALL”… lol. So you can probably get rid of that check (or only search for accidental ALL CAPS versions still, but ignore the Title Cased version).

Yep, exactly. And the computer already knows things like:

  • BLVD = Boulevard

If it was just one button push to correctly expand that, that would be easier.

I suspect more people would accidentally make typos trying to manually type that full word out!

(That’s usually one where I just rely on the red squiggly and Right-Click and say “Yep! That’s what I meant!” :stuck_out_tongue:)

Yep, exactly.

Even just seeing it alphabetically sorted in taginfo made things pop right out. Tying that to a map you can scan / click on, even better! :slight_smile:


In JOSM, Coloured Streets lets you visually see:

  • Roads with no name
    • Get a glaring red highlight + “name?”
  • Houses with a missing # / Street Name.
    • addr:housenumber / addr:street
    • Get a glaring red highlight + “number?” / “street?”

Each road / house also gets its own unique matching color. So if you spot:

  • “a green house” on “a pink road”
  • “a row of all green houses” + “a purple one”
  • “a blue” chunk of road on an otherwise “red street”…
    • Accidental split/name issue!

it makes it very easy to visually “scan the map” for any oddities.

(That 3rd type was key for spotting weird/different road names!)


Combine that with MapWithAI’s JOSM validator… where if you have houses with different/misspelled addr:street on them:

  • “Addresses are not nearby a matching road”

and it alerts you that something funny is going on.

That’s also how I was able to more methodically catch unexpanded oddities like:

  • E. 2nd Street

or hundreds of accidental mistakes like:

  • Example Rd → Example Road
  • Example Road ↔ Example Drive

But it required address data first, THEN as an accidental side effect I’d get the weird road names pointed out! :stuck_out_tongue:


Note: I actually first noticed this because I spotted an old split road based on Coloured Street’s colors.

It was one of those numbered roads in a grid-based city that isn’t just one solid vertical line straight through, it’s multiple distinct pieces with large gaps in the middle where houses are…

Looks like the original TIGER data was something like:

  • E. 18th Street

The bot swooped in back in 2012, but only expanded 1/3 of the pieces:

  • East 18th Street

so there were 2 orphaned "E."s still left over… untouched in all these years!

(So where there was 1… I suspected this could’ve been a larger issue.)
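
A rough way to hunt for more of these half-expanded streets, sketched in Python. It assumes you already have a plain list of street names (for example, exported from an Overpass result); the function names are hypothetical, and it compares names only, so two unrelated streets in different towns can produce a match that still needs a look on the map:

```python
from collections import defaultdict

DIRECTIONS = {"N.": "North", "E.": "East", "W.": "West", "S.": "South"}


def expand_prefix(name: str) -> str:
    """Expand a leading 'N.'/'E.'/'W.'/'S.' token; leave everything else alone."""
    first, _, rest = name.partition(" ")
    if first in DIRECTIONS and rest:
        return f"{DIRECTIONS[first]} {rest}"
    return name


def mixed_spellings(names):
    """Return {expanded name: [abbreviated spellings]} for names seen both ways."""
    seen = set(names)
    groups = defaultdict(set)
    for name in seen:
        expanded = expand_prefix(name)
        if expanded != name and expanded in seen:
            groups[expanded].add(name)
    return {expanded: sorted(short) for expanded, short in groups.items()}


sample = ["East 18th Street", "E. 18th Street", "E. 18th Street", "N. Michigan"]
print(mixed_spellings(sample))
```

Requiring the period keeps names like "E Street" out of the results, at the cost of missing period-less abbreviations.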

Is there a way to do a MapRoulette challenge by state? Spun that up for South Dakota but had a hard time flipping back and forth between tabs (am very tired and have the attention span of a squirrel that fell into a vat of coffee)

MapRoulette supports using an Overpass query to create a challenge. @mvexel has a video about it here. LMK if you need help getting it set up or want something reviewed.

My usual workflow is to load the data into JOSM and scroll around running the validator on a chunk at a time or to use the ToDo plugin and go item by item.

Happy to help @SD_Mapman
I would love to see this in MapRoulette.

So after a couple false starts, I think I got it uploaded properly:

South Dakota Name Expansion - Challenges - Browse - MapRoulette

To make these for other states, I think you would just need to change the area ID in “area(id:3600161652)->.a”.
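
In case it helps with swapping states: Overpass derives the area ID of a relation by adding 3600000000 to the OSM relation ID, so the ID above corresponds to relation 161652. A tiny Python sketch of that arithmetic (the helper names are made up for illustration):

```python
# Overpass area IDs for relations are the relation ID plus this offset.
OVERPASS_RELATION_AREA_OFFSET = 3_600_000_000


def relation_to_area_id(relation_id: int) -> int:
    """Convert an OSM relation ID into the matching Overpass area ID."""
    return OVERPASS_RELATION_AREA_OFFSET + relation_id


def area_statement(relation_id: int, set_name: str = "a") -> str:
    """Build the 'area(id:...)->.a' fragment used in the challenge query."""
    return f"area(id:{relation_to_area_id(relation_id)})->.{set_name}"


print(area_statement(161652))
```

Look up the boundary relation for the state you want on osm.org and pass its ID in place of 161652.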

The TIGER name expansion was comprehensive. Any street that had an unmodified TIGER name with prefixes or suffixes un-expanded had them expanded.

This was expanded already. I expect if you check in TIGER the street’s name is “N. Michigan”.

The name in TIGER for this street was D-172

Okay, I took your query and poked around some more in Overpass Turbo.

The regular expressions Overpass uses are a little weird, but if you want to match “an actual period”, you can’t just use the normal backslashed:

  • \.

but you have to use:

  • \\.
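
If you generate these filters from a script, the doubled backslash is easy to get wrong a second time because of the script language's own escaping. A minimal Python sketch (the helper names are hypothetical) that builds the same kind of filter with exactly two backslashes in the final query text:

```python
# Exactly three characters: backslash, backslash, period.
# A raw string avoids Python's own escaping on top of Overpass's.
OVERPASS_PERIOD = r"\\."


def prefix_filter(abbreviations, key="name"):
    """Filter for values that START with an abbreviation, period, and space."""
    alternatives = "|".join(f"^{a}{OVERPASS_PERIOD} " for a in abbreviations)
    return f'["{key}"~"{alternatives}"]'


def suffix_filter(abbreviations, key="name"):
    """Filter for values that END with a space, an abbreviation, and a period."""
    alternatives = "|".join(f" {a}{OVERPASS_PERIOD}$" for a in abbreviations)
    return f'["{key}"~"{alternatives}"]'


print(prefix_filter(["N", "E", "W", "S"], key="addr:street"))
print(suffix_filter(["N", "E", "W", "S"], key="addr:street"))
```

The printed text can be pasted straight after `nwr` in a query; printing it first is a quick way to confirm the backslash count before running anything.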

My “N.E.W.S.” Searches

1. This Overpass Turbo link finds all streets with:

  • N.
  • E.
  • W.
  • S.

at the very beginning or end of the name.

2. This Overpass Turbo link finds all addresses (addr:street) with:

  • N.
  • E.
  • W.
  • S.

at the very beginning or end.

Usage Note: All you have to change is that Texas line to your area/state. :slight_smile:


Overpass Turbo Regex Note: Depending on what you want, you could also adjust:

  • \\.

by adding a QUESTION MARK after it:

  • \\.?

in order to find “0 OR 1 PERIODs”, which would help catch mistakes like:

  • S Red Leaf Road
  • S Coulter St

but would come at the expense of a lot more false positives. For example, it looks like a ton of street names in Texas might actually be called:

  • Street E
  • E Street

A lot more to look through, but a lot more weird mistakes to fix too! :slight_smile:

Personally, I would do a pass squishing all the PERIOD versions first, then go looking through all the NON-PERIOD streets/addresses later. :slight_smile:
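
To make the trade-off concrete, here is a small Python sketch comparing the two variants on the example names from this post. It uses a `[NEWS]` character class as shorthand for the four alternatives, and a single backslash, since the doubled one is only needed inside Overpass query text; both are my simplifications, not what the linked queries literally contain:

```python
import re

# Period required vs. period optional, at the start or end of the name.
STRICT = re.compile(r"^[NEWS]\. | [NEWS]\.$")
LOOSE = re.compile(r"^[NEWS]\.? | [NEWS]\.?$")

names = [
    "E. 2nd Street",
    "S Red Leaf Road",
    "S Coulter St",
    "E Street",          # a real name, not an abbreviation
    "Street E",          # likewise
    "East 18th Street",  # already expanded
]

strict_hits = [n for n in names if STRICT.search(n)]
loose_hits = [n for n in names if LOOSE.search(n)]

print("strict:", strict_hits)
print("loose: ", loose_hits)
```

The strict pattern finds only the name with a period, while the loose one also picks up the two period-less mistakes along with the two lettered streets that should be left alone.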


Watmildon’s Searches (Adjusted)

3. This is @watmildon’s massive list of PERIOD prefixes/suffixes, searching all the streets/addresses with the fixed \\. trick I mentioned above:

4. This is the original massive list, condensed with the “ZERO OR ONE PERIOD” \\.? trick:

I assume this one should run a little bit faster, since it eliminates 2 of the nw/nwr lines. :slight_smile:


Raw Searches

Here are the 2 basic searches:

The N.E.W.S. Street Names

[out:json][timeout:300];
{{geocodeArea:"Texas, United States of America"}}->.a;
(
  wr[highway][highway!=platform][highway!="bus_stop"][amenity!=shelter][!bus][name~" N\\.$| E\\.$| W\\.$| S\\.$"](area.a);
  wr[highway][highway!=platform][highway!="bus_stop"][amenity!=shelter][!bus][name~"^N\\. |^E\\. |^W\\. |^S\\. "](area.a);
);
out body;
>;
out skel qt;

The N.E.W.S. Addresses

[out:json][timeout:300];
{{geocodeArea:"Texas, United States of America"}}->.a;
(
  nwr["addr:street"~" N\\.$| E\\.$| W\\.$| S\\.$"](area.a);
  nwr["addr:street"~"^N\\. |^E\\. |^W\\. |^S\\. "](area.a);
);
out body;
>;
out skel qt;

Adjust the pieces as needed. :slight_smile:

I don’t know about that… at least looking at the TIGER 2020->2023 layer in iD… many of the ones I’m coming across have spelled-out “North” instead of the current “N.”.

(But I only quickly skimmed a few dozen. I didn’t extensively go looking behind the History of every single street.)

Were they originally somehow based on the:

  • tiger:name_direction_prefix

instead? And if a street was missing that TIGER prefix tag during the 2012 expansion, the bot may have accidentally missed it?

Anyway, no big deal now that we have the working Overpass query. These should be much easier to find/fix now. :stuck_out_tongue: