TIGER data quality

You should be, and you’d know because you’ve looked into this pretty deeply. My heuristic for separating wheat from chaff is:

  1. highway=residential
  2. No name=*
  3. existence of tiger:cfcc and / or tiger:reviewed=no
  4. Last touched date long ago (I usually query for <2012)
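
If you pull ways out of Overpass with `out meta` as JSON, the four checks above can be sketched as a small Python filter. This is only an illustration: the function name `looks_like_untouched_tiger` and the exact 2012-01-01 cutoff are my assumptions, not anything official.

```python
from datetime import datetime, timezone

# Assumed cutoff, taken from the "<2012" rule of thumb above.
CUTOFF = datetime(2012, 1, 1, tzinfo=timezone.utc)


def looks_like_untouched_tiger(element, cutoff=CUTOFF):
    """Return True if an Overpass JSON way passes all four heuristics."""
    tags = element.get("tags", {})
    # 1. highway=residential
    if tags.get("highway") != "residential":
        return False
    # 2. no name=* tag
    if "name" in tags:
        return False
    # 3. tiger:cfcc present and/or tiger:reviewed=no
    if "tiger:cfcc" not in tags and tags.get("tiger:reviewed") != "no":
        return False
    # 4. last touched before the cutoff ("out meta" gives e.g. 2009-10-14T03:21:55Z)
    last_touched = datetime.strptime(
        element["timestamp"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return last_touched < cutoff
```

It only looks at the latest version of each way, so a way that was touched recently by a bot or a driveway mapper will drop out even if nobody actually reviewed it.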

There are a bunch of “advanced” Overpass queries with more elaborate criteria; @Minh_Nguyen would know where to find these.


My last experience with this was in 2009 when I asked for all of Greene County, Ohio, to be deleted – because it had been imported twice, every road duplicating another copy of the road without any connections between them. I had already made many edits in the area and had haphazardly deleted many roads from one or the other import, but it only took me a couple months to recover. I don’t think we could’ve done something that clean in 2011. Granted, Greene County is much more developed than some of the counties in Utah where you map.

Here are some Overpass queries for unedited TIGER ways. As @Richard points out, there are many false negatives because of driveway editing and such. The public Overpass instance can’t handle querying for TIGER unedited ways beyond a small area.

A couple years ago, I developed some SPARQL queries to determine the most deserted TIGER desert counties, and even refined it down to individual ZIP codes. Unfortunately, Sophox is no longer reliable for these queries, but I made this snapshot in 2020 that might still be useful.


This is an Overpass query I’ve used in the past that trades off reasonable speed for reasonable accuracy:

rel(161993);map_to_area->.a; // state of utah
way // consider ways that...
  [highway=residential] // are residential 
  [!name] // do not have name tag
  ["tiger:cfcc"] // have tiger:cfcc tag which was created as part of the import
  (if:timestamp() < "2013-01-01T00:00:00Z") // have timestamp before 2013
  (area.a); // are in the defined area
out meta geom; // output geometry and metadata

This yields 2176 ways. If you leave out the highway=residential criterion it’s up to 8000+.
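
To compare the two counts without hand-editing the query each time, a sketch like the following could generate both variants. The helper name `build_tiger_query` and its parameters are hypothetical; the generated text is just the query above with the comments stripped.

```python
def build_tiger_query(relation_id=161993,
                      cutoff="2013-01-01T00:00:00Z",
                      residential_only=True):
    """Build the Overpass QL text for old, unnamed ways carrying tiger:cfcc."""
    lines = [
        f"rel({relation_id});map_to_area->.a;",  # 161993 = state of Utah
        "way",
    ]
    if residential_only:
        lines.append("  [highway=residential]")
    lines += [
        "  [!name]",
        '  ["tiger:cfcc"]',
        f'  (if:timestamp() < "{cutoff}")',
        "  (area.a);",
        "out meta geom;",
    ]
    return "\n".join(lines)
```

Calling `build_tiger_query(residential_only=False)` gives the broader variant; swap in another relation ID for a different state or county.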

There’s possibly an argument that some of these highway=residential roads should be automatically retagged as highway=road - which itself is effectively a fixme tag.

NM is probably the state with the most un-road-like A41s IME. Some of the geometry in WV is pretty shocking though…


I can see that it might make sense to remove untouched TIGER in ‘wilderness’ areas, but even then by someone in the region who has familiarity.

But locally it would make no sense to take any sort of automated or mass action. I’ve been going over counties and setting the surface and geometry where not hidden under trees, using the US Tasking manager. Even that requires knowledge of regional road construction, soil, and any possible better local authoritative sources of road names and geometry. I’ve also found that the commercial ‘driveway mappers’ have often taken the time to improve roads nearby where the original TIGER was wonky.


Welcome to the new forums, @MikeN !

As a compromise, I think it is feasible to do MapRoulette challenges for smaller areas. I discovered an interesting dividing line at the Navajo Nation boundary where inside NN there are many old, untouched TIGER residential roads, and outside NN (at least on the Utah side) almost none. I don’t know if this is just selective mapper activity or inconsistencies in the TIGER data coverage, or something else. But I made a MapRoulette challenge to encourage people to help with building a better road network for the Navajo Nation: Martijn van Exel: "Mapping Inequality The map below shows SE Utah w…" - En OSM Town | Mapstodon for OpenStreetMap


This is a neat little snippet of OT code, thanks Martijn. I’ve entered something similar for my county (in California) linked in our (county-level) wiki, it produces a bit “richer / deeper” a set of data (both nodes and ways).

I really, really miss the wonderful, deprecated (summer of '19?) ITO World “TIGER Cleanup” (I think it was called) renderer. I used this to clean (and clean, and clean, and clean…) my county until I got to something like 75% or 80% “done” (I might give my efforts a solid B-?!) and then that particular renderer quit. So sad.

I’ve looked for other “prettified” helpers / renderers to aid in TIGER cleanup, as in many cases, automation is quite ad hoc, specific to a county, state, aboriginal_land (again, Martijn, thanks for the tip about Navajo Nation, I’ll go take a look). Alas, there aren’t any renderers that suit my fancy, so what little work I now do (in my county, really) to improve TIGER is from my OT query. Somehow, because it isn’t as pretty as that ITO World version, I clean up TIGER less than I used to. I think it was the color-scheme (red, orange, light-blue, dark-blue, I think) and rather clever reasons (including “3 year aging since last edited”) that made it truly useful. I know if we got a replication or close to it, I (for one) would slash away towards 90%, then towards 100% (again, in my county, where I concentrate my mapping efforts, especially for TIGER fixup).

I do recall one august volunteer in this project (I have a lot of offline email conversations with him) calling TIGER, in many cases, “not much better than an hallucination.”

Anyway, I’m dedicated to improving TIGER data, locally, more widely (statewide, and indeed, there is a lot to be said for state-by-state “divide and conquer,” as we’ve done a decent job of whacking rail data from TIGER down, though there’s still tens of thousands of rail miles to go, and these aren’t getting easier to quantify). I’d love to know that better tools are available. OT queries are good, but they’re wonky and largely used by the more geek-inclined (no offense to geeks, I actually proudly have the word on a license plate of a car of mine).

Yes, it might be the 2040s before we clean it all up. My sleeves are rolled up, and have been for a while.


Richard, I don’t know if you’ve been to West Virginia or know much about it, but it’s an outlier among states in some interesting ways. For one example, it seems to be deliberately “radio signal quiet.” I believe part of that was or is for a radio telescope near there that needs interference kept down to improve its signal-to-noise ratio. A number of things seem to “fall off the map” when you enter West Virginia; it’s hard to explain. I’m sure there are reasons for such things, but they seem beyond me. Maybe there’s an article written about why.

cycle.travel can be used in some areas. It shows unfixed TIGER residentials in rural areas as a faint grey dashed line.

But it won’t be 100% reliable for this purpose - in many areas it has additional heuristics to guess what might be a usable road, and it updates roughly once a month so it’s not ideal for real-time fixing.

While I’ve perused cycle.travel on my little county before (I did develop and propose to the transportation commission the “CycleNet” bicycle local bike route numbering protocol), thanks to your “unfixed TIGER residential = faint grey dashed lines,” it visually now makes much more sense! I’m not sure how you determine / calculate “rural areas,” but my eyeballs are quickly getting retrained as they parse your semiotics. Thanks!


I think this is a reference to the National Radio Quiet Zone, which also extends into a good chunk of Virginia. It is indeed an area where you’re guaranteed to lose cell reception, but the data quality issues in West Virginia aren’t limited to this zone by any means. I’ve cleaned up many roads that corresponded to old mining roads or roads predating mountaintop removal. Even the many roads that legitimately exist have poor geometry because most roads follow winding rivers in dense woodlands within narrow hollows – tough for both GPS reception and aerial survey.

TIGER’s data quality issues are generally endemic to specific counties, but for West Virginia they seem to be pretty consistent statewide. I wonder if this is because West Virginia maintains the entire public road network outside incorporated cities, in contrast to most states that rely more heavily on local highway departments, which are typically responsible for sending road network data to TIGER.


Yes, Minh: the Green Bank (West Virginia) radio telescope et al. Thanks for your link, thanks for your mapping of the “quiet zone” polygons.

Making an explicit reply here because Zeke’s excellent suggestion got a like from me and resonates with my experience with that (no longer functional) ITO World render I mentioned. Goin’ through a bit of “already” ground here (for over five years of history) at TIGER Edited Map - OpenStreetMap Wiki. The topmost render is what I’m talking about. That “overview” or “heat map” (something about those red-and-orange turning to sky-blue, then darker-blue as we’re done) clinched it as “visually parsable semiotics which make a lot of sense to my mind,” driving forward TIGER cleanup.

It worked, is what I’m saying. Replication (or something close) WOULD rekindle that fire, fairly easily, I speak for myself.


I can speak generally about some local governments I consult for. The Census Bureau maintains relationships with GIS managers in local administrative jurisdictions (call them Counties in Maryland). They routinely exchange data, not just centerline but also boundary annexations, things like that.

However, not all jurisdictions participate. Some smaller ones do not have the resources to pass quality data back and forth with the Census. Some really small ones may just have a single GIS person and all of their CL could be in an incompatible format. Lots of these tiny jurisdictions around.


This Overpass query finds the ways whose latest version still comes from the original TIGER import account. It works well in areas that have not seen much improvement.

// gather all Dave Hansen TIGER import edits in an area
[out:json][timeout:25];
// gather results
(
  way(user:"DaveHansenTiger")({{bbox}});
);
out body;
>;
out skel qt;
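
One caveat if you want to run this outside overpass turbo: `{{bbox}}` is an overpass turbo shortcut for the current map view, so a raw Overpass API request needs explicit `south,west,north,east` coordinates instead. A minimal sketch of that substitution, where the helper name `query_for_bbox` is made up for illustration:

```python
QUERY_TEMPLATE = """[out:json][timeout:25];
(
  way(user:"DaveHansenTiger")({bbox});
);
out body;
>;
out skel qt;"""


def query_for_bbox(south, west, north, east):
    """Fill in an explicit bounding box in Overpass order: south, west, north, east."""
    if south >= north:
        raise ValueError("south must be smaller than north")
    bbox = f"{south},{west},{north},{east}"
    return QUERY_TEMPLATE.format(bbox=bbox)
```

Keep the box small; as noted earlier in the thread, the public instance struggles with this kind of query over larger areas.
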

Seems as if MS has released new ‘road detections’: GitHub - microsoft/RoadDetections: Road detections from Microsoft Maps aerial imagery. This would be completely unremarkable, except that they found ~800’000km of ‘missing’ roads in the US*. Now while there is likely to be lots of ‘not actually a road’ in that data, it still would indicate that they found lots of segments that didn’t match up with the existing data. My hypothesis is that this might be due to TIGER-derived OSM data with geometry issues that didn’t allow matching, and that the MS data might allow pinpointing that.

Can’t test that right now as I’m on the road, but maybe it is worth looking into.

* conventional wisdom is that OSM already has more roads than actually exist thanks to TIGER.
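
A very rough way to test that hypothesis for QA purposes (not for importing) might look like the sketch below. Everything here is an assumption on my part: the inputs are plain lists of `(lat, lon)` points that you have already extracted from the detections and from OSM, the 15 m tolerance is arbitrary, and comparing against nodes rather than segments is cruder than proper map matching.

```python
import math

EARTH_RADIUS_M = 6371000.0


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def unmatched_fraction(detected_points, osm_nodes, tolerance_m=15.0):
    """Share of detected points with no OSM road node within tolerance_m."""
    if not detected_points:
        return 0.0
    misses = sum(
        1 for p in detected_points
        if all(haversine_m(p, n) > tolerance_m for n in osm_nodes)
    )
    return misses / len(detected_points)
```

A high unmatched fraction right next to an untouched TIGER way would hint at bad geometry rather than a genuinely missing road, but someone still has to look at the imagery.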


Some of the chat on OSMUS’ Slack about this has been along the lines of “oh no, not again”, from people who’ve had to tidy up after building imports based on MS’ “building detections”. As with buildings, the risk here is that some people will think that “Microsoft thinks it is a road, therefore it must be”, even though a glance at the imagery with open eyes would show that it obviously isn’t.

This development instance can handle larger ones: overpass turbo


Just the usual PSA: importing ODbL licensed data has always been and continues to be a tremendously bad idea.

Using it for QA is naturally perfectly fine.


Really nice work (that OT is beautiful) and truly true things said here just now. I, for one, do appreciate these things. I’m clapping my hands, here.