HIFLD Open discontinued

HIFLD Open is being discontinued. How do we make sure as much as possible of this data is being saved?

They have 447 datasets.

I’ve made a (hidden) MapRoulette Challenge for (UT) banks; I’ll make it public if I get no objections.

MapRoulette already has a reputation problem because it is often abused for stealth imports. Please don’t make it worse. Your challenge has almost 4,000 “banks” at Null Island, and the random spot checks I’ve done have frequently shown me pictures like this

or pinpointed a “bank” vaguely in between 6 buildings none of which was identifiable as a bank from imagery.

If you want to go ahead with this, you absolutely must provide clear instructions to any participants of the task, along the lines of: “Under no circumstances should you trust this data and add anything to OSM when aerial imagery does not clearly and unambiguously support the data, or where OSM already has information about a bank POI at that location”. The challenge description also needs to tell participants what to do - currently the description is

This is HIFLD OPEN FDIC Insured Banks, filtered by SERVTYPE_DESC == “FULL SERVICE - BRICK AND MORTAR” or SERVTYPE_DESC == “FULL SERVICE - RETAIL”.

which doesn’t really say what people are supposed to do with it.
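For anyone who wants to reproduce or tighten that filter before reviewing the data, here is a minimal sketch in Python. Only the `SERVTYPE_DESC` field and its two kept values come from the description quoted above; the `NAME` field, the sample rows, and the "LIMITED SERVICE" value are made up for illustration, and I'm assuming the records have already been loaded as plain dictionaries (e.g. from a CSV export).

```python
# Service types kept by the challenge description quoted above.
KEPT_SERVICE_TYPES = {
    "FULL SERVICE - BRICK AND MORTAR",
    "FULL SERVICE - RETAIL",
}


def filter_full_service(records):
    """Return only the records whose SERVTYPE_DESC is one of the kept types."""
    return [
        r for r in records
        if r.get("SERVTYPE_DESC", "").strip().upper() in KEPT_SERVICE_TYPES
    ]


# Made-up rows; only SERVTYPE_DESC is a field named in this thread.
sample = [
    {"NAME": "Example Bank A", "SERVTYPE_DESC": "FULL SERVICE - BRICK AND MORTAR"},
    {"NAME": "Example Bank B", "SERVTYPE_DESC": "LIMITED SERVICE - DRIVE THRU"},
    {"NAME": "Example Bank C", "SERVTYPE_DESC": "FULL SERVICE - RETAIL"},
]

kept = filter_full_service(sample)
print([r["NAME"] for r in kept])
```

Note that this only narrows the records by service type; it says nothing about whether the coordinates are trustworthy.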

Personally I have doubts whether the data is worth having.

The 24 at Null Island have addresses attached.

The insured banks dataset is ostensibly based on the FDIC’s BankFind Suite. I’ve manually mapped many bank branches in OpenHistoricalMap based on BankFind’s historical bank branch acquisition data. I’ve rarely seen any problems in BankFind’s bank branch coordinates, but I can easily spot inexplicable errors in this HIFLD dataset, making me wonder where the coordinates are coming from.

For example, the first of 1,000 records on Null Island corresponds to this Capital One branch that I mapped many years ago. BankFind has an entry for it and correctly geolocates it to the building’s centroid (and plots it atop OSM with proper attribution, natch):

The FDIC offers bulk downloads for their data, but the coordinates are only available from their history API. Perhaps you could look into writing a scraper for AllThePlaces or request the full data from the FDIC.

The FDIC’s dataset only includes federally insured bank branches (whether state or federally chartered). The FFIEC National Information Center publishes a much more comprehensive dataset that includes credit unions, savings and loan associations, and privately insured banks. The NIC only provides addresses, not coordinates, so you have to do the geocoding yourself.

My understanding is that HIFLD aggregated datasets from other agencies with some postprocessing. We’ve looked into importing some of these datasets before but have gotten stymied by data quality issues even when the proposal otherwise dotted i’s and crossed t’s. Like any interagency open data aggregator, it could still be useful for discovering rawer, more obscure datasets that we can put more faith in.

They ran the addresses through an automatic geocoder. When the geocoder can’t find the address in question, it falls back to increasingly generalized locations (e.g. the zip centroid), until finally, if it can’t find any other location, it places the point on Null Island.
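To illustrate why this matters for review, here is a rough sketch of that kind of fallback cascade in Python. It is not HIFLD's actual pipeline, which isn't documented in this thread; the stage names, lookup tables, addresses, and coordinates are all hypothetical. The point is that if the geocoder records which stage produced each coordinate, anything coarser than a street-level match can be flagged instead of being treated as a building location.

```python
from typing import Callable, List, Optional, Tuple

Coord = Tuple[float, float]  # (lat, lon)
Lookup = Callable[[dict], Optional[Coord]]

NULL_ISLAND: Coord = (0.0, 0.0)


def geocode_with_fallback(record: dict, stages: List[Tuple[str, Lookup]]) -> Tuple[Coord, str]:
    """Try each stage from most to least precise; give up at Null Island."""
    for precision, lookup in stages:
        coord = lookup(record)
        if coord is not None:
            return coord, precision
    return NULL_ISLAND, "null_island"


def needs_review(precision: str) -> bool:
    """Anything that is not a street-level match is too coarse to map from."""
    return precision != "street"


# Hypothetical lookup tables standing in for a real geocoder.
street_index = {}                         # pretend no street-level match exists
zip_centroids = {"12345": (10.0, 20.0)}   # made-up zip code and centroid

stages = [
    ("street", lambda r: street_index.get(r.get("address"))),
    ("zip_centroid", lambda r: zip_centroids.get(r.get("zip"))),
]

hit = geocode_with_fallback({"address": "1 Example St", "zip": "12345"}, stages)
miss = geocode_with_fallback({"address": "2 Example St", "zip": "99999"}, stages)
print(hit)   # ((10.0, 20.0), 'zip_centroid')
print(miss)  # ((0.0, 0.0), 'null_island')
```

A published dataset that drops the precision label leaves reviewers unable to tell a zip centroid from a real building match, which is consistent with the "in between 6 buildings" points mentioned earlier.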

Apart from the automated geocoding, the data is quite old by now.

Right, I assume BankFind did the same. I would’ve thought both federal agencies would’ve used the TIGER geocoder, but maybe there was some bug or poor assumption in HIFLD’s geocoding process causing seemingly normal addresses to fail. I’ve seen some government sites incorrectly assume that postal cities are equivalent to incorporated areas. For example, Nominatim also fails to find “1996 Segnette Blvd, Westwego, LA 70094”, because it’s almost a mile outside the city limits in unincorporated Jefferson Parish, yet it carries a Westwego address.