Toward a national system for functionally classifying populated places


My primary source for this exploration was the Census Bureau. To facilitate future research, I imported the entire set of 2020 urban areas into Wikidata. This required getting a new property approved for urban area census codes.

As far as I know, Wikidata is the only place online where you can obtain the principal and secondary places that appear in the title of each urban area. Unless you game out the complex rules for determining an urban area’s title, it isn’t necessarily obvious that, for example, the bi-state Kansas City urban area is named after the city in Missouri but not the one in Kansas, whereas the Texarkana urban area is named after both Texarkanas. Matching each title to its places required many hours of manual conflation in OpenRefine.

To complete the etymological data for the whole country, I had to lavish special attention on Puerto Rico. Puerto Rico has no incorporated places, only municipios (analogous to New England towns), which are partitioned into barrios, one of which is typically the barrio-pueblo, the seat of government. Since none of these structures corresponds to where people live, the Census Bureau has defined two different kinds of CDPs: a zona urbana includes the barrio-pueblo and seat of government, while a comunidad does not.

I had to import all the zonas urbanas into Wikidata and rework many comunidades that had been poorly represented by the Cebuano Wikipedia. Meanwhile, OSM had incorrectly conflated each populated place with its surrounding municipio. As in New England, this resulted in inflated population=* tags, sometimes by orders of magnitude. I retagged each zona urbana and any comunidad that appeared in the title of an urban area, but I haven’t gone back and added population figures to the remaining comunidades.

I used QGIS to produce the custom maps in this post. They’re just a quick demonstration, nothing particularly rigorous from a cartographic perspective, but if you want to reproduce them, I threw the project up on GitHub along with notes on building the various layers. The repository contains a GeoJSON file of all the OSM place points in the United States, annotated with the major/minor/rural environment and any relationship to an urban area title. You can use this file to create a mashup that overlays the proposed places on other OSM data. Another GeoJSON file contains all the current place=town nodes that lie in rural areas.
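
If you want to pull a subset out of the place points file before building a mashup, something like the following Python sketch might be a starting point. It only uses the standard library. The property key `environment`, its values, and the file name in the comments are my placeholders rather than the names actually used in the repository, so check the notes there and adjust.

```python
import json

# Assumption: the environment annotation is stored under this property key.
# The real key in the repository's GeoJSON may be named differently.
ENVIRONMENT_KEY = "environment"


def load_features(path):
    """Read a GeoJSON FeatureCollection from disk and return its features."""
    with open(path, encoding="utf-8") as f:
        collection = json.load(f)
    return collection.get("features", [])


def filter_by_environment(features, environment):
    """Keep only point features whose environment property matches."""
    matches = []
    for feature in features:
        # GeoJSON allows null geometry and null properties, so guard both.
        geometry = feature.get("geometry") or {}
        properties = feature.get("properties") or {}
        if (geometry.get("type") == "Point"
                and properties.get(ENVIRONMENT_KEY) == environment):
            matches.append(feature)
    return matches


def to_feature_collection(features):
    """Wrap features in a FeatureCollection that can be written back out."""
    return {"type": "FeatureCollection", "features": list(features)}


# Example usage (hypothetical file name and property value):
# rural = filter_by_environment(load_features("places.geojson"), "rural")
# with open("rural_places.geojson", "w", encoding="utf-8") as out:
#     json.dump(to_feature_collection(rural), out)
```

The resulting file should load as a layer in QGIS or any web map library that reads GeoJSON, which makes it easy to overlay just the rural places, for example, on other OSM data.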