From where does OSM get street address data?

I’m building a small calendar application that needs a geocoding service. My first thought was OSM because I’ve been trying to de-google my life. Unfortunately, the first few street addresses I looked for cannot be found in Nominatim. What can be done about this? From where does OSM get its street address data? Is this data available somehow from government public records? Or does each address have to be added individually by volunteers?

Example query that’s not found: “66031 Twentynine Palms Hwy, Joshua Tree, CA 92252”
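For anyone who wants to reproduce the lookup, here is a minimal sketch of how such a query could be sent to the public Nominatim search endpoint. The function names `build_search_url` and `geocode` are made up for illustration, and the `User-Agent` string is a placeholder you would replace with something that identifies your own application.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Public Nominatim search endpoint.
NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(address, limit=1):
    """Build a free-form search URL for a single address string."""
    params = {"q": address, "format": "jsonv2", "limit": limit}
    return NOMINATIM_SEARCH + "?" + urlencode(params)


def geocode(address, user_agent):
    """Return (lat, lon) for the best match, or None if nothing was found.

    user_agent is a placeholder: pass a string identifying your application.
    """
    request = urllib.request.Request(
        build_search_url(address), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        results = json.load(response)
    if not results:
        # An empty list is what an unmapped address looks like.
        return None
    return float(results[0]["lat"]), float(results[0]["lon"])
```

Calling `geocode("66031 Twentynine Palms Hwy, Joshua Tree, CA 92252", "my-calendar-app")` would be expected to return `None` for as long as the address is not in OSM.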

If there is a source under an appropriate license then it likely will have already been imported (unless recently released). If that’s not the case, then it comes down to mappers adding it in.

This specific address hasn’t been mapped, but the building has been.

From what I’ve seen, address coverage in the USA is… patchy at best

A bunch of addresses were added by volunteers. I did it for the town I used to live in. And I did it for the town I live in now. Lots of walking over quite a few weeks.

But I think nowadays addresses are more likely to be imported from appropriately licensed data sets like the National Address Database. Unfortunately, California has not gotten its act together in pulling addresses from the towns, cities and counties that actually assign addresses and then uploading them to the national database.

Even if the addresses are available with appropriate licensing, someone still needs to do an import, which takes effort to do properly (conflation with existing OSM data, QA checks, etc.).

It depends on the region.

In some, addresses have already been imported.

In some, they are waiting for an import; in others, official data does not exist or its license is unsuitable.

In some, it relies entirely on mappers adding them. If you can add some without copying from problematic sources (say Google Maps, or official data under a bad or unknown license), that is highly welcome! I would encourage you to add at least a few of them.

If there’s some way to import the data automatically I’d be willing to help with that. I found a lot of good data at openaddresses.io, including this location. This dataset comes from San Bernardino County, and while I can’t find a license per se on their website, I’d assume it’s freely available to all since it belongs to the government. SBC Address Points

Can this be set up as a pipeline to import automatically whenever the county updates it?

Freely available to all doesn’t mean it’s under an appropriate license, and you need to confirm that license as part of the import guidelines.

@fortera_au The California Supreme Court has ruled that datasets created by governmental entities are public domain. So unless there was some weird business deal where San Bernardino County bought the addresses from a private company, they should be good to import into OSM. Since the counties are responsible for assigning addresses in unincorporated areas, it seems highly unlikely that a private company created San Bernardino’s address dataset.

All that said, I know that for Kern County and for Orange County there is some fine print on their web portals that indicates you can use their data, just don’t blame them for errors. I would be surprised if San Bernardino County’s GIS portal was much different.

To @llamafilm: once you verify the licensing, there is still a lot of work to do on an import. There are likely at least some manually created addresses in OSM for that area. You will need to come up with a workflow that, among other things, converts the county tagging to that used in OSM, identifies duplicates and conflates them, regardless of whether the address data currently in OSM is on points or polygons. There have been other address imports in the US, and each import was supposed to create a wiki page describing everything it proposed to do: where the dataset comes from, what its licensing is, the tool chain being used, and the QA steps. You may want to look at some of those older imports to get ideas on how to approach this.
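To give a feel for the tag conversion and duplicate detection steps, here is a rough sketch. The county field names (`number`, `street`, `city`, `zip`) are assumptions, not the actual San Bernardino County schema, and matching on a normalized key is only one simple way to flag likely duplicates. A real import would also need spatial matching, street name expansion and manual review.

```python
def to_osm_tags(record):
    """Map one county record (hypothetical field names) to OSM addr:* tags."""
    tags = {
        "addr:housenumber": record["number"].strip(),
        "addr:street": record["street"].strip(),
        "addr:city": record["city"].strip().title(),
        "addr:postcode": record["zip"].strip(),
    }
    # Drop empty values rather than uploading blank tags.
    return {key: value for key, value in tags.items() if value}


def address_key(tags):
    """Normalized key used to compare an incoming address with existing ones."""
    return (
        tags.get("addr:housenumber", "").lower(),
        tags.get("addr:street", "").lower(),
        tags.get("addr:postcode", ""),
    )


def split_new_and_duplicates(county_records, existing_osm_tags):
    """Separate records that look new from those already present in OSM."""
    existing = {address_key(tags) for tags in existing_osm_tags}
    new, duplicates = [], []
    for record in county_records:
        tags = to_osm_tags(record)
        if address_key(tags) in existing:
            duplicates.append(tags)
        else:
            new.append(tags)
    return new, duplicates
```

The duplicates list is where the conflation work happens: those are the addresses to compare by hand or with an editor tool against what mappers have already added, whether that existing data sits on a point or on a building polygon.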

Sounds like it’s perfectly fine to use then. My point was more that if you’re doing an import, you need to be able to confirm the license first.
