How to mass update/upload postcodes?

Greetings,
I would like to update many OSM objects (by “osm_id”) by adding or correcting Italian postal codes, but I can’t manually update 2 million ways, nodes and relations…

Is there a faster, batched way to do it?
I can test it locally on my own server by running mass UPDATE queries directly on the database, but is there a way to do this for the official online database?

Thx in advance.

Hello emandt and welcome to the forum!

The way you phrase this it sounds like an import, so the Import Guidelines would apply:
https://wiki.openstreetmap.org/wiki/Import/Guidelines

And the Automated Edits code of conduct:
https://wiki.openstreetmap.org/wiki/Automated_Edits_code_of_conduct

That number of changes cannot be put into one changeset. They would need to be split up into smaller, geographically limited changesets.
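As a rough illustration of that splitting step, here is a minimal sketch that groups edits into grid cells and then cuts each cell into capped batches. The dict keys ("id", "lat", "lon"), the 0.5 degree cell size and the function name are made up for the example; the cap of 10,000 elements per changeset is, to my knowledge, the API 0.6 default, but check the current capabilities before relying on it.

```python
from collections import defaultdict

# Assumption: default API 0.6 cap on elements per changeset.
MAX_PER_CHANGESET = 10_000

def split_into_changesets(edits, cell_deg=0.5, max_size=MAX_PER_CHANGESET):
    """Group edits by grid cell, then cut each cell into capped batches.

    `edits` is a list of dicts with hypothetical keys "id", "lat", "lon".
    """
    cells = defaultdict(list)
    for e in edits:
        # Floor division maps each coordinate to the index of its grid cell.
        key = (int(e["lat"] // cell_deg), int(e["lon"] // cell_deg))
        cells[key].append(e)
    batches = []
    for key in sorted(cells):
        group = cells[key]
        for i in range(0, len(group), max_size):
            batches.append(group[i:i + max_size])
    return batches
```

Each returned batch would then become one changeset covering a small area. In practice a much smaller cap than the API maximum is probably kinder to reviewers.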

One way to do this is write a script that directly calls the API, see specification:
https://wiki.openstreetmap.org/wiki/API_v0.6
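To give an idea of what such a script would send, here is a sketch that builds an osmChange document for modified nodes. It only builds the XML; opening the changeset, authenticating and POSTing the document to the changeset's upload endpoint are left out. The node dict layout and the generator name are assumptions for the example.

```python
import xml.etree.ElementTree as ET

def build_osmchange(nodes, changeset_id):
    """Return an osmChange XML string that modifies the given nodes.

    Each node dict carries hypothetical keys: id, version, lat, lon, tags.
    The complete tag set is written, because a modify replaces the element.
    """
    root = ET.Element("osmChange", version="0.6", generator="postcode-sketch")
    modify = ET.SubElement(root, "modify")
    for n in nodes:
        el = ET.SubElement(modify, "node", {
            "id": str(n["id"]),
            "version": str(n["version"]),  # must match the current version
            "changeset": str(changeset_id),
            "lat": str(n["lat"]),
            "lon": str(n["lon"]),
        })
        for k, v in sorted(n["tags"].items()):
            ET.SubElement(el, "tag", k=k, v=v)
    return ET.tostring(root, encoding="unicode")
```

Note that the existing tags have to be fetched first and sent back together with the new addr:postcode, otherwise they would be lost.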

Or you could split the changes locally and upload them with JOSM.

Very useful, thank you.

I didn’t know about the API or JOSM, so I’ll read up on them.

I much appreciate your help.

What kind of objects are you planning to update?

Typically there are 3 kinds of objects with addr:* tags:

  • ways/relations with building tag
  • addr nodes
  • POIs (as nodes)

How are you joining external data with OSM?

What about other addr:* tags? Does the dataset have any other useful fields?

I recently started running a monthly script that updates addr tags for Estonian buildings. During this period I’ve imported almost 100 000 postcodes to OSM.

Also please verify if your data source is compatible with OSM’s licence: Copyright - OpenStreetMap Wiki

Of course.
This is public Open Data without any restrictions or copyright :wink:

Using Nominatim I encountered something that looked like an import bug: many ways, nodes and relations were associated with a wrong postcode, so I started to investigate.
In the OSM database there was no postcode at all for those entities, so I contacted the Nominatim team, who explained that postcodes missing from the OSM data at import time are “guessed” from nearby entities (which may themselves lack a postcode). This “guessing procedure” assigns wrong postcodes to entities, so many searches return wrong results.

So I checked the OSM database and discovered a great many entities without ANY postcode, or (worse) with a wrong postcode on the way record itself but the correct one on its parent (its hamlet or city/village, for example).
This means a lot of data is wrong if we need any kind of postcode filter.

I was looking to extract a “Streets - Hamlet - Postal codes” association (~740k streets in total across our 8000 hamlets here in Italy) from the OSM or Nominatim DBs, but all these gaps discouraged me, at least until the data is fixed.

So I decided to find a way to associate Open Data postal codes with OSM entities. Certainly ways, but for other types I’ll need to investigate more.
Maybe I’ll write something in Python, PHP or Java (I’m a developer) to automate this association in my local DB while testing.

The Open Data dataset I have contains some government data that could be useful for everyone (cadastral codes for each hamlet, like Wikidata already has).

Even uploading 10 records per minute, that would be about 14k records per day. Not very fast, but not very slow either.
I’m at the very beginning now, so I’ll investigate and try in the future.
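For scale, the arithmetic behind that estimate can be worked through as below. The 2 million figure is the one from the first post and the rate is the 10 records per minute assumed above; both are rough.

```python
RECORDS = 2_000_000   # total objects mentioned in the first post
PER_MINUTE = 10       # assumed upload rate

per_day = PER_MINUTE * 60 * 24   # 14,400 records per day
days = RECORDS / per_day         # roughly 139 days of continuous uploading
print(per_day, round(days))
```

So at that rate the full set would take about four and a half months of non-stop uploading.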

I did a quick check in ohsome and found that over 90% of postcodes in Italy are on addr nodes. Make sure you follow the existing local convention for addr:* tags. And for better context: there are over 14 million buildings and only ~2.2 million address nodes with postcodes.

Exactly, addresses in Italy do not identify a building, but an entrance. So we don’t add addresses to building outlines.

See: https://wiki.openstreetmap.org/wiki/IT:Addresses#Regole_specifiche_per_l’Italia

especially:

Since the house number identifies an external access point leading to the property units, address information may be added only as a simple node at the external access point or, alternatively, on an entrance node of a building or site. Address information must NOT be added to buildings, sites or other kinds of areas.

I’m not interested in adding house numbers, only postal codes, so how does this information help me? :sweat_smile:

I’m interested just in this:

The tags addr:postcode=* and addr:city=* can be added to an address to specify, respectively, the postal code (CAP) and the municipality it is in.

Usually postcodes go together with other addr tags. You will need to discuss with the Italian OSM community whether or how you could import postcodes without housenumbers.

But what you can do is download a pbf extract and use osm2pgsql to import existing nodes with addr:postcode into a db. If you are able to compare this with local government data and verify what % of postcodes in OSM are incorrect, then you will have a better understanding of the problem.
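The comparison itself could look something like this once the rows are out of the database. It is only a sketch: how OSM nodes are joined to a place in the government data (the "place_key" here) is exactly the hard part and is assumed to be solved elsewhere, and all names are hypothetical.

```python
def postcode_error_rate(osm_rows, reference):
    """Return (checked, wrong, percent_wrong).

    osm_rows:  iterable of (place_key, postcode) taken from OSM addr nodes.
    reference: dict place_key -> set of valid postcodes from the open dataset.
    Rows whose place_key is missing from the reference are skipped.
    """
    checked = wrong = 0
    for place, postcode in osm_rows:
        valid = reference.get(place)
        if valid is None:
            continue  # cannot judge this row without reference data
        checked += 1
        if postcode.strip() not in valid:
            wrong += 1
    percent = 100.0 * wrong / checked if checked else 0.0
    return checked, wrong, percent
```

Using a set of valid postcodes per place keeps places with several postcodes from being counted as errors.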

An address without a house number is not an address. There are rare cases of businesses that have no house number (nohousenumber=yes) or that use a kilometre marker (addr:milestone=*, mainly petrol stations). If you want to map the area of a postal code (CAP), see also: Tag:boundary=postal_code - OpenStreetMap Wiki

In any case, before proceeding with an import, notify the Italian community: Italia (Italy) - OpenStreetMap Community Forum

(original Nominatim issues and explanation are here: Possibile wrong initial import for postcodes · Issue #3180 · osm-search/Nominatim · GitHub)

The “problem” is that tons of ways, relations and nodes don’t have a postcode associated with them.
For just one region here in Italy (to speed up checking and queries) I searched for records that have “addr:postcode” or “postal_code” in their tags and found:

  • planet_osm_ways: 1% have them
  • planet_osm_roads: less than 0.3% have them
  • planet_osm_point: 4% have them
  • planet_osm_polygon: 1.2% have them
  • planet_osm_rels: 1.3% have them
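For anyone wanting to reproduce this kind of count outside the database, a minimal sketch over tag dictionaries follows. How the tags get out of each osm2pgsql table depends on the import options, so that part is left out; the `keep` filter is an addition of mine to show how unrelated objects could be excluded.

```python
POSTCODE_KEYS = ("addr:postcode", "postal_code")

def postcode_share(tag_dicts, keep=lambda tags: True):
    """Percentage of kept records whose tags contain any postcode key.

    tag_dicts: iterable of dicts mapping tag key -> value.
    keep:      optional predicate to restrict which records are counted.
    """
    total = tagged = 0
    for tags in tag_dicts:
        if not keep(tags):
            continue
        total += 1
        if any(k in tags for k in POSTCODE_KEYS):
            tagged += 1
    return 100.0 * tagged / total if total else 0.0
```

For example, `postcode_share(rows, keep=lambda t: "highway" in t)` would restrict the count to highways only.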

At THIS specific moment I’m not sure whether all tables should contain “addr:postcode” or “postal_code” in their tags, but it could be…
If the OSM database expects a postcode to be assigned to buildings rather than only to streets/hamlets, then it is possible that millions and millions of records lack this data, which Nominatim relies on to avoid guessing the postcode from a nearby one.

For the avoidance of doubt - most things in OSM do not and should not have postcodes. For example, trees tend not to receive mail and tend not to have them.

Lately I noticed that some newer mappers only add addr:housenumber and addr:street, with no CAP, no city and no place, and wondered where this was coming from (I think Osmose does not report this as a problem).

Interestingly, the Germans map their Postleitzahl areas, and I found a site that lists all streets by name that fall under the same CAP, so I wondered if something similar would work for Italy. At least, I have not seen anything like it here on OSM. When searching for CAP 66100, big G actually shows the area outline. /OT

Of course, yes.
My previous percentages were calculated using a simple query without filtering, but the “roads” table could be the most significant in that result, because I suppose its records tend to have/need postcodes more than those in other tables. Or not?

Unfortunately I don’t think this kind of data exists for Italy, not even for a fee.
I found an Open Data set of all Italian hamlets and their postal codes, and here in Italy postal codes are assigned to hamlets, not to single buildings/houses or streets as in most of the world.

So all buildings and streets that are part of a specific hamlet (as an area/polygon) will inherit that postal code.
Unfortunately a few hamlets (very big ones) use multiple postal codes even though the hamlet/city remains the same.
For example, Rome has 73 different postal codes as a hamlet (not as a province!) and all its streets share 90% of their parent information (hamlet, town/city, region, state, etc…).
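The inheritance rule described here can be sketched roughly as follows: find the hamlet polygon containing a point and take its postcode only when the hamlet has exactly one. The data layout ("ring", "postcodes") is hypothetical, and the plain ray-casting test ignores holes and multipolygons; in a real database a spatial function such as PostGIS ST_Contains would be the more robust choice.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test; ring is a list of (lon, lat) vertices."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Does the edge (j -> i) straddle the horizontal line through lat?
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside

def inherit_postcode(lon, lat, hamlets):
    """Return the enclosing hamlet's postcode, or None if unknown/ambiguous.

    hamlets: list of dicts with hypothetical keys "ring" and "postcodes".
    """
    for h in hamlets:
        if point_in_polygon(lon, lat, h["ring"]):
            codes = h["postcodes"]
            # Multi-postcode hamlets (e.g. big cities) need manual handling.
            return codes[0] if len(codes) == 1 else None
    return None
```

Returning None for hamlets with several postcodes keeps the big-city cases out of any automatic assignment.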

The response you got on the GitHub issue explains quite nicely how Nominatim handles postcodes.

Setting up boundary=postal_code relations to define areas for each postal code is a complex task and will most probably require a lot of manual effort.
There are some existing examples in Italy:
https://overpass-turbo.eu/s/1zAf

In case an existing admin boundary aligns perfectly with the postal code boundary, you can also use the postal_code tag directly:
https://overpass-turbo.eu/s/1zAg
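For readers who cannot open the short links, a query of this general kind can be put together as below. This is my own guess at a search for boundary=postal_code relations in Italy, not a copy of the linked queries. It only builds the request URL for the public Overpass endpoint and does not send anything.

```python
from urllib.parse import urlencode

# Assumption: the public Overpass API endpoint.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"

def postal_boundary_query(country_iso="IT"):
    """Overpass QL for boundary=postal_code relations inside a country."""
    return (
        "[out:json][timeout:120];"
        f'area["ISO3166-1"="{country_iso}"][admin_level=2]->.c;'
        'relation["boundary"="postal_code"](area.c);'
        "out tags;"
    )

def request_url(query):
    """Build a GET URL carrying the query in the `data` parameter."""
    return OVERPASS_URL + "?" + urlencode({"data": query})
```

The same query text can also be pasted into overpass-turbo after swapping `out tags;` for `out geom;` to see the outlines on the map.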

Both of these approaches are the opposite of the easy and fast imports that you mentioned in your first post.

The only easy solution would be to verify whether the existing 2.2 million addr nodes have the correct postal code. If not, updating the addr:postcode tag is a fairly straightforward task.

I added a pair of “postal_code” tags to the OSM DB and it seems your Overpass query now detects them well.
I need to check and verify whether Nominatim will use this new tag or rely only on the “addr:postcode” of single OSM entities (nodes, ways, etc…)

Actually a bit of a pleasant surprise just across the regional boundary in Marche

And the whole of Sardinia to boot… someone spent serious time putting those together.
