Please help - export from osm to country->adm unit->city db structure

Hello everyone! This is my first post on this forum.
I know that those kind of posts are irritating, but before you judge me please understand, that I’m a man with a knife on my throat.

I’m stuck on my task and I don’t know what to do.

The job is:

  • extract data from Open Street Map
  • create database model containing following schema and data

Schema looks like this:

+---------+     +------------+      +----------+
| country |     | adm_unit   |      | city     |
+---------+     +------------+      +----------+
| id      |--\  | id         |--\   | id       |
| name    |   \-| country_id |   \--| adm_id   |
+---------+     | name       |      | name     |
                +------------+      | coord_x  |  //coordinates - longitude and latitude
                                    | coord_y  |
                                    +----------+

by “administrative unit” I understand top-level administrative unit for each country (eg. state in US, vovoidenship in Poland).

So far I managed to:

  • download map itself (only for Poland to test it)
  • import it to database using osm2pgsql tool * perform some selects and realize, I don’t know how to do it.

The main problem is: I can’t see any relations between different items of map. Taking Poland as example: I can find Warsaw (capitol city), as well as “masovian vovoidenship” performing few selects, but there is no relation between those two. There are lot of empty columns in my db instead. Also none of column names suggests that it should hold any relation to “parent”.

I don’t know how to find those relations. Is it something with imports that i’m missing? Should I change anything in my default.style (I haven’t changed anything inside it)? Or maybe there’s some kind of other approach to solve my problem? Maybe there are free databases that contain data I need and are less painful to work with? Geonames is no-go, their database is full of errors.

I’m really green when it comes to gis, I’m PHP programmer, that’s completely different world for me. Unfortunately, even if I want to explore it, my deadlines won’t allow that.

I will appreciate ANY help with my issue, any direction, any solution! Big big big thanks in advance!

You’re apparently looking for how to do spatial querys on OSM-Data.

Yopu already had an look at http://wiki.openstreetmap.org/wiki/Relation:boundary and its following pages?

Is your aim similar to http://osm.wno-edv-service.de:8080/boundaries/ ?

Isn’t there any other approach? Alrogythm I imagined was:


1. Find all cities
2. For each city:
    * Save city and coordinates
    * Find parents until parent is top level administrative unit
    * save parent (if not already exists)
    * save relation from adm. unit to city
    * find country for adm. unit
    * if country doesn't exist, add it
    * save relation from country to adm. unit

Maybe I should do it reverse way?

I surfed through the osm main site noticing that there are parent-child relationships, when, for example, I will find a city. I thought that this is a simple relation, not done by checking if one area contains another… how this could be efficient? (In terms of programming).

Anyway I understand that if I’m wrong, I won’t change the world to fit my needs. I will have to adapt.
I noticed that my pg database has plenty of functions installed. I’m pretty sure I will have to use them, but I don’t know how to achieve my goal, even if I have documentation.

UPDATE
stephan75 - thanks for your answer. I haven’t noticed it at first. Gonna check it out.

Small steps… I started working with my data. I think it might be corrupted, or I’m doing something wrong.
I wanted to list all countries from data I imported from http://download.geofabrik.de/europe/poland.html.
I thought I will get only one record - Poland, or maybe few records - Poland + countries around.
Please take a look at this screenshot:

First, you have to understand that osm2pqsql was mainly designed for rendering (drawing the map like mapnik app)

Second, for you needs and if you continue with the default osm2psql import style, you will create a huge amount of data into a huge database, especially if your plan is to make it worldwide. You have alternative solutions to only import what is required.

Third, you should understand how boundaries and places are modelized into OSM before you write any query. Basically, country boundaries are modelized into a “relation” of “type=administrative” or “type=multipolygon” + “boundary=administrative” + “admin_level=2” (2 for country level). For Poland, it is the relation id:49715 visible here http://www.openstreetmap.org/relation/49715#map=6/52.127/19.138 (wait few seconds to see it drawn on the map). The relation is a huge polygon (closed way) composed of many smaller OSM “ways” which are themselves tagged with “administrative=boundary” (e.g. http://www.openstreetmap.org/way/100180848))

Next, you have to understand how osm2pqsql is converting this data into the postgis database. You should really read these wiki pages:
http://wiki.openstreetmap.org/wiki/Osm2pgsql
http://wiki.openstreetmap.org/wiki/Osm2pgsql/schema

Then you will understand that the small boundary segments are imported into the planet_osm_line but the country polygon itself (the relation) is normally imported into planet_osm_polygon.

Finally, you will have to query spatiallly the db to find all “place” entities within this polygon. Check some postGIS tutorials for that. Also don’t forget that “place” entities can be modelized in OSM by simple nodes (going to the table planet_osm_node) usually in the city/town/village centre or next to townhall or closed polygons surrounding the urbanized area (then in planet_osm_polygon) (in which case you could calculate and use the centroid coordinates of the polygon - again something easy to do with postGIS). Important here is the selecting tag “place” with “town/city/village/hamlet” values

Hope you got some help here.

Pieren - thanks!

I managed to install newest version on postgis and osm2pgsql, and then to reimport data.

I read some of postgis documentation and… first success! I am able to select country and states without problem:


select
	osm_id, name, ST_AsText(way), *
from
	planet_osm_point
where
	place = 'state'

resultset:


505016447;"województwo lubuskie";"POINT(1708031.76 6817479.66)"
475386053;"województwo zachodniopomorskie";"POINT(1732816.85 7084466.23)"
505025705;"województwo dolnośląskie";"POINT(1826035.02 6689369.53)"
504981968;"województwo wielkopolskie";"POINT(1931039.57 6827112.6)"
505029615;"województwo opolskie";"POINT(1992932.91 6602233.4)"
505013148;"województwo pomorskie";"POINT(1993257.17 7197313.56)"
505045821;"województwo kujawsko-pomorskie";"POINT(2053438.48 6999958.18)"
505034126;"województwo śląskie";"POINT(2137745.67 6541639.4)"
505043292;"województwo łódzkie";"POINT(2152325.05 6705513.44)"
504966405;"województwo małopolskie";"POINT(2266118.48 6411970.93)"
1172157011;"województwo świętokrzyskie";"POINT(2305504.51 6576439.87)"
505008682;"województwo warmińsko-mazurskie";"POINT(2351744.56 7157135.11)"
504977488;"województwo mazowieckie";"POINT(2360790.33 6899493.19)"
504970284;"województwo podkarpackie";"POINT(2470364.12 6442701.52)"
505006035;"województwo lubelskie";"POINT(2538416.04 6616351.32)"
505006928;"województwo podlaskie";"POINT(2543937.42 7032510.46)"

Looks great at first, but as you can see, my way is only point, while what I’m looking for is area/polygon/shape. (I hope that this is a good approach).

Let’s take województwo mazowieckie for my example.

It has id: 504997488. I tried to find it in planet_osm_polygon and planet_osm_roads like this:


select 
    *
from 
    planet_osm_roads
where
    osm_id = 504997488
    or ref = '504997488'

with no luck.

Only way to find it is to perform below query:


select 
    *
from 
    planet_osm_roads
where
    name ilike 'województwo mazowieckie'

this gives me 27 rows of results, each one with way as geometry(lineString, 900913).
I guess my next step would be combine them into one area and find each city it contains using appropriate query. Unfortunately I can’t figure how to do that (even after reading docs).

I tried this:

select st_bdpolyfromtext(
	(select 
		array_to_string(array(select ST_AsText(way) from planet_osm_roads where name ilike 'województwo mazowieckie'), ',
'))
	, 900913)

but it throws an error:


ERROR:  parse error - invalid geometry
HINT:  "...6 6850868.61,2204583.54 6850805.81),
" <-- parse error at position 4145 within geometry
CONTEXT:  SQL function "st_mlinefromtext" statement 1
PL/pgSQL function "st_bdmpolyfromtext" line 8 at assignment

Don’t use “place=state”. You don’t need it if you want all cities in a country. You should contact the OSM community in Poland to understand how the places tags are used in which cases. But the place=state in this country is not used for a city neither for the country.

Wrong number : it is 504977488

It is used for vovoidenship (in Polish: “województwo”) and it is top-level administrative unit in Poland, like state in United States. I get your point but I want to select all cities from each vovoidenship separately, so I can save a relation vovoidenship->cities.

Thanks :slight_smile:

But then you should not search nodes with “place=state” but rather the polygon defining the boundary:

http://www.openstreetmap.org/relation/130935

Again it is represented by a relation (thus planet_osm_polygon) which can be selected by “type=boundary” or “type=relation” + “boundary=administrative” + “admin_level=4” (and perhaps “is_in:country=Poland” if you care about unwanted adjacent countries)

Thanks Pieren, now I understand it completely.

Could you please provide me with working query example on how to get the cities inside, for example, mentioned relation 130935?

UPDATE

I managed to get cities from most of vovoidenships (14 of 16) using ST_Contains and data Pieren suggested. However I still have problem with remaining two. They are described in different way - not as single polygon, but as set of lines (for example relation 130935) in corresponding table. I’ve tried to convert it to polygon with no luck - 130935 line is unclosed and this is how I try to close and convert it:

SELECT St_GeomFromText(St_MakePolygon(St_AddPoint(

(
	SELECT 
		string_agg(ST_AsText(way), '\n') 
	FROM
		planet_osm_line
	WHERE
		osm_id = -130935
	), ST_StartPoint(
		(
			SELECT 
			way
		FROM
			planet_osm_line
		WHERE
			osm_id = -130935
		LIMIT 1
		)
	)
)))

Unfortunately this results in error:


ERROR: parse error - invalid geometry
State SQL: XX000
Tip: "010300000001000000B" <-- parse error at position 19 within geometry

Found the solution.

Some of vovoidenships were saved as closed polygons in planet_osm_polygon, some as set of records (lines) in planet_osm_line.

In case of polygon, thing was pretty simple, it was all about proper query:


    INSERT INTO 
    	out.city 
    		(region_id, city_name, coordinates, place_type)
    	(
    		select 
    			15,
    			name, 
    			ST_AsLatLonText(ST_Transform(way, 4326)), 
    			place 
    		FROM 
    			planet_osm_point 
    		WHERE 
    			place IN ('city', 'town', 'village', 'hamlet')
    			AND ST_Contains((SELECT way FROM planet_osm_polygon WHERE osm_id = -130919), way)
    			AND name != ''
    			AND name IS NOT NULL
    
    	)


Things got worse with second case (set of linestring’s). Most of them were unclosed. I struggled for hours trying to find working solution to join the lines, transform to shapes and search inside them.
I gave up and got this brilliant idea - what about repairing this myself using some sort of GUI? Things ended up in

  • installing quantum gis,
  • drawing missing lines,
  • merging them together,
  • saving back to database (which qgis handles very well).

After that All I had was to perform something like this:


    INSERT INTO 
    	out.city 
    		(region_id, city_name, coordinates, place_type)
    	(
    		select 
    			17,
    			name, 
    			ST_AsLatLonText(ST_Transform(way, 4326)), 
    			place 
    		FROM 
    			planet_osm_point 
    		WHERE 
    			place IN ('city', 'town', 'village', 'hamlet')
    			AND ST_Contains((SELECT St_GeomFromText(St_AsText(St_MakePolygon(St_AsText(
    			(
    				SELECT 
    					string_agg(ST_AsText(way), '\n') 
    				FROM
    					planet_osm_roads
    				WHERE
    					osm_id = -223408
    				)
    				
    				
    			))), 900913)), way)
    			AND name != ''
    			AND name IS NOT NULL
    			ORDER BY name
    
    	)

Worked like a charm! I hope this helps someone in the future. Thank you all for your help anyway :slight_smile: