gis database optimization

I’ve my own OSM database / tile server. When I checked the postgis log file it looks like there are few queries talking long time to execute. I’ve created an index " CREATE INDEX planet_osm_point_place_index ON planet_osm_point (place); " and it looks like query executes bit faster.

Would anyone advise how to optimize and tune the GIS database please?

Example:


  1. Index

gis=£ explain analyze select place from planet_osm_point where place in (‘country’,‘state’) or (place in (‘city’,‘metropolis’) and capital=‘yes’);
QUERY PLAN

Bitmap Heap Scan on planet_osm_point (cost=733.43…95183.51 rows=19107 width=8) (actual time=54.796…5863.578 rows=1407 loops=1)
Recheck Cond: ((place = ANY (‘{country,state}’::text[])) OR (place = ANY (‘{city,metropolis}’::text[])))
Filter: ((place = ANY (‘{country,state}’::text[])) OR ((place = ANY (‘{city,metropolis}’::text[])) AND (capital = ‘yes’::text)))
→ BitmapOr (cost=733.43…733.43 rows=38214 width=0) (actual time=46.118…46.118 rows=0 loops=1)
→ Bitmap Index Scan on planet_osm_point_place_index (cost=0.00…361.94 rows=19107 width=0) (actual time=42.572…42.572 rows=1149 loops=1)
Index Cond: (place = ANY (‘{country,state}’::text[]))
→ Bitmap Index Scan on planet_osm_point_place_index (cost=0.00…361.94 rows=19107 width=0) (actual time=3.544…3.544 rows=6064 loops=1)
Index Cond: (place = ANY (‘{city,metropolis}’::text[]))
Total runtime: 5864.319 ms
(9 rows)


  1. Without index

gis=£ explain analyze select way,highway,aeroway,name,ref,char_length(ref) as length, case when bridge in (‘yes’,‘true’,‘1’) then ‘yes’::text else bridge end as bridge from planet_osm_line where waterway IS NULL and leisure IS NULL and landuse IS NULL and (name is not null or ref is not null) ;
QUERY PLAN

Seq Scan on planet_osm_line (cost=0.00…2266230.42 rows=17817674 width=1148) (actual time=4.820…87766.793 rows=18556799 loops=1)
Filter: ((waterway IS NULL) AND (leisure IS NULL) AND (landuse IS NULL) AND ((name IS NOT NULL) OR (ref IS NOT NULL)))
Total runtime: 90062.921 ms
(3 rows)


Many thanks in advance.

Kind regards,
Shamim

The osm2pgsql tables are designed to be queried using bounding boxes, which is why they are indexed on the primary key and the geometry columns: I see no bounding boxes on your sample query plans.

In the absence of bounding boxes I would expect to see table scans. Adding indexes on the tag-based columns can certainly help performance, but penalises update times. In Postgres you can add indexes with conditions as well, which may well improve index selectivity for

Please provide more context: are these mapnik queries (they look like it to me), are these the complete queries, what data do you have loaded (country, rowcounts from planet_osm tables), what is the area that you are querying. Have you vacuumed all the tables? What platform & resources do you have available?

Thank you!!

Yes, I vacuumed all the tables. Please have look below the details.

==============================================
Platform & resources

64 bit Openvz ( Centos 5)
Linux 2.6.18-194.26.1.el5.028stab081.1 #1 SMP Thu Dec 23 20:17:23 MSK 2010 x86_64 x86_64 x86_64 GNU/Linux

kernel.shmmax = 2147483648

Disk space:

Size Used Avail
350G 225G 126G

==============================================
Memory: free -m

              total       used       free     shared    buffers     cached

Mem: 16011 890 15121 0 0 0
-/+ buffers/cache: 890 15121
Swap: 0 0 0

==============================================
postgresql.conf

shared_buffers = 788MB
tcp_keepalives_idle = 240
max_connections = 100
effective_cache_size = 128MB

==============================================
Running process on the server

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
postgres 16005 0.0 0.1 847576 19760 ? S Jun14 0:00 /usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data
postgres 16007 0.0 5.0 848588 821072 ? Ss Jun14 0:21 postgres: writer process
postgres 16008 0.0 0.0 847972 1072 ? Ss Jun14 0:06 postgres: wal writer process
postgres 16009 0.0 0.0 19116 1028 ? Ss Jun14 0:00 postgres: stats collector process
postgres 18308 0.0 0.0 871360 9840 ? Ss 09:50 0:00 postgres: postgres gis 10.0.10.81(33161) idle
postgres 18312 0.0 0.0 871360 9852 ? Ss 09:50 0:00 postgres: postgres gis 10.0.10.81(33163) idle
postgres 18313 0.0 0.0 871360 9848 ? Ss 09:50 0:00 postgres: postgres gis 10.0.10.81(33164) idle
postgres 18316 0.0 0.0 871360 9856 ? Ss 09:50 0:00 postgres: postgres gis 10.0.10.81(33167) idle
postgres 18373 0.0 0.0 871364 9360 ? Ss 09:53 0:00 postgres: postgres gis 10.0.10.81(40450) idle
postgres 18379 0.0 1.0 875820 174832 ? Ss 09:53 0:02 postgres: postgres gis 10.0.10.81(40454) idle
postgres 19643 0.0 0.0 871364 9720 ? Ss 10:30 0:00 postgres: postgres gis 10.0.10.81(56593) idle
postgres 19851 0.0 0.0 871364 10296 ? Ss 11:00 0:00 postgres: postgres gis 10.0.10.81(46062) idle
postgres 20032 0.0 0.0 871364 9888 ? Ss 11:20 0:00 postgres: postgres gis 10.0.10.81(32813) idle
postgres 20037 0.0 0.0 871360 9472 ? Ss 11:20 0:00 postgres: postgres gis 10.0.10.81(32814) idle
postgres 20038 0.0 0.0 871360 9472 ? Ss 11:20 0:00 postgres: postgres gis 10.0.10.81(32815) idle
postgres 20042 0.1 0.9 875796 158344 ? Ss 11:20 0:02 postgres: postgres gis 10.0.10.81(32818) idle
postgres 20111 0.0 0.0 871364 9192 ? Ss 11:30 0:00 postgres: postgres gis 10.0.10.81(32871) idle
postgres 20115 0.0 0.0 871364 9192 ? Ss 11:30 0:00 postgres: postgres gis 10.0.10.81(32872) idle
postgres 20116 0.0 0.0 871364 9192 ? Ss 11:30 0:00 postgres: postgres gis 10.0.10.81(32874) idle
postgres 20117 0.0 0.0 871364 9196 ? Ss 11:30 0:00 postgres: postgres gis 10.0.10.81(32873) idle

==============================================
Example mapnik queries from postgresql log file.

==============================================

[18379] [2011-06-16 10:27:09 BST] db=gis,user=postgres LOG: duration: 4797.601 ms execute : SELECT AsBinary(“way”) AS geom,“disused”,“name”,“waterway” from
(select way,waterway,disused,lock,name,
case when tunnel in (‘yes’,‘true’,‘1’) then ‘yes’::text else tunnel end as tunnel
from planet_osm_line
where waterway in (‘weir’,‘river’,‘canal’,‘derelict_canal’,‘stream’,‘drain’,‘ditch’)
and (bridge is null or bridge not in (‘yes’,‘true’,‘1’,‘aqueduct’))
order by z_order
) as water_lines WHERE “way” && SetSRID(‘BOX3D(-119853.2603511568 6689768.715518625,-75825.5320588965 6733796.443810884)’::box3d, 900913)
[18379] [2011-06-16 10:27:15 BST] db=gis,user=postgres LOG: duration: 1142.519 ms execute : SELECT AsBinary(“way”) AS geom,“railway” from
(select way,highway,railway,
case when tunnel in (‘yes’,‘true’,‘1’) then ‘yes’::text else tunnel end as tunnel
from planet_osm_roads
where highway is not null
or railway is not null
order by z_order
) as roads
WHERE “way” && SetSRID(‘BOX3D(-119853.2603511568 6689768.715518625,-75825.5320588965 6733796.443810884)’::box3d, 900913)
[19652] [2011-06-16 10:59:55 BST] db=gis,user=postgres LOG: duration: 163744.195 ms execute : SELECT AsBinary(“way”) AS geom,“landuse”,“military”,“natural” from
(select way,aeroway,amenity,landuse,leisure,man_made,military,“natural”,power,shop,tourism,name,religion
from planet_osm_polygon
where landuse is not null
or leisure is not null
or shop is not null
or aeroway in (‘apron’,‘aerodrome’)
or amenity in (‘parking’,‘university’,‘college’,‘school’,‘hospital’,‘kindergarten’,‘grave_yard’)
or military in (‘barracks’,‘danger_area’)
or “natural” in (‘field’,‘beach’,‘heath’,‘mud’,‘wood’)
or power in (‘station’,‘sub_station’)
or tourism in (‘attraction’,‘camp_site’,‘caravan_site’,‘picnic_site’,‘zoo’)
order by z_order,way_area desc
) as leisure
WHERE “way” && SetSRID(‘BOX3D(-39135.75848201033 6222585.598639626,665307.8941941757 6927029.251315814)’::box3d, 900913)
[19652] [2011-06-16 11:00:26 BST] db=gis,user=postgres LOG: could not send data to client: Broken pipe
[19652] [2011-06-16 11:00:26 BST] db=gis,user=postgres STATEMENT: SELECT AsBinary(“way”) AS geom,“landuse”,“natural”,“waterway” from
(select way,“natural”,waterway,landuse,name
from planet_osm_polygon
where waterway in (‘dock’,‘mill_pond’,‘riverbank’,‘canal’)
or landuse in (‘reservoir’,‘water’,‘basin’)
or “natural” in (‘lake’,‘water’,‘land’,‘marsh’,‘scrub’,‘wetland’,‘glacier’)
order by z_order,way_area desc
) as water_areas WHERE “way” && SetSRID(‘BOX3D(-39135.75848201033 6222585.598639626,665307.8941941757 6927029.251315814)’::box3d, 900913)
[19652] [2011-06-16 11:00:34 BST] db=gis,user=postgres LOG: duration: 27690.205 ms execute : SELECT AsBinary(“way”) AS geom,“landuse”,“natural”,“waterway” from
(select way,“natural”,waterway,landuse,name
from planet_osm_polygon
where waterway in (‘dock’,‘mill_pond’,‘riverbank’,‘canal’)
or landuse in (‘reservoir’,‘water’,‘basin’)
or “natural” in (‘lake’,‘water’,‘land’,‘marsh’,‘scrub’,‘wetland’,‘glacier’)
order by z_order,way_area desc
) as water_areas WHERE “way” && SetSRID(‘BOX3D(-39135.75848201033 6222585.598639626,665307.8941941757 6927029.251315814)’::box3d, 900913)
[19652] [2011-06-16 11:00:34 BST] db=gis,user=postgres LOG: unexpected EOF on client connection
[root@mapdata data]# tail -50 serverlog
[18379] [2011-06-16 10:27:09 BST] db=gis,user=postgres LOG: duration: 4797.601 ms execute : SELECT AsBinary(“way”) AS geom,“disused”,“name”,“waterway” from
(select way,waterway,disused,lock,name,
case when tunnel in (‘yes’,‘true’,‘1’) then ‘yes’::text else tunnel end as tunnel
from planet_osm_line
where waterway in (‘weir’,‘river’,‘canal’,‘derelict_canal’,‘stream’,‘drain’,‘ditch’)
and (bridge is null or bridge not in (‘yes’,‘true’,‘1’,‘aqueduct’))
order by z_order
) as water_lines WHERE “way” && SetSRID(‘BOX3D(-119853.2603511568 6689768.715518625,-75825.5320588965 6733796.443810884)’::box3d, 900913)
[18379] [2011-06-16 10:27:15 BST] db=gis,user=postgres LOG: duration: 1142.519 ms execute : SELECT AsBinary(“way”) AS geom,“railway” from
(select way,highway,railway,
case when tunnel in (‘yes’,‘true’,‘1’) then ‘yes’::text else tunnel end as tunnel
from planet_osm_roads
where highway is not null
or railway is not null
order by z_order
) as roads
WHERE “way” && SetSRID(‘BOX3D(-119853.2603511568 6689768.715518625,-75825.5320588965 6733796.443810884)’::box3d, 900913)
[19652] [2011-06-16 10:59:55 BST] db=gis,user=postgres LOG: duration: 163744.195 ms execute : SELECT AsBinary(“way”) AS geom,“landuse”,“military”,“natural” from
(select way,aeroway,amenity,landuse,leisure,man_made,military,“natural”,power,shop,tourism,name,religion
from planet_osm_polygon
where landuse is not null
or leisure is not null
or shop is not null
or aeroway in (‘apron’,‘aerodrome’)
or amenity in (‘parking’,‘university’,‘college’,‘school’,‘hospital’,‘kindergarten’,‘grave_yard’)
or military in (‘barracks’,‘danger_area’)
or “natural” in (‘field’,‘beach’,‘heath’,‘mud’,‘wood’)
or power in (‘station’,‘sub_station’)
or tourism in (‘attraction’,‘camp_site’,‘caravan_site’,‘picnic_site’,‘zoo’)
order by z_order,way_area desc
) as leisure
WHERE “way” && SetSRID(‘BOX3D(-39135.75848201033 6222585.598639626,665307.8941941757 6927029.251315814)’::box3d, 900913)
[19652] [2011-06-16 11:00:26 BST] db=gis,user=postgres LOG: could not send data to client: Broken pipe
[19652] [2011-06-16 11:00:26 BST] db=gis,user=postgres STATEMENT: SELECT AsBinary(“way”) AS geom,“landuse”,“natural”,“waterway” from
(select way,“natural”,waterway,landuse,name
from planet_osm_polygon
where waterway in (‘dock’,‘mill_pond’,‘riverbank’,‘canal’)
or landuse in (‘reservoir’,‘water’,‘basin’)
or “natural” in (‘lake’,‘water’,‘land’,‘marsh’,‘scrub’,‘wetland’,‘glacier’)
order by z_order,way_area desc
) as water_areas WHERE “way” && SetSRID(‘BOX3D(-39135.75848201033 6222585.598639626,665307.8941941757 6927029.251315814)’::box3d, 900913)
[19652] [2011-06-16 11:00:34 BST] db=gis,user=postgres LOG: duration: 27690.205 ms execute : SELECT AsBinary(“way”) AS geom,“landuse”,“natural”,“waterway” from
(select way,“natural”,waterway,landuse,name
from planet_osm_polygon
where waterway in (‘dock’,‘mill_pond’,‘riverbank’,‘canal’)
or landuse in (‘reservoir’,‘water’,‘basin’)
or “natural” in (‘lake’,‘water’,‘land’,‘marsh’,‘scrub’,‘wetland’,‘glacier’)
order by z_order,way_area desc
) as water_areas WHERE “way” && SetSRID(‘BOX3D(-39135.75848201033 6222585.598639626,665307.8941941757 6927029.251315814)’::box3d, 900913)
[19652] [2011-06-16 11:00:34 BST] db=gis,user=postgres LOG: unexpected EOF on client connection
[root@mapdata data]# tail -70 serverlog
[root@mapdata data]# tail -70 serverlog
or aeroway in (‘runway’,‘taxiway’)
or railway in (‘light_rail’,‘narrow_gauge’,‘funicular’,‘rail’,‘subway’,‘tram’,‘spur’,‘siding’,‘platform’,‘disused’,‘abandoned’,‘construction’)
order by z_order) as roads
WHERE “way” && SetSRID(‘BOX3D(606604.2564711582 6555239.545736713,958826.0828092532 6907461.372074807)’::box3d, 900913)
[18323] [2011-06-16 09:57:50 BST] db=gis,user=postgres LOG: unexpected EOF on client connection
[18379] [2011-06-16 10:27:04 BST] db=gis,user=postgres LOG: duration: 3098.769 ms execute : SELECT AsBinary(“way”) AS geom,“aeroway”,“amenity”,“landuse”,“leisure”,“military”,“natural”,“power”,“tourism” from
(select way,aeroway,amenity,landuse,leisure,man_made,military,“natural”,power,shop,tourism,name,religion
from planet_osm_polygon
where landuse is not null
or leisure is not null
or shop is not null
or aeroway in (‘apron’,‘aerodrome’)
or amenity in (‘parking’,‘university’,‘college’,‘school’,‘hospital’,‘kindergarten’,‘grave_yard’)
or military in (‘barracks’,‘danger_area’)
or “natural” in (‘field’,‘beach’,‘heath’,‘mud’,‘wood’)
or power in (‘station’,‘sub_station’)
or tourism in (‘attraction’,‘camp_site’,‘caravan_site’,‘picnic_site’,‘zoo’)
order by z_order,way_area desc
) as leisure
WHERE “way” && SetSRID(‘BOX3D(-119853.2603511568 6689768.715518625,-75825.5320588965 6733796.443810884)’::box3d, 900913)
[18379] [2011-06-16 10:27:09 BST] db=gis,user=postgres LOG: duration: 4797.601 ms execute : SELECT AsBinary(“way”) AS geom,“disused”,“name”,“waterway” from
(select way,waterway,disused,lock,name,
case when tunnel in (‘yes’,‘true’,‘1’) then ‘yes’::text else tunnel end as tunnel
from planet_osm_line
where waterway in (‘weir’,‘river’,‘canal’,‘derelict_canal’,‘stream’,‘drain’,‘ditch’)
and (bridge is null or bridge not in (‘yes’,‘true’,‘1’,‘aqueduct’))
order by z_order
) as water_lines WHERE “way” && SetSRID(‘BOX3D(-119853.2603511568 6689768.715518625,-75825.5320588965 6733796.443810884)’::box3d, 900913)
[18379] [2011-06-16 10:27:15 BST] db=gis,user=postgres LOG: duration: 1142.519 ms execute : SELECT AsBinary(“way”) AS geom,“railway” from
(select way,highway,railway,
case when tunnel in (‘yes’,‘true’,‘1’) then ‘yes’::text else tunnel end as tunnel
from planet_osm_roads
where highway is not null
or railway is not null
order by z_order
) as roads
WHERE “way” && SetSRID(‘BOX3D(-119853.2603511568 6689768.715518625,-75825.5320588965 6733796.443810884)’::box3d, 900913)
[19652] [2011-06-16 10:59:55 BST] db=gis,user=postgres LOG: duration: 163744.195 ms execute : SELECT AsBinary(“way”) AS geom,“landuse”,“military”,“natural” from
(select way,aeroway,amenity,landuse,leisure,man_made,military,“natural”,power,shop,tourism,name,religion
from planet_osm_polygon
where landuse is not null
or leisure is not null
or shop is not null
or aeroway in (‘apron’,‘aerodrome’)
or amenity in (‘parking’,‘university’,‘college’,‘school’,‘hospital’,‘kindergarten’,‘grave_yard’)
or military in (‘barracks’,‘danger_area’)
or “natural” in (‘field’,‘beach’,‘heath’,‘mud’,‘wood’)
or power in (‘station’,‘sub_station’)
or tourism in (‘attraction’,‘camp_site’,‘caravan_site’,‘picnic_site’,‘zoo’)
order by z_order,way_area desc
) as leisure
WHERE “way” && SetSRID(‘BOX3D(-39135.75848201033 6222585.598639626,665307.8941941757 6927029.251315814)’::box3d, 900913)
[19652] [2011-06-16 11:00:26 BST] db=gis,user=postgres LOG: could not send data to client: Broken pipe
[19652] [2011-06-16 11:00:26 BST] db=gis,user=postgres STATEMENT: SELECT AsBinary(“way”) AS geom,“landuse”,“natural”,“waterway” from
(select way,“natural”,waterway,landuse,name
from planet_osm_polygon
where waterway in (‘dock’,‘mill_pond’,‘riverbank’,‘canal’)
or landuse in (‘reservoir’,‘water’,‘basin’)
or “natural” in (‘lake’,‘water’,‘land’,‘marsh’,‘scrub’,‘wetland’,‘glacier’)
order by z_order,way_area desc
) as water_areas WHERE “way” && SetSRID(‘BOX3D(-39135.75848201033 6222585.598639626,665307.8941941757 6927029.251315814)’::box3d, 900913)
[19652] [2011-06-16 11:00:34 BST] db=gis,user=postgres LOG: duration: 27690.205 ms execute : SELECT AsBinary(“way”) AS geom,“landuse”,“natural”,“waterway” from
(select way,“natural”,waterway,landuse,name
from planet_osm_polygon
where waterway in (‘dock’,‘mill_pond’,‘riverbank’,‘canal’)
or landuse in (‘reservoir’,‘water’,‘basin’)
or “natural” in (‘lake’,‘water’,‘land’,‘marsh’,‘scrub’,‘wetland’,‘glacier’)
order by z_order,way_area desc
) as water_areas WHERE “way” && SetSRID(‘BOX3D(-39135.75848201033 6222585.598639626,665307.8941941757 6927029.251315814)’::box3d, 900913)
[19652] [2011-06-16 11:00:34 BST] db=gis,user=postgres LOG: unexpected EOF on client connection

==============================================

Wow rather too much information.

The bounding box you are querying and the scale you are using will never deliver good query performance. Neither the planet_osm tables nor the mapnik stylesheet are designed for rendering a large area including Belgium, London, the outskirts of Paris, and most of the Netherlands at this scale. Your bounding box approximates to this tile http://b.tile.openstreetmap.org/6/32/21.png. As you can see Mapnik on the main site renders very few features for this bounding box (largely roads from the planet_osm_roads table).

You queries include some detail information (e.g., picnic sites).

The main mapnik stylesheet is probably best regarded as optimised for mid-scale and large-scale maps.

SO, briefly, if this is what you want to do, you’ll have to live with the performance.