Server sufficient?

Hi,

I’m trying to get a grip on making my own tilecache, for one thing to get the map I need (I would really like a map without all the roads, but maybe with navigation relevant to another means of transportation). For the first try I followed the Mapnik tutorial, which worked out as far as osm2pgsql running fine (it’s a Debian box). The hardware is not too bad. It’s an i7 of some sort (I think it’s one of the smaller ones, but it still has 8 cores), with 8GB of RAM and around 1TB of HDD space (SATA, soft RAID 1).

Now, it seemed to work fine (osm2pgsql was doing its magic) until I had these massive server loads of 170 and beyond. That was panic enough to just restart the box via the provider’s external admin interface before something happened to the disks. I assume this is caused by insufficient HDD capabilities. They are really basic in there, no fancy RAID or anything.

Now, if that is the case, I would like to know if the following box would be a good place to squeeze the enormous database into. I inherited a Dell PowerEdge 2600 with 2x 2.8 GHz HT Xeons and (so far) 1GB of RAM plus 4x U320 HDDs working in a hardware RAID-5. Nice machine, but it uses too much power to be a nice home server. Now … what I’d like to know is whether it would make sense to store the pgsql database on there (at home) and maybe use another computer to create the actual tiles. Storage space on the Dell machine is limited to ~250GB in the RAID. New HDDs are very expensive, and I really don’t want to get into that. That’s why I thought I could use my PowerMac Dual G5 2.5 with 5GB RAM and more than enough HDD space to create the tiles themselves, which would then be uploaded to the server.

Hope I didn’t lose you through all these machines and connections. Now, the Dell has only 1GB RAM. Would that do? It has a limit of 12GB, which would not even be that expensive to get on eBay. Yet I’m not in that much need of it. If it takes a few days longer because of the 1GB, that’s fine; if we are talking about weeks … well … then it might be worth the upgrade.

Please excuse my noobness on this topic, yet I have great difficulties getting the practical part of my project started. Thank you very much for any help (also procedure wise) you can give me.

cu
Roman

This question does not have a simple answer. I’ll just answer the parts that I know.

An osm2pgsql process on the world data using an i7 would definitely saturate the HDD for nearly the entire import process. I’m not familiar with Debian, and don’t know the meaning of server load=170, but osm2pgsql is likely to be able to use all resources available on the machine during import.

Since you mention that you don’t need highways, be sure to disable these imports by modifying the osm2pgsql import rules files. Similarly, if you don’t need lakes or rivers, be sure to disable these imports.

I think I remember some people performing the import with 1GB and slim mode, but you may be able to more than halve the import time by upgrading to at least 2 GB. This all varies based on how much data you can eliminate via the osm2pgsql import rules. A ‘classic import’ of the world with highways, rivers, and landuse boundaries would probably take 1 week+ with 1GB of RAM. It may take 1/2 to 1 day with much of that information excluded. That’s just a wild guess, not having done that exact process before.

Using the PowerMac would be a good option; if you aren’t concerned with highways, the OSM data would not change very often and thus it is worth the trouble of creating the tiles on a separate machine.

Hi,

thanks for this. I didn’t know that turning off the import of roads and the like would be possible. That is very valuable information. However, I’m not entirely sure where or how I’d set these limitations. Could you please give me a pointer?

My G5 refuses to properly install all the elements I need for tile generation (MacPorts f’s around), but I dug up the dual Xeon. Will give it a shot with the 1GB of RAM for now. The app needs developing anyway, and for testing I’m using the OSGeo tile server. That means even if it takes a week or longer I could live with that. Well … I really just worry about the electricity bill ;D

About the Unix system load: it basically tells you how much overload there is. On a single core, 1.00 means the system works at full capacity: there are just as many requests as the machine can handle. 2 means that there is a 100% overload, and so on. 170 should mean something like 17000% overload (less per core on a multi-core box, but still far too much). No wonder I was unable to ssh the box any more.
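To put the 170 into perspective, here is a minimal Python sketch (not from the thread; `load_per_core` is a made-up helper name) that divides a load average by the number of cores. Keep in mind that on Linux the load average also counts processes blocked waiting for disk I/O, so a saturated disk can push it very high even when the CPUs are mostly idle.

```python
import os


def load_per_core(load, cores):
    """Return the load average divided by the number of CPU cores."""
    return load / cores


# Figures mentioned in this thread: a load of 170 on a box with 8 cores.
print(load_per_core(170.0, 8))  # 21.25

# On a live Unix system, the same ratio for the 1-minute load average.
# os.getloadavg() is not available on every platform, hence the guards.
if hasattr(os, "getloadavg"):
    try:
        one_minute, _, _ = os.getloadavg()
        print(load_per_core(one_minute, os.cpu_count() or 1))
    except OSError:
        print("load average not available")
```

A value well above 1.0 per core means work is queueing up; whether it is queueing for CPU or for disk has to be checked separately (for example with top or iostat).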

This mapping biz consumes a lot of computer power, more than I ever had to provide for any other task yet. Anyway, if I do my tiles at home and then send them to the webserver I should be fine. If a dedicated box doesn’t respond to commands for 2 weeks I can live with it, but a webserver being down for a few days is not exactly what I can handle ;D

Another topic (in the Q&A forum) suggested Tirex. Looks very interesting as well. At least I learned some new and promising things today. Feels like I’m really close to actually getting my mapping setup to work.

cu
Roman

However, I’m not entirely sure where or how I’d set these limitations

The schema is described in the file default.style for osm2pgsql, or you may have specified a custom style sheet with the -S option on the command line. This is a text file which you can edit to comment out the types of features you don’t need.
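As a rough illustration of that edit, here is a small Python sketch that comments out style entries for unwanted tags. It is not part of osm2pgsql; `comment_out_tags` is a made-up helper, and the sample lines assume the usual four-column layout of default.style (OsmType, Tag, DataType, Flags), so check your own file before relying on it.

```python
def comment_out_tags(style_text, unwanted):
    """Prefix style lines whose tag column is in `unwanted` with '#'."""
    result = []
    for line in style_text.splitlines():
        fields = line.split()
        is_comment = line.lstrip().startswith("#")
        # Column 2 is assumed to hold the tag key (e.g. "highway").
        if not is_comment and len(fields) >= 2 and fields[1] in unwanted:
            result.append("#" + line)
        else:
            result.append(line)
    return "\n".join(result)


# Sample lines in the assumed default.style layout (illustrative only).
sample = """\
# OsmType  Tag       DataType  Flags
node,way   highway   text      linear
node,way   name      text      linear
node,way   waterway  text      polygon
"""

print(comment_out_tags(sample, {"highway", "waterway"}))
```

Commenting out a tag only removes that column and that selection criterion; as noted further down in this thread, an object carrying any other listed tag is still imported.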

No wonder I was unable to ssh the box any more.

The processes take so much RAM on my machine that it swaps all normal functions to disk, so that even an SSH shell takes a while to load back into memory.

This mapping biz consumes a lot of computer power

That’s for sure. I get the impression that PostgreSQL and PostGIS both use the OSM database as a showcase example of large DB handling, as well as a primary test case for development!

I do not believe it is that simple. The stylesheet actually selects the tags which will be imported, and if a way has at least one tag from the list it will be imported. You can comment out the highway tag, but if the way has any other tag included in the stylesheet it will still be selected into osm_lines. You can get rid of highways by commenting out all the tags usually used with highways, but that would have a side effect: no object would get name, ref etc. tags.

I don’t know if you solved your problem, but the specs of your server seem sufficient to me for an osm2pgsql import of planet.osm.

And if I read you correctly, the load of 170 occurred during the osm2pgsql import, and that shouldn’t happen, even with a slower machine. (During my imports the load stays around 2 or 3.) So I’d guess you exhausted your memory:

  • Either you forgot the --slim osm2pgsql switch (a planet import is impossible without it unless you have something around 128GB of RAM)
  • Or you set the in-memory node cache (-C XX) to something too big. I also have 8GB of memory and found that -C 1500 is a good compromise to avoid eating too much RAM for nothing and still increase import speed.
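Put together, an import invocation with both settings might look like the one this small Python sketch prints. The database name `gis` and the file name `planet.osm.bz2` are placeholders, not values from this thread; `--slim`, `-C`, `-S` and `-d` are regular osm2pgsql options. The script only builds and prints the command, it does not run it.

```python
import shlex


def build_import_command(osm_file, cache_mb=1500,
                         style="default.style", database="gis"):
    """Build an osm2pgsql argument list using slim mode and a capped node cache."""
    return [
        "osm2pgsql",
        "--slim",              # keep intermediate data in the database, not in RAM
        "-C", str(cache_mb),   # node cache size in MB
        "-S", style,           # style file listing the tags to import
        "-d", database,        # target PostgreSQL database (placeholder name)
        osm_file,
    ]


cmd = build_import_command("planet.osm.bz2")
print(" ".join(shlex.quote(part) for part in cmd))
```

To actually start the import you would hand `cmd` to `subprocess.run(cmd, check=True)` on the machine that holds the database.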

On a side note: depending on your future objectives, a software RAID 1 on two mechanical SATA disks might always be the bottleneck of your system. Not that it won’t be possible, but it might be very slow because of that. But again, that all depends on what you want to do and how fast.

By the way +1 on JRA

Hi all,

thanks for your valuable input. Indeed the slim switch was the missing element. I’m such a nutcase. Sorry about that. Anyway, I finally managed to download the latest osm file to my home network. Living on a 2Mbit line, that’s quite a task ;D

Modified the default.style and hope that the ~175 GB of space are enough to hold the hopefully reduced database. In fact, if the style file works the way predicted, most things should be missing. I really just left a few things in which might come in handy.

If it works I might even equip that box with 12GB of RAM to speed up the whole process, allowing more testing with different settings. Further, with slim mode working fine, I should be able to work remotely on a webserver again, saving me all that down- and uploading.

The soft RAID sure isn’t the fastest choice. They had a 3Ware hardware RAID controller option on these servers. Thought about it. Love these controllers. Yet the soft RAID seemed fast enough for the task of simple webserving at the time. Next time.

Great plus for this forum so far. Fast and friendly help.

Wish me luck with the current conversion and the so far untested rendering process.

cu
Roman

Fingers crossed! For an unmodified default.style you need around 100GB of free disk space for Europe only…

You don’t need to do it very often. Read this if you want to keep your database up to date: http://wiki.openstreetmap.org/wiki/Minutely_Mapnik

Well, I didn’t mean the soft RAID is the problem. I’m using md software RAID on Linux and that is working very well (but I’m using software RAID 0). The bottleneck will rather be that you have mechanical disks instead of SSDs.

You can check the osm2pgsql benchmarks if you are interested; mine are at the bottom of the page:
http://wiki.openstreetmap.org/wiki/Osm2pgsql_benchmarks

The talk of Frederik Ramm at SotM10 might be of interest with regard to what to expect from different hardware: http://wiki.openstreetmap.org/wiki/SotM_2010_session:Tuning_the_Mapnik_Rendering_Chain

Both the slides and the video of the talk are linked from that page.

If you want to import the entire world (rather than just a country or so), 1 GB is rather on the low side. It should be possible, but performance isn’t going to be great, and upgrading to 8 or 12 GB is likely to be fairly beneficial. It does depend on how fast you need it to be.

This 1 GB should suffice if you prefilter the .osm file before you import it via osm2pgsql into your database.
http://wiki.openstreetmap.org/wiki/Osmfilter
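To show what prefiltering means in practice, here is a toy Python sketch that removes every way carrying a given tag key from a small OSM XML snippet. This is not how osmfilter works internally, and `drop_ways_with_key` is a made-up helper; it loads the whole document into memory, so it is only suitable for tiny extracts, never for a planet file.

```python
import xml.etree.ElementTree as ET


def drop_ways_with_key(osm_xml, key):
    """Return OSM XML text without the <way> elements tagged with `key`."""
    root = ET.fromstring(osm_xml)
    # findall() returns a list, so removing elements while looping is safe.
    for way in root.findall("way"):
        if any(tag.get("k") == key for tag in way.findall("tag")):
            root.remove(way)
    return ET.tostring(root, encoding="unicode")


# Tiny hand-written sample in OSM XML form (illustrative data only).
sample = """<osm version="0.6">
  <node id="1" lat="0.0" lon="0.0"/>
  <way id="10"><nd ref="1"/><tag k="highway" v="residential"/></way>
  <way id="11"><nd ref="1"/><tag k="waterway" v="river"/></way>
</osm>"""

filtered = drop_ways_with_key(sample, "highway")
print(filtered)
```

The sketch leaves the nodes of the removed ways in place; a dedicated tool such as osmfilter is the better choice for real data.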