OSM (Valhalla and Overpass) Docker services stop abruptly

Akhendra · February 2, 2024, 9:17am

Hello Team,
My OSM Docker services stop abruptly on my AWS EC2 machine. I host the Valhalla and Overpass services for the North America, Asia and Europe continents.
The EC2 machine configuration is 8 vCPU, 64 GB RAM, 1 x 1875 AWS Nitro SSD, up to 12 Gbps network bandwidth and up to 10 Gbps EBS bandwidth.

These services work fine when I request a small amount of GPS data (~1000-5000 GPS points), but when I send requests through batch jobs with more than 5000 GPS points, all the Docker services stop abruptly.

Could you please help me overcome this issue?
Do I need to increase my system configuration, or else is there any cleaning procedure which can clean the memory after every request to these services?

Regards,
Akhendra

ImreSamu · February 2, 2024, 11:32am

It’s difficult to determine the root of the problem with the information provided. More details are needed:

As mentioned, reviewing the logs is necessary.
It’s important to monitor where the bottleneck lies:
- is it ‘Valhalla’ or the ‘overpass services’?
What is meant by ‘requesting through batch jobs’?
- How many requests do the services receive in parallel?
- Is it through a custom script?
- Is it possible to introduce a delay?
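
To make the "delay" question concrete, here is a minimal sketch of how a batch script could split a large set of GPS points into smaller chunks and pause between requests. The names `chunked`, `send_in_chunks` and the `send` callback are hypothetical, and the chunk size and delay are placeholder values, not recommendations. Whether a trace can be split without changing the result depends on the request type, so treat this as an assumption to verify.

```python
import time
from typing import Callable, Iterator, List, Sequence, Tuple, TypeVar

Point = Tuple[float, float]  # (lat, lon); an assumed representation
R = TypeVar("R")


def chunked(points: Sequence[Point], size: int) -> Iterator[Sequence[Point]]:
    """Yield consecutive slices of at most `size` points."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(points), size):
        yield points[start:start + size]


def send_in_chunks(
    points: Sequence[Point],
    send: Callable[[Sequence[Point]], R],
    size: int = 1000,
    delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Call `send` once per chunk, sleeping `delay_s` between calls.

    `send` is a hypothetical callback that would wrap the actual HTTP
    request to the routing or Overpass service.
    """
    results: List[R] = []
    for index, chunk in enumerate(chunked(points, size)):
        if index > 0:
            sleep(delay_s)  # pause before every request except the first
        results.append(send(chunk))
    return results
```

The `sleep` parameter is only there so the pause can be replaced in a dry run; in a real job the default `time.sleep` would be used.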

Additional note:

Having 8 vCPUs allows for approximately 6-7 parallel job queries, ideally.

Either the job needs to be rewritten to avoid overloading the system (by limiting queries),
or more capacity is required for the server.
- (In a cloud environment, starting and testing 4-5 new EC2 instances is not difficult.)
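
As an illustration of "limiting queries" on the client side, the sketch below caps the number of requests in flight with a thread pool. The function name `run_bounded` and the `worker` callback are hypothetical, and the default of 6 workers only mirrors the rough 6-7 estimate above; the right value would have to be found by testing.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_bounded(
    jobs: Iterable[T],
    worker: Callable[[T], R],
    max_parallel: int = 6,
) -> List[R]:
    """Run `worker` over `jobs` with at most `max_parallel` calls at once.

    `worker` is a hypothetical function that would send one request to the
    service. Results come back in the same order as `jobs`.
    """
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # Executor.map preserves input order and re-raises worker errors.
        return list(pool.map(worker, jobs))
```

With this shape, the batch job can submit thousands of queries while the server never sees more than `max_parallel` of them at the same time.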

In summary, more information is required.

For testing purposes, I would place Valhalla on one machine and the Overpass Service on another. This way, it would be easier to identify where the problem lies and which machine needs upgrading.
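
One possible way to see which container failed, and whether it was killed for running out of memory, is to read the `State` section that `docker inspect` prints as JSON. This is only a sketch: the container names `valhalla` and `overpass` are assumptions, and nothing in this thread confirms that memory exhaustion is actually the cause.

```python
import json
import subprocess
from typing import Dict, List, Sequence


def summarize_state(inspect_json: str) -> List[Dict[str, object]]:
    """Extract name, exit code and OOM flag from `docker inspect` JSON."""
    summary: List[Dict[str, object]] = []
    for container in json.loads(inspect_json):
        state = container.get("State", {})
        summary.append({
            # Docker reports names with a leading slash, e.g. "/valhalla".
            "name": container.get("Name", "").lstrip("/"),
            "exit_code": state.get("ExitCode"),
            "oom_killed": state.get("OOMKilled", False),
        })
    return summary


def inspect_containers(names: Sequence[str]) -> List[Dict[str, object]]:
    """Run `docker inspect` for the given containers and summarize them."""
    completed = subprocess.run(
        ["docker", "inspect", *names],
        check=True,
        capture_output=True,
        text=True,
    )
    return summarize_state(completed.stdout)


if __name__ == "__main__":
    # Hypothetical container names; replace with the real ones.
    for row in inspect_containers(["valhalla", "overpass"]):
        print(row)
```

An exit code of 137 means the process received SIGKILL, which together with `oom_killed` being true would point at memory rather than at the request logic.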

If you cannot control the job requests, I would also consider implementing a ‘rate limit’. However, this would require operating some form of a proxy server in front, which complicates the service.
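
To show what a "rate limit" means in practice, here is a small token-bucket sketch. It is a generic illustration of the technique, not something taken from Valhalla or Overpass, and the class name `TokenBucket` and its parameters are hypothetical. A real deployment would more likely rely on the rate limiting built into the proxy server itself.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow roughly `rate_per_s` requests per second, with bursts up to `capacity`."""

    def __init__(
        self,
        rate_per_s: float,
        capacity: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if rate_per_s <= 0 or capacity <= 0:
            raise ValueError("rate_per_s and capacity must be positive")
        self.rate = rate_per_s
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Return True if a request may pass now, consuming one token."""
        now = self.clock()
        # Refill in proportion to the elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A request for which `allow()` returns False would be rejected or queued instead of being forwarded to the service. This version is not thread-safe; a lock around `allow()` would be needed if several threads share one bucket.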

EDIT:

“or else Is there any cleaning procedure which can clean the memory after every request to these services?”

These are specific questions, and if you don’t find answers here, you should post them in the GitHub repositories of Valhalla and Overpass Service.
My experience is mainly with Valhalla. Unfortunately, it’s very easy to overload and not easy to fine-tune. Furthermore, different versions of Valhalla can behave differently. It might be necessary to test even the latest ‘master’ version. It’s also advisable to review the Valhalla configuration. …

Akhendra · March 26, 2024, 5:19am

As you recommended, I have placed Valhalla and Overpass on separate EC2 machines with the same configuration. If I face the same issue again, I will know which one causes it.
Thank you, ImreSamu.