Total noob here. Appropriate apologies given, repeatedly, as needed, forever. I have been tasked with building a proof-of-concept Linux server using CentOS 7 and OpenStreetMap to do two things:
geocoding (address in, lat/lon out)
turn-by-turn driving directions
We want to get rid of some very outdated MapPoint software (no longer supported) and the non-Linux servers used to run it.
Data will be North America only. Our preference is to have everything run from the command line. No maps, just the absolute minimum of input and output through the software (with parsing as needed, of course) to get: address in, lat/lon out; two lat/lons in, turn-by-turn driving directions out. Automobile traffic only, fastest route.
From what I have read, it sounds like the most basic driving directions setup is to use OSRM as a library (libosrm). However, when I look for basic system requirements, I see daunting numbers like 175GB of RAM. But, it is not clear if this is for the basic library, or maps, or…?
For libosrm, what are the absolute minimum hardware requirements I would need to build a proof-of-concept machine? Can I get away with dramatically less RAM and have the software function just for minimal tests, even if it is dog-meat slow?
I have installed Nominatim on a concept box to use for geocoding. Is there a better barebones geocoding solution out there that will, eventually, run on a high-demand server? Again, command-line driven. Ins and outs. My assumption is that parsing large XML files is not the way to go on this (as in, a database is better), but I really don’t know.
Is there anything else you can point me to that will help? While I have seen many, many posts and pages, most are for maps, web servers, or APIs for existing services on servers run by others. We need to run our own software.
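For the "address in, lat/lon out" part, a local Nominatim install can be queried over HTTP from a script. This is only a minimal sketch: the base URL `http://localhost/nominatim` is an assumption about where the install is served, and `search_url`, `first_lat_lon` and `geocode` are hypothetical helper names. It relies on the Nominatim search endpoint returning a JSON list whose entries carry `lat` and `lon` as strings.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def search_url(base, address):
    """Build a Nominatim free-form search URL (base URL is an assumption)."""
    query = urlencode({"q": address, "format": "json", "limit": 1})
    return f"{base}/search?{query}"


def first_lat_lon(results):
    """Return (lat, lon) of the best match, or None if nothing matched."""
    if not results:
        return None
    # Nominatim returns the coordinates as strings.
    return float(results[0]["lat"]), float(results[0]["lon"])


def geocode(base, address, timeout=30):
    """Address in, (lat, lon) or None out. Performs one HTTP request."""
    with urlopen(search_url(base, address), timeout=timeout) as resp:
        return first_lat_lon(json.load(resp))
```

Calling `geocode("http://localhost/nominatim", "some address")` from a small command-line wrapper would give the bare ins and outs without any map or web front end.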
You can’t get away from memory requirements for building your OSRM routing model: it’s inherent in the whole architecture of the solution. Once built it does not require anything like the memory of the build phase. I believe some folk rent an AWS instance to do the build and then host the completed database themselves.
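To make the build-versus-serve split concrete, here is a rough sketch of the usual OSRM workflow: `osrm-extract` and `osrm-contract` are the memory-hungry build steps, and `osrm-routed` serves the finished files over HTTP. The profile path, the input file name and the helper names `build_commands`, `route_url` and `describe_steps` are assumptions for illustration; the commands are only assembled here, not run. Note that the OSRM route URL takes coordinates in lon,lat order.

```python
from urllib.parse import urlencode


def build_commands(pbf_path, profile="profiles/car.lua"):
    """Return the extract / contract / serve command lines for one extract."""
    suffix = ".osm.pbf"
    if not pbf_path.endswith(suffix):
        raise ValueError("expected an .osm.pbf file")
    base = pbf_path[: -len(suffix)] + ".osrm"
    return [
        ["osrm-extract", "-p", profile, pbf_path],  # heavy: parses the raw data
        ["osrm-contract", base],                    # heavy: builds the hierarchy
        ["osrm-routed", base],                      # light: serves the result
    ]


def route_url(server, origin, destination):
    """Build a route request; origin and destination are (lat, lon) tuples."""
    coords = ";".join(f"{lon:.6f},{lat:.6f}" for lat, lon in (origin, destination))
    options = urlencode({"steps": "true", "overview": "false"})
    return f"{server}/route/v1/driving/{coords}?{options}"


def describe_steps(response):
    """Turn a parsed OSRM route response into plain-text directions."""
    if response.get("code") != "Ok":
        raise ValueError(f"routing failed: {response.get('code')}")
    lines = []
    for leg in response["routes"][0]["legs"]:
        for step in leg["steps"]:
            maneuver = step["maneuver"]
            action = maneuver["type"]
            if "modifier" in maneuver:
                action += " " + maneuver["modifier"]
            name = step.get("name") or "unnamed road"
            lines.append(f"{action} on {name} ({step['distance']:.0f} m)")
    return lines
```

Each command list can be passed to `subprocess.run(cmd, check=True)`, so the first two could run on a rented large-memory machine and only the resulting `.osrm` files need to be copied to the smaller host that runs `osrm-routed`.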
I can’t really speak for or against Nominatim as the geocoder.
One other point is that complete addresses are, at best, patchy in OSM: most of the time this is not too critical, but long named highways without addresses may give suboptimal geocoding results (e.g., at the wrong end of a 2-3 mile highway). On the other hand adding that data to OSM automatically improves the geocoding.
US Accuracy: while we have not tested it extensively on a large data set, so far so good.
Processing: plenty fast for our application
US Accuracy: Really poor. I ran tests on it using several different sources for addresses. The last was 1807 US addresses. Of those, it failed to geocode 1151 (about 64%) completely. I know that OpenStreetMap has better EU data than US data. Could that be the issue? Something else? Any suggestions? Not looking good…
Response time: Slow. Best was 0.107739 seconds; worst was a whopping 12.237395 seconds. Average was 0.423390 seconds. Looking around forums, I see suggestions of using SSD drives. But, unless we can get around our preliminary accuracy problems, we likely won’t pursue speeding it up.
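For anyone wanting to repeat this kind of test against another geocoder, the failure rate and the best/worst/average timings can be collected with a small harness. This is a sketch under assumptions: `timed` and `summarize` are hypothetical names, and a geocoder is assumed to return `None` when it cannot match an address. For the figures above, 1151 failures out of 1807 works out to roughly 63.7%.

```python
import time
from statistics import mean


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def summarize(outcomes):
    """Summarize a list of (result_or_None, seconds) pairs."""
    total = len(outcomes)
    if total == 0:
        raise ValueError("no outcomes to summarize")
    failed = sum(1 for result, _ in outcomes if result is None)
    seconds = [elapsed for _, elapsed in outcomes]
    return {
        "total": total,
        "failed": failed,
        "failed_pct": round(100.0 * failed / total, 1),
        "best_s": min(seconds),
        "worst_s": max(seconds),
        "mean_s": mean(seconds),
    }
```

Running `summarize([timed(geocoder, a) for a in addresses])` with the same address list against each candidate gives numbers that can be compared directly.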
Anyone have any experience using Photon for US addresses? Or know of a website that is using it for US geocoding in a real-world production situation, other than Komoot (the European company that developed it, https://www.komoot.de/)? Using their front-end at http://photon.komoot.de/, when I throw in some addresses by hand from the failed list, it confirms my findings. In short, it looks like it is not going to work for us for the US, but I would appreciate any info on the experience of others. Thanks!
SSD is a must for Photon. I’ve run it on spinning rust, dedicated SSD and virtual machine SSD. Spinning rust response times are terrible. VM SSD is tolerable but inevitably has a startup delay for the first response. Dedicated SSD is much better.
OSM is generally not good at address-level geocoding the world over. It’s better in Europe than the US but still not great. Ultimately the reason is that surveying house numbers is really boring and you’ve got to be a bit broken in the head to want to spend your weekends doing that…
You could consider using the TIGER Geocoder that ships as part of PostGIS. Mapzen’s Search product uses OpenAddresses data (generally municipal/state) as well as OSM; it’s open source so you can install it yourself, though I don’t know of anyone who’s done that, or you can use their commercial hosted version. And of course there are other commercial services such as Mapbox.
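If you try the TIGER Geocoder, the query side is a single SQL function call. Below is a minimal sketch, assuming the PostGIS tiger geocoder extension is installed with TIGER data loaded, that its functions are on the search path, and that you pass in a DB-API cursor using `%s` placeholders (psycopg2 style). `geocode_tiger` and the `max_rating` cut-off are hypothetical; the geocoder's `rating` is lower for better matches, with 0 meaning an exact match.

```python
TIGER_SQL = """
SELECT g.rating,
       ST_Y(g.geomout) AS lat,
       ST_X(g.geomout) AS lon,
       pprint_addy(g.addy) AS matched
FROM geocode(%s, 1) AS g
"""


def geocode_tiger(cursor, address, max_rating=20):
    """Address in, dict with lat/lon out, or None if no acceptable match.

    max_rating is an illustrative threshold, not a recommended value.
    """
    cursor.execute(TIGER_SQL, (address,))
    row = cursor.fetchone()
    if row is None:
        return None
    rating, lat, lon, matched = row
    if rating > max_rating:
        return None
    return {"lat": float(lat), "lon": float(lon),
            "rating": rating, "matched": matched}
```

Because the function only needs a cursor, the same code can be driven from a command-line script that reads addresses from stdin and prints lat/lon pairs.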