Gelöst. Man muss es in der Form -f=xml schreiben.
We’ve only tested on JDK 16 and above (We figured early adopters will use a recent JDK). It should work fine on JDK 13 (Some functionality may trigger a link exception). It definitely won’t work on pre-13 (at least not without modification). If you build from source, you will need at least 14.
Great idea! Is this an attribute of the osm
parent element? (The Wiki page claims “No official .xsd Schema exists”)
I’ve created this GitHub issue.
So:
<osm version="0.6" generator="geodesk gol/0.1.2" upload="false">
Or even upload = never (see #12731 (Add an option to completely prevent upload of a layer : e.g. "never" to upload=true/false) – JOSM)
This is all very JOSM specific I believe. I’m not sure if other apps also support it.
The conversion from WGS-84 to Mercator and back is lossless up to 100 nanodegrees (the resolution used by OSM). However, you will need to set --precision=7
(By default, output precision is rounded to 6 digits, accurate to 10cm).
Maybe default precision should be 7 instead?
A possible solution would be to use negative IDs for synthetic node IDs, but I’m concerned that some downstream consumers may choke on those (But still, as far as I understand the upload process, negative IDs would result in new nodes being created, so the outcome is still unacceptable).
If we implement the option to keep all node IDs, the produced XML will be suitable for editing. Until then: Prevent users from uploading XML with synthetic node IDs · Issue #69 · clarisma/gol-tool · GitHub
Right, negative ids would cause new objects to be created during upload. Another option might be a positive offset value which is a bit larger than the largest ids used for osm objects, but not too large that some tools would fail due to excessive memory requirements. Any attempt to upload such data would fail b/c the object ids don’t exist on the main osm database.
As an example, maybe take a look at extract and check-refs use too much RAM with numerically high node IDs · Issue #234 · osmcode/osmium-tool · GitHub
Das Dingens ist schon erschreckend schnell:
gol query germany "na[amenity='post_box']" -f=xml >box.osm
Retrieved 76.305 features in 960ms
Alle Briefkästen in Deutschland in weniger als einer Sekunde.
Jetzt müsste in box.osm nur noch etwas drin stehen… bei mir sind das 0 Bytes. (steht weiter oben im Faden auch schon mal sehe ich gerade).
Dann muss was falsch sein bei Dir.
22.042.474 box.osm
Ok, ich probier nochmal mit der Version 0.1.2 und einer neuen DB.
→ Ja, lag noch an der alten 0.1.0 Version, mit 0.1.2 passt das Ergebnis.
Ich schaue mir das gerade mal im Vergleich zu Overpass an. Gebe ich jeweils nur 1 CPU als Ressource frei (mit taskset --cpu-list 1
), brauchen beide lustigerweise ziemlich genau 1,6 Sekunden.
Ohne Einschränkung läuft gol dann in 670ms durch, benötigt dafür allerdings 3,2s an User Time (im Vergleich zu 1,65s für Overpass), ganz einfach weil im Mittel 3,8 CPUs belegt werden. Es scheint, dass gol im Moment noch etwas zu stark parallelisiert und für einen relativ kleinen Laufzeitgewinn dann überproportional CPU-Ressourcen verbraucht.
Interessant. Warum ist OP dann im Browser so lahm? Web-Overhead oder eher weil 100 Queries im Schnitt parallel laufen auf dem Server?
Wie immer ist viel los auf dem Server, etwas Web-Overhead / Latenz spielt sicherlich auch noch rein, ist aber eher von untergeordneter Bedeutung.
Und klar, es macht auch einen Unterschied, ob ich nur eine DB mit Germany teste, oder einen kompletten Planet, wo ich dann zusätzlich auf Deutschland filtern muss (ich habe oben jeweils Germany geladen und mit dieser Version getestet).
Most of this time is actually burned up by the formatter, which for OSM-XML is relatively slow (It has to build an object graph and re-create the untagged way-nodes). Ironically, the “simple” formatters (e.g. GeoJSON) are currently slower still because of this bug (The GeoJSON formatter is due to be replaced by a parallelized version anyway).
The query itself should run in < 200ms on a quadcore machine (You can approximate this with-f=count
).
Yes, the query engine is quite greedy and will use every core it can. This is probably overkill for a basic query like this, and will need some tuning. Parallelization pays off for spatial predicates (intersects, within, etc. – coming in 0.2, enabled as Preview now in 0.1.2) because the topological checks are CPU-heavy.
For reference purposes, I’m also posting a few runtimes for the “count” use case:
gol
Single CPU
/usr/bin/time -v taskset --cpu-list 1 bin/gol query germany "na[amenity=post_box]" -f=count --precision=7
75408
Retrieved 75.408 features in 346ms
Command being timed: "taskset --cpu-list 1 bin/gol query germany na[amenity=post_box] -f=count --precision=7"
User time (seconds): 0.58
System time (seconds): 0.07
Percent of CPU this job got: 96%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.67
Many CPUs
/usr/bin/time -v bin/gol query germany "na[amenity=post_box]" -f=count --precision=7
75408
Retrieved 75.408 features in 105ms
Command being timed: "bin/gol query germany na[amenity=post_box] -f=count --precision=7"
User time (seconds): 1.46
System time (seconds): 0.12
Percent of CPU this job got: 491%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.32
Overpass
Query: nw[amenity=post_box];out count;
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="Overpass API 0.7.59.120 (mmd) 84edf1af">
<note>The data included in this document is from www.openstreetmap.org. The data is made available under ODbL.</note>
<meta osm_base=""/>
<count id="0">
<tag k="nodes" v="75398"/>
<tag k="ways" v="10"/>
<tag k="relations" v="0"/>
<tag k="total" v="75408"/>
</count>
</osm>
Command being timed: "src/osm3s_query --db-dir=db"
User time (seconds): 0.34
System time (seconds): 0.04
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.39
Thanks, this is very helpful.
I continue to be blown away by how far the JVM has come. Quarter-second startup time, and that even includes setting up the gol
program itself and opening the database.
We’ve fretted about class loading, bytecode verification, bounds checks at every corner, and of course GC, but these things have essentially become non-issues (It does defer JIT compilation – as the name implies – which competes with query execution in the early phase).
Strong showing by Overpass as well. Is it using indexing for this type of query?
(By the way, are these 4 physical cores, or 2 cores hyper-threaded as 4?)
Feature Request : Would it be possible to add a link to the OSM object (like in Overpass Turbo) in the map view (-f=map)?
Good idea.
I’m leaning towards keeping the tooltip (vs. popup, which requires clicking instead of hovering), and instead make the feature clickable.