PTNA: new feature - GTFS / OSM comparison

I’m actually more referring to the 4th column on the routes analysis page, where it shows a message saying to use PTv2 name '... ref ... : from => to'

and also on each trip page with the tagging suggestions for tagging the route relation.

Maybe simply a flag to set a custom name format? Or to simply disable that message?

Ah, I see the first here:

Itinéraire PTv2 : le nom 'name' devrait être sous la forme '... ref ... : from => to' (in English: PTv2 route: 'name' should be of the form '... ref ... : from => to')

‘name’ checking can be disabled per PTNA report - completely.

Second one:

name = Bus L2: Station Cégep de Lévis => Station de la Foresterie

Hmm, I would have to implement that per GTFS feed. The PTv2 proposal suggests using this form, so I thought it would be a good idea to suggest this in PTNA as well.

Hi @ToniE,

started to use the great PTNA tool recently. Really helpful!

For my routes, PTNA often complains about mismatch of names of stops, because PTNA seems to compare GTFS stop_name to OSM name. Wouldn’t it be more sensible to compare GTFS stop_name to OSM ref_name (if available, else compare to name)?

Example: bus 96 at Chemnitz (Germany)

My understanding (and tagging practice) is that name is the stop name shown outside at the bus stop sign and ref_name is the stop name used in timetables. Thus, I regularly set ref_name to the stop’s GTFS stop_name. In my regions, stop names in timetables are almost identical to GTFS names.

Best regards,

Jens

I would not do that in general, just because some of the ‘name’ values might be wrong.

So for the bus mentioned above I’d rather fix some things on the GTFS-stop_name

  • Wittgensdorf, ob Bf → Wittgensdorf, Oberer Bahnhof
  • Röhrsdorf, Gh Wildpark → Röhrsdorf, Gasthof Wildpark

and so on: OSM does not like abbreviations

PTNA has a mechanism to do that in a post-processing step for GTFS.
You can spot that wherever the GTFS ‘stop_name’ is followed by an icon: a mouse-over shows you the original ‘stop_name’. And: GTFS might be wrong!

For DE-BY-MVV, “W.-Heisenb.-W.” is expanded to “Werner-Heisenberg-Weg” for instance.
For AT-* “Abzw.” is expanded to “Abzweigung” whereas for DE-* this is expanded to “Abzweig” only.
That’s implemented as a long, long list (does not scale very well), but I think that’s the best way to handle most of the stuff. This would fix some issues on GTFS side.
For OSM, we’re more flexible to put “Steinbruchsweg” into ‘name’ and (with the name of the village) “Wittgensdorf, Steinbruchsweg” into ‘ref_name’. This way we have short names on the maps and sufficiently long names for queries to APIs of the authorities: VMS, bahn.de, …
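
A minimal sketch of how such a per-feed expansion list could work. This is not PTNA's code: the rule table layout, the function name `expand_stop_name` and the feed name `AT-Example` are assumptions for illustration; only the three rules themselves are taken from the examples above.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: (feed pattern, abbreviation, expansion).
# Only these three rules come from the thread; a real list would be much longer.
RULES = [
    ("DE-BY-MVV", "W.-Heisenb.-W.", "Werner-Heisenberg-Weg"),
    ("AT-*", "Abzw.", "Abzweigung"),
    ("DE-*", "Abzw.", "Abzweig"),
]


def expand_stop_name(feed, stop_name):
    """Return (expanded name, original name).

    The original is returned as well so that it can still be shown,
    for example on mouse-over.
    """
    expanded = stop_name
    for pattern, abbreviation, expansion in RULES:
        # Apply a rule only if the feed name matches the rule's pattern.
        if fnmatchcase(feed, pattern):
            expanded = expanded.replace(abbreviation, expansion)
    return expanded, stop_name


print(expand_stop_name("DE-BY-MVV", "W.-Heisenb.-W."))
print(expand_stop_name("AT-Example", "Abzw. Dorf"))
```

Keeping the original value next to the expanded one means the comparison can use the long form while the report still shows what the feed actually contained.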

BTW: comparing ‘name’ and ‘ref_name’ with ‘stop_name’ is quite “relaxed”.
It’s OK if one includes the other (‘name’ is included in ‘stop_name’ and vice versa, same for ‘ref_name’).
If ‘stop_name’ and ‘ref_name’ each include exactly one ‘,’, then it’s also OK if ‘stop_name’ = “A, B” and ‘ref_name’ = “B, A”.
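
The two rules can be sketched roughly like this. It is an illustration only, not PTNA's implementation: the function name `names_match` is made up, trimming whitespace around the comma is an assumption, and the swap rule is applied to whatever OSM value is passed in, although it is only described for ‘ref_name’.

```python
def names_match(stop_name, osm_value):
    """Relaxed comparison of a GTFS 'stop_name' with an OSM 'name' or 'ref_name'."""
    if not stop_name or not osm_value:
        return False
    # Rule 1: OK if one value includes the other.
    if stop_name in osm_value or osm_value in stop_name:
        return True
    # Rule 2: OK if both contain exactly one ',' and the two parts are swapped.
    if stop_name.count(",") == 1 and osm_value.count(",") == 1:
        first_a, second_a = (part.strip() for part in stop_name.split(","))
        first_b, second_b = (part.strip() for part in osm_value.split(","))
        return (first_a, second_a) == (second_b, first_b)
    return False


print(names_match("Wittgensdorf, Steinbruchsweg", "Steinbruchsweg"))
print(names_match("A, B", "B, A"))
print(names_match("Wittgensdorf, ob Bf", "Wittgensdorf, Oberer Bahnhof"))
```

The last call shows why the abbreviated ‘stop_name’ still produces a message: neither value includes the other and the parts are not swapped.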

Thanks for your detailed response.

Just to be sure I understand you right: You would prefer to put ‘Wittgensdorf, Oberer Bahnhof’ into ref_name instead of the GTFS stop_name ‘Wittgensdorf, ob Bf’, because OSM doesn’t like abbreviations? Even if this GTFS name is used at vms.de (‘ob Bf’) and bahn.de (‘ob Bahnhof’)?

Yes, already mailed VMS about the wrong final stop id/location in the example (bus 96). OSM is definitely correct here, GTFS is wrong. :slight_smile:

Thanks for the great tool. It would be helpful if the GTFS/OSM comparison tool could better handle the case where the same stop (/ platform) appears twice in direct succession in the GTFS data. For the VRR network, the duplicate stop entries seem to be a result of the arrival and departure time for that stop not being the same (e.g. when a bus is not scheduled to leave a stop at the same minute as it arrives). As these duplicate stops are not reflected in the OSM relations, all stops following the duplicate entry are off by one and thus displayed as errors when comparing trips (example). This makes it difficult to find the actual errors in the OSM data. Would it therefore be possible to improve the comparison tool in this respect?

This is a very strange example, indeed. I don’t know how to handle that.
I tried vms.de and bahn.de and entered “Wittgensdorf, Oberer Bahnhof” as starting point. Both seem to understand and accept that, although they insist on presenting the other, abbreviated form. But I guess local people don’t use “Wittgensdorf, ob Bf” when talking about the bus stop?

What can be done is: set the “weight” for ‘name’ comparison to zero, so that there is no checking at all for ‘stop_name’ vs ‘name’. I introduced that recently for CA-QC-STLevis and some FR-PAC* - they use ‘stop_name’ with capital characters.

There’s a group here in DE currently working on QA for stops, asking: how can we establish an official channel towards the authorities for error reporting. Open Transport Data Quality Meetup, next online meeting next Monday 2 PM CEST

I’d rather see this as a bug in GTFS data.
GTFS defines “arrival_time” and “departure_time” for the same trip, the same stop.
There is no need for two entries in this case.
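
To illustrate what a single entry per stop would look like, here is a small sketch that folds directly repeated stops into one row. It is not something PTNA does; the function name `merge_consecutive_stops` and the sample stop ids and times are made up, and the rows are assumed to be already ordered by ‘stop_sequence’.

```python
def merge_consecutive_stops(stop_times):
    """Fold directly repeated stops into one row per stop.

    stop_times: list of dicts with 'stop_id', 'arrival_time' and
    'departure_time', already sorted by 'stop_sequence'.
    """
    merged = []
    for row in stop_times:
        if merged and merged[-1]["stop_id"] == row["stop_id"]:
            # Same stop again: keep the first arrival, take the later departure.
            merged[-1]["departure_time"] = row["departure_time"]
        else:
            merged.append(dict(row))  # copy, so the input stays untouched
    return merged


# Hypothetical sample data: stop "B" appears twice in direct succession.
rows = [
    {"stop_id": "A", "arrival_time": "10:00:00", "departure_time": "10:00:00"},
    {"stop_id": "B", "arrival_time": "10:05:00", "departure_time": "10:05:00"},
    {"stop_id": "B", "arrival_time": "10:07:00", "departure_time": "10:07:00"},
    {"stop_id": "C", "arrival_time": "10:12:00", "departure_time": "10:12:00"},
]
for row in merge_consecutive_stops(rows):
    print(row)
```

Only directly adjacent repeats are folded, so a route that legitimately serves the same stop twice with other stops in between keeps both entries.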

I was thinking about that and have a solution that covers all three aspects:

  • introducing a field/a column in the ‘osm’ table of the GTFS feed
    • ‘name_suggestion’
      1. empty - disable ‘name’ suggestion
      2. ‘PTv2’ - according to PTv2, the current suggestion
      3. “Parcours {route_short_name} vers {trip_headsign}” as a custom suggestion (incl. ‘language’ awareness)

For the other issue, I will disable ‘name’ checking in the PTNA report

PTNA has a kind of post-processing of GTFS data.
Currently this is used to delete “false-positive” messages which have been detected by PTNA, so to speak: to fix PTNA’s bugs

  • example: “Suspicious 1st and 2nd stop: same name” can be deleted for Bus 212, 'cause I personally know that this is not a bug in GTFS data.

I would nevertheless refrain from using this post-processing to fix bugs in the original GTFS data like the above mentioned stop-sequence issue - although it would be possible to do so (via SQL commands).

I think NON-zero weight for name checking is okay and important. It’s more a problem with the GTFS data. I will try to attend OTDQM on Monday to see what we can do in the long run against such problems (thanks for pointing me to that group!).

@wolfy1339

I released both enhancements/fixes yesterday.

The suggestions for the ‘name’ of OSM route_master and route can be configured per GTFS feed (CA-QC-RTC and CA-QC-STLevis)

  • route_master_name_suggestion = ‘Parcour {route_short_name}’
  • route_name_suggestion = ‘Parcour {route_short_name} vers {trip_headsign}’

In curly brackets, anything can be configured which is a field/column name of GTFS’ “routes” or “trips” table (plus: ‘osm_vehicle’ ~ “Bus”, “Train”, …)
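
A rough sketch of how such a template could be filled in. This is not PTNA's code: the function name `suggest_name` and the sample field values are assumptions, and treating an empty template as "no suggestion" follows the idea described earlier in the thread.

```python
def suggest_name(template, fields):
    """Fill '{column}' placeholders from GTFS columns.

    Returns None if the template is empty (suggestion disabled) or if it
    refers to a column that is not available.
    """
    if not template:
        return None
    try:
        return template.format_map(fields)
    except KeyError:
        return None


# Hypothetical values for one trip; the keys are GTFS 'routes'/'trips'
# column names plus 'osm_vehicle'.
fields = {
    "route_short_name": "L2",
    "trip_headsign": "Station de la Foresterie",
    "osm_vehicle": "Bus",
}
print(suggest_name("Parcours {route_short_name}", fields))
print(suggest_name("Parcours {route_short_name} vers {trip_headsign}", fields))
print(suggest_name("", fields))
```

Returning None for an unknown column keeps a misspelled placeholder from producing a half-filled suggestion.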

There is actually a typo, it’s missing an s after Parcour

Thanks for implementing this feature!

Oops, thanks, fixed.

Is it possible to show an error when the trip id tagged on the OSM route relations doesn’t exist in the GTFS data anymore?

I seem to have come across some trip ids that have been changed in a more recent version of the GTFS data than what I had used locally when I did an import

Yeah, that’s actually one of the next topics on the to-do list.

Currently, this is only marked as “GTFS!” in the 3rd column, with “‘trip_id’ does not exist” on mouse-over, but it is worth mentioning in the “errors” column.
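
The check itself is a simple lookup. A minimal sketch, assuming the trip id is stored on the route relation under the tag key ‘gtfs:trip_id’ (an assumption here) and using made-up relation ids and trip ids; the function name `find_stale_trip_ids` is hypothetical as well.

```python
def find_stale_trip_ids(osm_relations, gtfs_trip_ids):
    """Return (relation id, trip_id) pairs whose trip_id is not in the feed.

    osm_relations: dict mapping relation id -> dict of OSM tags.
    gtfs_trip_ids: iterable of all 'trip_id' values of the current feed.
    """
    known = set(gtfs_trip_ids)
    stale = []
    for relation_id, tags in osm_relations.items():
        trip_id = tags.get("gtfs:trip_id")  # assumed tag key
        if trip_id is not None and trip_id not in known:
            stale.append((relation_id, trip_id))
    return stale


# Hypothetical sample data.
relations = {
    101: {"ref": "L2", "gtfs:trip_id": "trip-old"},
    102: {"ref": "L2", "gtfs:trip_id": "trip-1"},
    103: {"ref": "43E"},  # no trip id tagged, nothing to check
}
print(find_stale_trip_ids(relations, ["trip-1", "trip-2"]))
```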

There seems to be a problem with UTF-8 characters in the URLs for the route sketch (the one you get when you click on the route ref on the analysis page). The page seems to try to URL-encode them, but it doesn’t encode them properly and the result is garbage:

https://overpass-api.de/api/sketch-line?ref=ESQ&network=STLévis&operator=Société%20de%20transport%20de%20Lévis&bg=%23ffa400&fg=%23000000

Thanks for reporting this. I fixed that very old bug.

UTF-8 is still a mystery to me, regarding when to use encode() and when to use decode() or even both - in which order then?

edit: @wolfy1339 you might have to refresh your browser’s cache
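
For comparison, a sketch in Python (not the language PTNA is written in, and not the actual fix) of how the sketch-line URL from the report above can be built with correctly percent-encoded UTF-8. The function name `sketch_line_url` is made up; the parameter names are the ones visible in the reported URL. The general rule of thumb is to decode bytes once when reading, work with text internally, and encode exactly once when writing; garbage like the above typically appears when text is encoded twice.

```python
from urllib.parse import quote, urlencode


def sketch_line_url(ref, network, operator, bg, fg):
    """Build the sketch-line URL with UTF-8 percent-encoded parameters."""
    params = {"ref": ref, "network": network, "operator": operator, "bg": bg, "fg": fg}
    # urlencode() encodes text as UTF-8 by default; quote_via=quote makes
    # spaces come out as %20 instead of '+'.
    return "https://overpass-api.de/api/sketch-line?" + urlencode(params, quote_via=quote)


print(sketch_line_url("ESQ", "STLévis", "Société de transport de Lévis", "#ffa400", "#000000"))
```

In the result, ‘é’ becomes ‘%C3%A9’, i.e. the two UTF-8 bytes of the character, each percent-encoded once.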

@wolfy1339 and others:

Would you consider this special report for bus 43E in Québec as a false-positive message?

Itinéraire PTv2 : utilise des bretelles d'autoroute sans entrer sur une autoroute: Way 157685097 (iD, JOSM), Way 157685109 (iD, JOSM), Way 472644112 (iD, JOSM), Way 906916847 (iD, JOSM)

aka: “using motorway_link without entering a motorway” - ‘trunk’ seen as equivalent to ‘motorway’ here.

Actually, the bus passes this short ‘trunk’ some kilometres before

From across the “big lake” I would say these listed ways should be trunk_link instead of motorway_link, as the crossover is at node 1699454522. Therefore the message seems correct, but let’s wait and see what the locals say.

@ToniE: Two links are incorrect. There is a typo in the “43E” and the “bus” is wrong.
