Queensland Translink GTFS feed IDs/tagging modernisation/PTNA?

I noticed the ISO 3166-2 code used by the SEQ feed in https://wiki.openstreetmap.org/wiki/List_of_GTFS_feeds#Australia is incorrect: according to Wikipedia, at least, AU-QD was never the code for Queensland, it was initially AU-QL, then changed to AU-QLD in March 2004 (before I was born!), so it should be AU-QLD. The PTNA listing should be corrected too (I think @ToniE is who I should ping about this?).

More to the point (and why I’m writing this post and not simply correcting the wiki article), I noticed that this feed ID isn’t actually used anywhere on the map, and all the GTFS tagging in Queensland seems to be in the older, deprecated gtfs_id=* style instead of the newer gtfs:* namespace. This is a problem, because one stop may have multiple stop IDs from different feeds: an example is Morayfield bus station, platform 1, which is #321172 in the SEQ feed but #740169 in the Kilcoy feed. Currently this is tagged in note=* which is somewhat disappointing. I think the stop IDs might be unique across all the feeds, which might help with some sort of automated edit to update things; but I did find out that bus routes can have the same number as a route from a different feed/network/service area/whatever the right term is (not sure) (e.g. SEQ route 896 and Kilcoy route 896).
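The route-number overlap could be checked across all the feeds with something like the sketch below. It assumes the `route_short_name` values from each feed's routes.txt have already been read into lists. The function name and every sample number other than 896 are made up for illustration.

```python
from collections import defaultdict


def shared_route_numbers(feeds):
    """Return {route number: sorted feed names} for numbers used by more than one feed.

    `feeds` maps a feed name to an iterable of route_short_name values.
    """
    seen = defaultdict(set)
    for feed_name, numbers in feeds.items():
        for number in numbers:
            seen[number].add(feed_name)
    return {n: sorted(names) for n, names in seen.items() if len(names) > 1}


# Illustrative input only: 896 is the overlap noted above, the rest is hypothetical.
example = {
    "SEQ": ["651", "653", "896"],
    "Kilcoy": ["896", "9999"],
}
print(shared_route_numbers(example))  # {'896': ['Kilcoy', 'SEQ']}
```

The same function would work for stop IDs, which would be one way to test the guess that they are unique across feeds.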

I’d like to add some of the other feeds to that page, and I think changing the IDs to include Translink might be an improvement too (AU-QLD-Translink-SEQ, AU-QLD-Translink-Kilcoy, etc.). I think most of the other feed IDs on that page include the network or operator name, and it’d be more future-proof.

It might also be nice to have PTNA set up to analyse the routes to check their correctness/completeness, maybe with the CSV in Queensland/Public transport/Analysis or something like that. I think some of them are missing stops or route master relations and some stops themselves are missing too.

There is the slight wrinkle that we don’t yet have a waiver for these feeds, the data sources wiki article says “CC BY 4.0 - explicit permission - waiver sent 16th Mar (TMR), followed up 13th Mar, last update 24th Apr”, although “explicit permission” sounds good so maybe it’s not much of a wrinkle? And if the data is already in OSM and has been for a few years it seems reasonable to improve it either way.

If those feed IDs are what gets used (and if I’m understanding gtfs:* correctly), I think the earlier example would be tagged something like (just the main ones here):

public_transport=platform
local_ref=1
ref=321172;740169
gtfs:stop_id:AU-QLD-Translink-SEQ=321172
gtfs:stop_id:AU-QLD-Translink-Kilcoy=740169
route_ref=651;653;654;656;895;896;9999
network=Translink
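If this ends up being done as an automated or semi-automated edit, the per-feed tags could be generated from a mapping of feed ID to stop ID. A minimal sketch, assuming the feed IDs proposed above are the ones adopted (`stop_tags` is a hypothetical helper, not part of any existing tool):

```python
def stop_tags(stop_ids, local_ref=None):
    """Build namespaced GTFS tags for one platform.

    `stop_ids` maps a feed ID (e.g. "AU-QLD-Translink-SEQ") to the stop_id
    that feed uses for the platform.
    """
    tags = {"public_transport": "platform"}
    if local_ref is not None:
        tags["local_ref"] = str(local_ref)
    # ref=* keeps every ID, joined with ";" as in the example above.
    tags["ref"] = ";".join(stop_ids.values())
    for feed_id, stop_id in stop_ids.items():
        tags[f"gtfs:stop_id:{feed_id}"] = stop_id
    return tags


morayfield = stop_tags(
    {"AU-QLD-Translink-SEQ": "321172", "AU-QLD-Translink-Kilcoy": "740169"},
    local_ref=1,
)
for key, value in morayfield.items():
    print(f"{key}={value}")
```

It leaves out route_ref=* and network=*, which would need to come from elsewhere.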

Not sure how transport:zone=* should be tagged, for this stop it’s different for the two different service areas (zone 3 for SEQ, zone 1 for Kilcoy). Adding the feed suffix to something not in gtfs:* seems weird, but would probably work fine, and Taginfo lists no users so it (hopefully) wouldn’t break anything. I think some stops (maybe more for train routes) have multiple zones, although with the 50c fares the zone system seems somewhat vestigial now anyway. Or I could just do transport:zone=3;1 (or =1;3) I guess? route_ref also has this ambiguity but it should matter less there, since you should be able to determine which service area the referred route is part of from the route’s tags and/or the stop’s relation memberships.

Any thoughts?

I suppose some of the feed IDs might get a little long if they’re spelt out like that, like maybe AU-QLD-Translink-Magnetic-Island. Magnetic Island also has a separate feed for the ferry (for some reason) which would then be something like AU-QLD-Translink-Magnetic-Island-Ferry. The URLs for the feeds do have a three-letter code that might be better: those two would be AU-QLD-Translink-MAG and AU-QLD-Translink-MIF. Less verbose but maybe harder to parse?

That didn’t age well. That’s 16th March 2018. At the time they only provided me the waiver for the road centerline dataset at File:AU QLD TMR 2018-11-13 OSM Permission.pdf - OpenStreetMap Wiki. I then followed up again in 2021 on the same thread about a broader waiver but didn’t hear back.

I never really liked the permission email at Attribution/qld.data.gov.au explicit permission - OpenStreetMap Wiki, since to me it reads like they are saying OSM can use their data under the terms of the CC BY license. That’s already a given based on the license. We request permission because OSMF doesn’t agree (or at least has concerns) that we can use CC BY directly, so we ask the publisher to confirm that the two areas of concern around the compatibility of CC BY with OSM are waived/clarified. This is why we have the standard CC BY waivers from the OSMF LWG. Granted, those waivers didn’t exist at the time of the explicit permission we have, which is why I tried to obtain the waivers to confirm what we were assuming to be true: that there are no issues using these datasets in OSM.

Yep, that’s fixed now.

No problem, I can add several of them. I just need a few pieces of information:

  • which area(s) shall I use searching for PT route relations using overpass-api
  • what’s the name / what are the names (short and long form: possible ‘network’ values) of the transport authority/authorities for the analysis
    • Sydney: it’s any name
    • Adelaide: “Adelaide Metro” as long form, there is no short form
    • for Munich, DE: “Münchner Verkehrs- und Tarifverbund” and “MVV”
  • What are the URLs of those authorities
  • Where to store the CSV list in the OSM wiki
    • already suggested as starting with Queensland/Public transport/Analysis
  • optional: what shall I use as name for the PTNA analysis AU-QLD-* (short form of ‘network’ preferred)
    • for Munich, DE: DE-BY-MVV

Correct, PTNA takes that into account comparing a GTFS trip with an OSM route relation if the GTFS source is e.g. AU-QLD-Translink-SEQ or AU-QLD-Translink-Kilcoy.
Example: PTNA - Compare GTFS trip with OSM route; scroll down to around column 13, “gtfs:stop_id:DE-BY-MVV”.
N.B. “ref:IFOPT” is a European “Identifier For Objects in Public Transport”

“Short is beautiful”. Those three-letter-codes are much appreciated for PTNA.

I went and added all the feeds using the short IDs to that wiki article. There’s quite a lot of them, but the public transport coverage definitely gets a lot sparser towards the centre of the country, and most of the feeds probably only have a couple of routes; AU-QLD-Translink-SEQ is the big one.

Q111640183 (Greater Brisbane), Q1025962 (Cairns Regional), Q1005343 (Bundaberg Region), Q1066821 (Gympie Region), Q828357 (Sunshine Coast Regional), Q1069385 (Gold Coast City) is probably a decent start, but I’m not sure what the full list would be, I’m probably missing a few things here. Is Q36074 (Queensland) an option or is it broken down into smaller regions for performance reasons?

The network=* value would be Translink; I’ve never seen it shortened, so AU-QLD-Translink would make sense as the analysis ID. TransLink is used as well but is about 1/5 as common, and the plainly-cased one seems to be the one https://nsi.guide/ prefers, which I would agree with. The URL is https://translink.com.au/.

Thanks for the list. Using QLD as the search area shouldn’t cause a performance issue: the size of the search area is OK, the size of the found data is more relevant. I’d suggest having several small analysis configurations rather than a big one. Reason:

PTNA is able to manage a CSV list in which a ‘ref;bus’ pair is listed multiple times (DE-SN-VMS has 11 times 'ref'='A', for some even ‘operator’ is identical, …), but that would create a lot of work for the maintainers of the CSV. Managing a long, long list of routes is also more tedious, and local mappers in Cairns wouldn’t be interested in seeing buses from Brisbane, …

The advantage of a single QLD-wide analysis is that you see all PT mapped over there.

But, similar to DE-Bahnverkehr, US-Amtrak, EU-Eurotrains, EU-Flixbus, … we could also have a dedicated AU-QLD-Trains excluding all route=* values except “train” and at the same time exclude all route=train from the other analysis configs, focusing there on bus, subway, tram, ferry, …

I forgot to mention an advantage of a single AU-QLD-Translink, but we can check whether this can also be used with several smaller analysis configs.

If we have a GTFS feed AU-QLD-Translink which provides data for all PT in QLD, we can maintain the CSV data (keep the CSV data up-to-date with GTFS feed data) with the new feature “CSV data import from GTFS”, which has been available since February and was implemented by @NeatNit (section of Public Transport Network Analysis/Syntax of CSV data - OpenStreetMap Wiki).

Edit: Update link to “CSV data import from GTFS”

By https://translink.com.au/plan-your-journey/maps, SEQ would also need to include
Q1782552 (Logan City)
Q1631867 (Ipswich City)
Q1064985 (Moreton Bay Region)
Q1492782 (Redlands Region)
while TransLink has a number more state-wide, but it doesn’t cover all PT in Qld, so I don’t think

would work?

The Greater Brisbane boundary contains all these, I think.

Oh! I was not actually aware that there are other PT networks in QLD. Which ones are those?

I guess if using the QLD boundary isn’t going to work, since the routes have already been imported, I could probably load all the boundaries and routes and spend a while in JOSM figuring out a full list? Or find a GTFS→GeoJSON tool and use that instead of assuming no routes have been added in new areas since whenever they were imported.

Dunno?

They are certainly different areas on the TransLink map, with Brisbane SE stopping at Springwood, then changing to Logan after that?

Also don’t know about Scenic Rim, as there is a bus from the Logan area that goes to Beaudesert (250512-logan-network-map.pdf), although Beauie isn’t mentioned anywhere?

You may well be right!

I was thinking local networks in various smaller towns, but no, there don’t appear to be any, although they all have school buses going usually ~30k out of town to pick kids up for school.

There is also a Greyhound network https://www.greyhound.com.au/company/network-map but I guess that’s a separate kettle of fish / can of worms again! :crazy_face:

I managed to get the boundaries and routes loaded into JOSM. Roughly from north to south along the east coast, the wikidata=* of each boundary containing a route is Q1025962, Q1048446, Q1066884, Q1069401, Q951713, Q675722, Q201774, Q925987, Q1005343, Q1066809, Q1066821, Q1322992, Q828357, Q1065025, Q111640183 (contains Q1064985, Q917682, Q1492782, Q1631867, Q1782552), Q1067014, Q1065012, Q1069385, Q1065054, Q1065065.

Not sure whether one big analysis config or multiple smaller ones would be better. Some of the feeds overlap others and form a larger network (SEQ with a few others; MAG, MIF and TSV), which might be better to analyse all at once; but the rest are separated and could be split into other configs. There’s also a pretty large disparity between the number of routes in SEQ and in the rest of the feeds, maybe having a bunch of small configs would be tedious to maintain:

$ wc -l *_GTFS/routes.txt
     5 BOW_GTFS/routes.txt
    14 BUN_GTFS/routes.txt
    23 CNS_GTFS/routes.txt
     9 GLT_GTFS/routes.txt
    10 GYM_GTFS/routes.txt
     6 INN_GTFS/routes.txt
     4 KIL_GTFS/routes.txt
     2 MAG_GTFS/routes.txt
     3 MAL_GTFS/routes.txt
    15 MHB_GTFS/routes.txt
     2 MIF_GTFS/routes.txt
    23 MKY_GTFS/routes.txt
     3 NSI_GTFS/routes.txt
    20 RKY_GTFS/routes.txt
  1572 SEQ_GTFS/routes.txt
    17 TSV_GTFS/routes.txt
     9 TWB_GTFS/routes.txt
     6 WAR_GTFS/routes.txt
     3 WHT_GTFS/routes.txt
  1746 total
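One caveat on those numbers: wc -l counts lines, so each figure presumably includes the file's header row and the real route count is one lower per feed. A rough sketch that counts only data rows, assuming the feeds are unzipped into the same *_GTFS directories as above (`count_routes` is just an illustrative name):

```python
import csv
from pathlib import Path


def count_routes(routes_txt):
    """Count data rows in a GTFS routes.txt, not counting the header line."""
    # utf-8-sig tolerates a byte-order mark at the start of the file.
    with open(routes_txt, newline="", encoding="utf-8-sig") as handle:
        return sum(1 for _ in csv.DictReader(handle))


if __name__ == "__main__":
    # Assumes the feeds are unzipped into *_GTFS directories, as in the listing above.
    for path in sorted(Path(".").glob("*_GTFS/routes.txt")):
        print(f"{count_routes(path):6d} {path}")
```

The disparity between SEQ and the rest stays the same either way.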

Maybe we could just start with one big config and split it if it proves unwieldy? And if there’s no other PT in QLD (except maybe a few TfNSW routes that cross the QLD–NSW border), it should be fine to just use the QLD boundary (Q36074), if I’m understanding how this works correctly.
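For anyone wanting to look at the same data by hand, something like this builds an Overpass QL query for route relations inside a boundary selected by its wikidata=* tag. It is only my guess at the general shape of such a query, not PTNA's actual one; `route_query` and the timeout value are assumptions.

```python
def route_query(wikidata_id, timeout=180):
    """Return an Overpass QL query for route relations inside an area.

    The area is selected by the wikidata=* tag of its boundary relation.
    """
    return (
        f"[out:xml][timeout:{timeout}];\n"
        f'area["wikidata"="{wikidata_id}"]->.searchArea;\n'
        'relation["type"="route"](area.searchArea);\n'
        "out body;\n"
    )


# Q36074 is the Queensland boundary mentioned above.
print(route_query("Q36074"))
```

The output can be pasted into overpass turbo. Route master relations are not covered by this sketch and would need a further step.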

Yeah, let’s start with QLD and see what we get.

It’s 9:20 AM here, I’ll start in the afternoon. I guess you’re 8 hours ahead?

Here we are: AU-QLD-All. Size of XML data is < 40 MB, so pretty small.

There is no CSV list in the OSM Wiki yet (Queensland Routes). I’ll work on that, create a template and give examples of how to handle duplicate bus numbers like 110 and 111 in Brisbane and Cairns.

The current output lists what has been found, relation by relation. Route_master and route relations are not linked together yet; this will be done once their ‘ref’ appears in the CSV list.

Ideally, the CSV data will list all routes which exist in real life. The report will then show what’s missing in OSM and where OSM has artefacts.

afk for a while.

I’ve created a first version.

The ‘operator’ field of the CSV is important only if a bus number exists more than once.
To distinguish multiple occurrences of an entry such as “110;bus;”, the value of ‘operator’ in the CSV is compared with the corresponding value in the route/route_master relation(s).

...
=== Brisbane
110;bus;;;;Brisbane Transport
...
=== Cairns
110;bus;;;;Sunbus
...

@OrichalcumCosmonaut @Fizzie41
