PTNA: news for Public Transport Network Analysis

NeatNit · February 22, 2025, 12:43am

How is from/to mapped in OSM right now for these routes?

ToniE · February 22, 2025, 12:46am

It seems that GTFS is wrong here with respect to these two operators. In reality (OSM) they are different.

NeatNit · February 22, 2025, 12:50am

For what it’s worth, for Israel I currently take the trip_headsign from trips.txt: “to” comes from trips with direction_id=0, and “from” comes from trips with direction_id=1. It’s not flawless, but it seems to match what’s already mapped in OSM in the cases I’ve checked. As I mentioned to you, PTv2 stipulates that from and to should be the names of the start and end stops of the route (at least, that’s my reading of Tag:route=bus - OpenStreetMap Wiki), but what’s mapped right now (and IMO more useful) is something more general, like “downtown” or the name of the destination city when it differs from the origin city.
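A minimal sketch of that trip_headsign approach, assuming the standard GTFS columns route_id, direction_id and trip_headsign in trips.txt. Picking the most common headsign per direction is an assumption made here for illustration, and the sample rows are invented:

```python
import csv
import io
from collections import Counter, defaultdict

def from_to_by_route(trips_file):
    """Guess 'from'/'to' per route_id from the trip_headsign values."""
    # route_id -> direction_id -> counter of headsigns seen
    headsigns = defaultdict(lambda: {'0': Counter(), '1': Counter()})
    for trip in csv.DictReader(trips_file):
        direction = trip.get('direction_id')
        headsign = trip.get('trip_headsign')
        if direction in ('0', '1') and headsign:
            headsigns[trip['route_id']][direction][headsign] += 1

    result = {}
    for route_id, by_direction in headsigns.items():
        # Assumption: the most frequent headsign wins.
        to = by_direction['0'].most_common(1)
        from_ = by_direction['1'].most_common(1)
        result[route_id] = {
            'to': to[0][0] if to else '',
            'from': from_[0][0] if from_ else '',
        }
    return result

# Invented sample data, not taken from any real feed.
sample = io.StringIO(
    "route_id,trip_id,direction_id,trip_headsign\n"
    "r1,t1,0,Downtown\n"
    "r1,t2,0,Downtown\n"
    "r1,t3,1,Airport\n"
)
from_to = from_to_by_route(sample)
print(from_to)
```

Routes that only have trips in one direction end up with an empty value for the other side, which is one of the ways this is not flawless.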

Well, who do you trust more?

Edit: I’ll just not add from and to for now, and let the scripts and PTNA complain about it. We can always deal with edge cases later.

I’ll see if I can start work on this tomorrow or early next week.

ToniE · February 22, 2025, 1:01am

Yep, those tiny pitfalls can also be fixed manually after an import. Updates are done once a month only.

One important thing to mention about your approach is that we automatically detect/import new routes and delete no longer existing ones. This remains as the added value even for those GTFS feeds where the route_id and route_short_name never change for a given bus, …

Time to go to bed for me, so long, good night!

NeatNit · February 22, 2025, 1:14am

I don’t think it’s reasonable to expect a human to go back every month and fix something, only for the importer to automatically undo their fix the next month.

At worst, a viable solution would be to pull the buses out of the filter and type them out manually:

1;bus;;from1;to1;...;rab-14-301-1
1;bus;;from2;to2;...;rab-14-810-1
@bus
@route_id!=rab-14-301-1
@route_id!=rab-14-810-1
# all other routes will go here...
@@

But then these routes will always appear at the top of the list, instead of their rightful place sorted among the others. Still, the lesser evil. Especially for route 1 which is supposed to be at the top anyway.

I should be doing that too - good night!

ToniE · February 22, 2025, 11:46am

GTFS of LU is CC-BY-4.0 which is, without further notes, not compatible with OSM licenses.

ToniE · February 22, 2025, 11:49am

Oops! I found a design issue/gap regarding the GTFS to CSV injection.

Pro: this is triggered whenever a new GTFS version is published on the server
Con: this is not triggered when the CSV data got enhanced/changed regarding ‘@…’ statements

NeatNit · February 22, 2025, 1:46pm

Right, that is an annoyance. If you can save routes.json (aka catalog.json for Israel) and run the CSV injection script (+ wiki upload if there are any differences) as a pre-processing step for analysis, that would solve the issue.

ToniE · February 22, 2025, 2:23pm

Good point! Manageable!

1. Reading the CSV data from the Wiki is done beforehand anyway.
2. Injecting data into this file is fine if e.g. IL-JM-Jerusalem.json is found in the working area.
3. Upload to the Wiki if there is a diff.
4. Start the analysis.
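The steps above could be sketched roughly like this. The function name preprocess and the inject and upload callables are hypothetical stand-ins for the real injection script and the Wiki upload, not PTNA’s actual interfaces:

```python
import os.path
import tempfile

def preprocess(csv_text, catalog_path, inject, upload):
    """Return the CSV text the analysis should run on.

    inject(csv_text, catalog_path) and upload(new_csv) are hypothetical
    callables standing in for the injection script and the Wiki upload.
    """
    if not os.path.isfile(catalog_path):
        return csv_text               # no catalog in the working area
    new_csv = inject(csv_text, catalog_path)
    if new_csv != csv_text:           # upload only if there is a diff
        upload(new_csv)
    return new_csv

# Demo with stubs: the fake injector just appends a line.
uploads = []

def fake_inject(csv_text, catalog_path):
    return csv_text + "1;bus\n"

with tempfile.TemporaryDirectory() as workdir:
    catalog = os.path.join(workdir, 'IL-JM-Jerusalem.json')
    # Catalog missing: the CSV passes through untouched.
    unchanged = preprocess("@bus\n@@\n", catalog, fake_inject, uploads.append)
    with open(catalog, 'w', encoding='utf-8') as f:
        f.write('[]')
    # Catalog present: inject, then upload because the text changed.
    changed = preprocess("@bus\n@@\n", catalog, fake_inject, uploads.append)

print(uploads)
```

The analysis would then start from the returned text, so a changed ‘@…’ statement takes effect even when no new GTFS version was published.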

NeatNit · February 22, 2025, 3:35pm

This was so much easier than Israel.

#!/usr/bin/env python3

import os.path
import csv
import json
import argparse
import re

def main(gtfs_dir, out_file):
    gtfs_routes = get_gtfs_routes(gtfs_dir)
    ptna_routes = convert_to_ptna_routes(gtfs_routes)
    sort_routes(ptna_routes)
    output_routes(ptna_routes, out_file)

def get_gtfs_routes(gtfs_dir):
    # Map agency_id -> agency_name so each route can carry its operator name.
    # 'utf-8-sig' also accepts files that start with a byte order mark.
    agency_file = os.path.join(gtfs_dir, 'agency.txt')
    with open(agency_file, newline='', encoding='utf-8-sig') as csvfile:
        reader = csv.DictReader(csvfile)
        agency_name_by_agency_id = {agency['agency_id']: agency['agency_name'] for agency in reader}

    routes_file = os.path.join(gtfs_dir, 'routes.txt')
    with open(routes_file, newline='', encoding='utf-8-sig') as csvfile:
        reader = csv.DictReader(csvfile)
        routes = list(reader)

    # Attach the operator name to every route.
    for route in routes:
        route['agency_name'] = agency_name_by_agency_id[route['agency_id']]

    return routes

def convert_to_ptna_routes(gtfs_routes):
    return [gtfs_route_to_ptna_route(route) for route in gtfs_routes]

def gtfs_route_to_ptna_route(route):
    # route_id,agency_id,route_short_name,route_long_name,route_type
    return {
        'ref': route['route_short_name'],
        'route_type': gtfs_route_type_to_osm_route_type(route['route_type']),
        'comment': route['route_long_name'],
        'operator': route['agency_name'],
        'gtfs_feed': 'DE-BW-bodo',
        'route_id': route['route_id'],
    }

def sort_routes(ptna_routes):
    def sort_key(route):
        ref = route['ref']
        ref_num = re.search(r'\d+(\.\d+)?', ref)
        ref_num = float(ref_num.group()) if ref_num else float('-inf')
        return (route['route_type'], ref_num, ref, route['route_id'])
    ptna_routes.sort(key=sort_key)

def output_routes(ptna_routes, out_file):
    with open(out_file, 'w', encoding='utf-8') as f:
        json.dump(ptna_routes, f, ensure_ascii=False)

def gtfs_route_type_to_osm_route_type(gtfs_route_type):
    # cover just the route types present in bodo.zip for now
    return {
        '0': 'tram',
        '2': 'train',
        '3': 'bus',
        '4': 'ferry'
    }[gtfs_route_type]

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("-g", "--gtfsdir", required=True, help="directory containing the unzipped GTFS files")
    parser.add_argument("-o", "--outfile", required=True, help="output file, json")
    args = parser.parse_args()
    main(args.gtfsdir, args.outfile)

Haven’t tried it with the injector yet, I will in a second.

There’s an issue though: these train routes have no route_short_name and thus no ref:

route_id	agency_id	route_long_name	route_type
obb-9-WB1W-1	obb-02	Westbahnhof - Hauptbahnhof	2
obb-10-CH3-1	obb-85	Bahnhof - Lindau-Reutin	2
obb-13-CH2-1	obb-01	Bahnhof - Bahnhof	2
obb-14-0A3-1	obb-01	Hauptbahnhof - Lindau-Reutin	2
ddb-90-751-1	ddb-V7	Hauptbahnhof - Singen (Hohentwiel)	2
ddb-90-O401-1	Default	Lindau-Insel - Feldkirch	2

ToniE · February 22, 2025, 3:57pm

I’d say: if an entry can’t be used, there will be no catalog entry for it.
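One way to read that rule, as a sketch only: drop routes without a route_short_name or with a route_type the converter doesn’t know, and keep a list of what was dropped. The helper name split_usable is made up here; the two sample rows use route_ids from this thread:

```python
def split_usable(gtfs_routes, known_types=('0', '2', '3', '4')):
    """Split GTFS route rows into (usable, skipped).

    A row counts as usable when it has a non-empty route_short_name
    and a route_type from known_types (the types handled so far).
    """
    usable, skipped = [], []
    for route in gtfs_routes:
        has_ref = bool((route.get('route_short_name') or '').strip())
        has_type = route.get('route_type') in known_types
        (usable if has_ref and has_type else skipped).append(route)
    return usable, skipped

rows = [
    {'route_id': 'rab-14-301-1', 'route_short_name': '1', 'route_type': '3'},
    {'route_id': 'obb-13-CH2-1', 'route_short_name': '', 'route_type': '2'},
]
usable, skipped = split_usable(rows)
print([r['route_id'] for r in usable])
print([r['route_id'] for r in skipped])
```

Keeping the skipped list around makes it easy to report which routes got no catalog entry.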

NeatNit · February 22, 2025, 4:05pm

Sure. What does PTNA do if there are CSV lines with missing type or ref?

NeatNit · February 22, 2025, 4:20pm

FYI, in addition to complaining about the missing ref, ptnaFillCsvData.py prints the following errors:

Error: the following routes can't be differentiated by PTNA:
{'ref': '1', 'route_type': 'bus', 'comment': 'stadtbus Ravensburg Weingarten Schmalegg - Hofgut - Huberesch - Ravensburg Bahnhof - Weingarten - Baienfurt - Baindt Marsweiler/Rathaus', 'operator': 'BW', 'gtfs_feed': 'DE-BW-bodo', 'route_id': 'rab-14-301-1'}
{'ref': '1', 'route_type': 'bus', 'comment': 'Ortsbus Meersburg BSB-Hafen-Töbele-Parkplatz Allmend-Daisendorf', 'operator': 'BW', 'gtfs_feed': 'DE-BW-bodo', 'route_id': 'rab-14-810-1'}
Error: the following routes can't be differentiated by PTNA:
{'ref': '626', 'route_type': 'bus', 'comment': 'BürgerMobil Meckenbeuren e. V.', 'operator': 'Strauss', 'gtfs_feed': 'DE-BW-bodo', 'route_id': 'bod-18-626a-1'}
{'ref': '626', 'route_type': 'bus', 'comment': 'BürgerMobil Meckenbeuren e. V.', 'operator': 'Strauss', 'gtfs_feed': 'DE-BW-bodo', 'route_id': 'bod-18-626b-1'}
{'ref': '626', 'route_type': 'bus', 'comment': 'BürgerMobil Meckenbeuren e. V.', 'operator': 'Strauss', 'gtfs_feed': 'DE-BW-bodo', 'route_id': 'bod-18-626c-1'}
Error: the following routes can't be differentiated by PTNA:
{'ref': '7382', 'route_type': 'bus', 'comment': 'Ahausen - Bermatingen - Markdorf (Schülerverkehr)', 'operator': 'BW', 'gtfs_feed': 'DE-BW-bodo', 'route_id': 'rab-14-189-1'}
{'ref': '7382', 'route_type': 'bus', 'comment': 'Meersburg Daisendorf Riedetsweiler Baitenhs. - Ahausen - Bermatingen - Markdorf', 'operator': 'BW', 'gtfs_feed': 'DE-BW-bodo', 'route_id': 'rab-14-382-1'}
Error: the following routes can't be differentiated by PTNA:
{'ref': 'BAT', 'route_type': 'ferry', 'comment': '', 'operator': 'Bodensee-Schiffsbetriebe GmbH', 'gtfs_feed': 'DE-BW-bodo', 'route_id': 'sbb-94-003Y-1'}
{'ref': 'BAT', 'route_type': 'ferry', 'comment': '', 'operator': 'Bodensee-Schiffsbetriebe GmbH', 'gtfs_feed': 'DE-BW-bodo', 'route_id': 'sbb-94-004Y-1'}
Error: the following routes can't be differentiated by PTNA:
{'ref': '', 'route_type': 'train', 'comment': 'Bahnhof - Bahnhof', 'operator': 'OEBB Personenverkehr AG Kundenservice', 'gtfs_feed': 'DE-BW-bodo', 'route_id': 'obb-13-CH2-1'}
{'ref': '', 'route_type': 'train', 'comment': 'Hauptbahnhof - Lindau-Reutin', 'operator': 'OEBB Personenverkehr AG Kundenservice', 'gtfs_feed': 'DE-BW-bodo', 'route_id': 'obb-14-0A3-1'}
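For reference, groups like the ones above can be found with a few lines of Python. This is only a sketch: the key (ref, route_type, operator) is a guess based on the output above, not necessarily what ptnaFillCsvData.py compares, and the dicts are shortened versions of the ones printed above:

```python
from collections import defaultdict

def find_ambiguous(ptna_routes, key_fields=('ref', 'route_type', 'operator')):
    """Return the groups of routes that share the same key_fields values."""
    groups = defaultdict(list)
    for route in ptna_routes:
        groups[tuple(route[field] for field in key_fields)].append(route)
    return [group for group in groups.values() if len(group) > 1]

routes = [
    {'ref': '1', 'route_type': 'bus', 'operator': 'BW', 'route_id': 'rab-14-301-1'},
    {'ref': '1', 'route_type': 'bus', 'operator': 'BW', 'route_id': 'rab-14-810-1'},
    {'ref': 'BAT', 'route_type': 'ferry', 'operator': 'Bodensee-Schiffsbetriebe GmbH', 'route_id': 'sbb-94-003Y-1'},
]
ambiguous = find_ambiguous(routes)
for group in ambiguous:
    print([route['route_id'] for route in group])
```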

ToniE · February 22, 2025, 4:24pm

It shows an “Error” line in the report with "CSV data includes errors. Line %s of Routes-Data. Contents: '%s'"

NeatNit · February 22, 2025, 4:25pm

Example output: Bodensee-Oberschwaben Verkehrsverbund/Analyse/DE-BW-bodo-Linien/Template Test - OpenStreetMap Wiki

ToniE · February 22, 2025, 4:25pm

Well, in those cases, the CSV needs to handle them separately by @filter!=… and so on

ToniE · February 22, 2025, 4:29pm

Yeah, even for me it is hard to sort them out by city, … local guys (heroes) may help here @mcliquid @skyper

NeatNit · February 22, 2025, 4:37pm

Currently the sorting goes by the first number in the ref. Refs with no number go first.

Sorting is done when creating the json file, in this case by this code:

def sort_routes(ptna_routes):
    def sort_key(route):
        ref = route['ref']
        ref_num = re.search(r'\d+(\.\d+)?', ref)
        ref_num = float(ref_num.group()) if ref_num else float('-inf')
        return (route['route_type'], ref_num, ref, route['route_id'])
    ptna_routes.sort(key=sort_key)

Or in layman’s terms: sort by type (bus, ferry, train, tram), then by the number in the ref, then by ref (alphabetically), then by route_id.
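A quick demonstration of that order, using the same sort key on a few sample refs. The refs and route_ids here are invented for illustration, apart from BAT:

```python
import re

def sort_key(route):
    ref = route['ref']
    # First number found anywhere in the ref; refs without one sort first.
    ref_num = re.search(r'\d+(\.\d+)?', ref)
    ref_num = float(ref_num.group()) if ref_num else float('-inf')
    return (route['route_type'], ref_num, ref, route['route_id'])

sample_routes = [
    {'ref': '10', 'route_type': 'bus', 'route_id': 'a'},
    {'ref': '2a', 'route_type': 'bus', 'route_id': 'b'},
    {'ref': 'BAT', 'route_type': 'ferry', 'route_id': 'c'},
    {'ref': 'N1', 'route_type': 'bus', 'route_id': 'd'},
    {'ref': '2', 'route_type': 'bus', 'route_id': 'e'},
    {'ref': 'X', 'route_type': 'bus', 'route_id': 'f'},
]
sample_routes.sort(key=sort_key)
sorted_refs = [route['ref'] for route in sample_routes]
print(sorted_refs)  # ['X', 'N1', '2', '2a', '10', 'BAT']
```

Note that N1 lands before 2 because only the number counts at that stage, and 10 comes after 2a because the comparison is numeric rather than alphabetical.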

Sorting by type should usually not matter because filters wouldn’t normally allow routes of different types, but it does matter if you do what I did in the example output with:

== Everything else
@!=
...
@@

mcliquid · February 22, 2025, 5:59pm

I really tried to read and understand the thread from above, but unfortunately it’s way too advanced for me in terms of technology and programming.
How can I help specifically?

In principle, it can of course be really mean and confusing, because in DE-BW-bodo (part of the four-country region around Lake Constance), a total of four countries, ten federal states and countless regional and local buses are mixed up.

NeatNit · February 22, 2025, 6:26pm

Take a look here: Public Transport Network Analysis/Syntax of CSV data - OpenStreetMap Wiki under the title “CSV data import definition: @”. Try to read it and get an idea for what’s going on. Please give me feedback on how easy or hard it is to read and understand, and let me know if you have any ideas to make it clearer. Feel free to edit the page if you feel confident about it.

If you’re not familiar with “regular expressions” I expect this to be the hardest bit to understand, in which case I advise you to not go too deep into it at this stage, just know that regular expressions are a way to determine whether a string is of the right pattern. Technical details on how that works can come later.
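To give a feel for it, here is a small Python sketch using two patterns from the filters further down in this post. It assumes the patterns behave like Python’s re.search, which may differ in details from how PTNA evaluates them, and some of the route_ids are made up:

```python
import re

# 'bod-17-001-1', 'bod-15-002-1' and 'rab-14-0310-1' are made-up examples.
route_ids = [
    'bod-17-001-1',
    'bod-18-0N3-1',
    'rab-14-031-1',
    'rab-14-0310-1',
    'bod-15-002-1',
]

def matching(pattern, values):
    """Return the values in which the pattern is found."""
    return [value for value in values if re.search(pattern, value)]

# ^ anchors the pattern at the start: anything beginning with bod-17.
starts_with_bod_17 = matching(r'^bod-17', route_ids)

# ^...$ with | means: exactly one of the listed alternatives.
exact_ids = matching(r'^(bod-18-0N3-1|rab-14-031-1)$', route_ids)

print(starts_with_bod_17)  # ['bod-17-001-1']
print(exact_ids)           # ['bod-18-0N3-1', 'rab-14-031-1']
```

The $ at the end is what keeps rab-14-0310-1 out of the second result.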

Anyway, once you hopefully understand the idea for the templates and filters (which may require a few rounds of you asking questions), we need to create import filters/templates in this page: DE-BW-bodo-Linien/Template Test

I’ve already made appropriate filters for the first few sections:

== Züge
@train
...
@@train
-

== Busse

=== Stadtverkehre

==== Friedrichshafen
@route_id~^bod-17
...
@@route_id~^bod-17
@route_id~^(bod-18-0N3-1|rab-14-031-1)$
...
@@route_id~^(bod-18-0N3-1|rab-14-031-1)$
-

==== Überlingen
@route_id~^bod-15
...
@@route_id~^bod-15
-

== Fähren
@ferry
...
@@ferry

But I don’t know how well I did. I can’t read German!

Edit: of course, use the actual DE-BW-bodo-Linien CSV page as a basis