PTNA: news for Public Transport Network Analysis

I checked the documentation published by the MOT (it’s mostly in English, sorry I didn’t bring it up sooner) to find out. I also agree that it would be a good idea to track the data for a while and make sure they follow their own spec.

It seems route_id is permanent:

2.5.3. Each alternative (“ חלופה ”), gets its own route_id, and remains constant throughout the lifetime of the alternative.

2.5.4 For each such alternative, there are three parameters that are unique to the alternative: schedule, sequence of stops, and trip route.

2.5.5 Even when one of these data changes (schedule / stations / route) the record remains with the same route_id.

The same cannot be said for trip_id (emphasis mine):

  1. It should be noted that the trip_id field in this file is a running number only. The date field structure (xxxx_ddmmyy) is meaningless to the user in the files, but is intended to create a unique field only.
  2. It should also be emphasized that this field does not reflect the actual “trip ID” of the public transport operators in Israel. The information for the “trip ID” - the exact day of the week and hour of day, is detailed in the reference file found in the package: ‘TripToDate’.

There’s no mention of shape_id.
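
For tracking whether the feed follows its own spec, a minimal sketch could compare two snapshots of routes.txt. It assumes that route_desc identifies an alternative and that each snapshot has already been loaded into a dict of route_id to route_desc; the function name and the sample ids are made up for illustration.

```python
def changed_route_ids(old, new):
    """Return {route_desc: (old_ids, new_ids)} for alternatives whose route_id changed.

    old / new map route_id -> route_desc for two feed snapshots (assumed layout).
    """
    def by_desc(snapshot):
        grouped = {}
        for route_id, route_desc in snapshot.items():
            grouped.setdefault(route_desc, set()).add(route_id)
        return grouped

    old_by_desc, new_by_desc = by_desc(old), by_desc(new)
    # Only alternatives present in both snapshots can violate the "permanent id" rule.
    return {
        desc: (old_by_desc[desc], new_by_desc[desc])
        for desc in old_by_desc.keys() & new_by_desc.keys()
        if old_by_desc[desc] != new_by_desc[desc]
    }


# Hypothetical sample data: the second alternative got a new route_id.
old = {"100": "56002-1-#", "101": "56002-2-#"}
new = {"100": "56002-1-#", "999": "56002-2-#"}
print(changed_route_ids(old, new))
```

An empty result over several weeks of snapshots would suggest the route_id really is stable.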


Absolutely, agreed.

That’s very true. It requires some very peculiar handling, which might be too much to ask of you (and generally inappropriate IMO to ask from a global maintainer). I might look into handling it myself, if you agree. Or at least taking part in maintaining the code that handles it, should any issues pop up in the future.

Well, it looks fantastic :slight_smile: I’d love to hear what you think the most useful parts of PTNA are. It might be something I didn’t even find yet. I guess the route validation and error output easily takes the cake, but other than that?

Would that work? Can we try it now? I added it just now for line 2 of Metronit (bus) on the wiki. I hope I did it right, I went by the field order defined in the comments there which is different from what you wrote above.

I’m less interested in adding it via the route relations because I want to add all routes to PTNA, even those that aren’t mapped yet. (How does PTNA handle routes that are documented but not mapped?)

Oh, so it shouldn’t work yet until you add that code. Well, I say go for it! But it seems to me like id1|id2 makes more sense than "id1;id2" - it should match how ref multiple values are allowed, which is also documented in the comments of the CSV.

I see, amazing!

That’s absolutely great.

That’s OK, trip_id is relevant if you add this to the route-relation. So let’s skip that.

OK, I just mentioned that for completeness: shape_id can be used in route-relations.
Shapes define the path but do not say anything about the stops.

I have this in the pipeline: the comparison on the map, together with the “score”, will also be implemented in the code that creates the analysis report, which will then print the “score”. If the score changes overnight, one could look into that by comparing on the map, …

Did it and it works, 'cause PTNA can handle more than one feed/route_id per route_master:

  • RE22;train;;;;DB;"CH-Alle;DE-BW-VAG";"12345;re22-24-1"
    • data for RegionalExpress 22 train (example) can be found in Swiss and German GTFS feeds with different route_id - the train crosses the border between the states.
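
To illustrate how such a line splits up, here is a rough sketch using Python's csv module. The field names are my assumption for illustration only (the authoritative order is the one in the comments of the CSV), and pairing feeds with route_ids by position is also an assumption, not a statement about PTNA's code.

```python
import csv
import io

# Assumed field names; check the comments in the PTNA CSV for the real order.
FIELDS = ["ref", "type", "comment", "from", "to", "operator", "gtfs_feed", "route_id"]


def parse_route_line(line):
    """Split one ';'-separated CSV line, honouring double-quoted fields."""
    row = next(csv.reader(io.StringIO(line), delimiter=";", quotechar='"'))
    entry = dict(zip(FIELDS, row))
    feeds = entry["gtfs_feed"].split(";")
    route_ids = entry["route_id"].split(";")
    # Assumption: the n-th feed belongs to the n-th route_id.
    entry["gtfs_pairs"] = list(zip(feeds, route_ids))
    return entry


entry = parse_route_line('RE22;train;;;;DB;"CH-Alle;DE-BW-VAG";"12345;re22-24-1"')
print(entry["gtfs_pairs"])
```

The quotes are what keep the inner ';' from being read as a field separator.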

But: you will see “GTFS and the icon” 4 times now - that’s not a “one-click solution”, more a temporary work-around. So I will go ahead implementing something with “id1|id2” rather than “id1;id2” to have a one-click solution. That’ll be a change in the web-site code rather than in the analysis report code.

P.S. I could use & instead of | but that would be irritating/strange (for humans) as the value is passed as part of a URL: “…?feed=IL-MOT&route_id=id1%26id2”
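
A quick way to see the difference between the two separators, sketched with Python's urllib.parse (the helper name compare_query is made up; the parameter names feed and route_id are taken from the URL above):

```python
from urllib.parse import parse_qs, urlencode


def compare_query(feed, route_ids, separator="|"):
    """Build the query string with all route_ids joined into one parameter."""
    return urlencode({"feed": feed, "route_id": separator.join(route_ids)})


with_pipe = compare_query("IL-MOT", ["id1", "id2"])
with_amp = compare_query("IL-MOT", ["id1", "id2"], separator="&")
print(with_pipe)
print(with_amp)

# Decoding gives the joined value back, ready to be split on the separator.
print(parse_qs(with_pipe)["route_id"][0].split("|"))
```

A strict encoder escapes both characters, so the choice is mostly about what humans see when the URL is displayed unescaped.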

Hurray! PTNA - Compare GTFS trip with OSM route

I will have to look into that stop_id mismatch - it’s probably because it’s in a central station (stop_area), those are not properly imported at the moment because of the awkwardness of the GTFS data for them: all child stops have the same position as the parent_station. I have an idea for how to remedy this but that’s not done yet.

It complains about (e.g.): Missing route for ‘ref’=‘236’ and ‘route’=‘bus’

I just added analysis of public-transport to ptna

… there will be more to come over the next few days.

@NeatNit

@NeatNit

That’s not true: single feed but multiple route_ids are supported.

See: Haifa bus 2

I did the necessary changes and it now shows one icon and analyses the listed route_ids (‘;’ separated !) as if it were a single route_id.

I just added gtfs and public-transport analysis to ptna

PTNA for OSM data

A compilation of GTFS data for the region Normandie is available

I just added four more public-transport analyses to ptna

Edit: having a mix of left-to-right and right-to-left character writing has strange effects:

  • “Route has not matching ‘ref’ = ‘1א’: 12821500 (id, JOSM)” should read
  • “Route has not matching ‘ref’ = ‘1א’:” followed by “12821500 (id, JOSM)”

Any ideas how to solve that?

I’m glad you asked! Wrap each uncontrolled value with a <bdi> tag, a.k.a. bidirectional-isolate. That should fix it. If you already have an HTML element wrapping the value (which you currently don’t, but maybe in the future), the same can be achieved with dir="auto" on the HTML element, or with CSS using unicode-bidi: isolate;
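
As a rough illustration of the wrapping (in Python just for brevity, PTNA's report generator is a different code base; the function names are made up):

```python
from html import escape


def isolate(value):
    """Escape an uncontrolled value and wrap it in <bdi> for bidi isolation."""
    return f"<bdi>{escape(value)}</bdi>"


def ref_mismatch_message(ref, relation_id):
    # Only the uncontrolled OSM value is isolated; the fixed text stays as it is.
    return f"Route has not matching 'ref' = '{isolate(ref)}': {relation_id} (id, JOSM)"


print(ref_mismatch_message("1א", 12821500))
```

With the value isolated, the digits of the relation id should no longer be pulled into the right-to-left run.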

Thank you so much for your work on this! I want to look into importing the CSV files from GTFS. Do you have any code to start with on that?

Taking a closer look, the route_desc field can be used to identify which route_ids belong to the same route (i.e. should be one route_master). This is explained in the documentation, essentially route_desc is in the format “catalog_number-direction-variant”, e.g. “56002-1-#”. This makes it fairly easy to translate to PTNA or OSM.
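
A small sketch of that grouping, assuming route_desc always has exactly the three dash-separated parts described above (the route_ids in the sample are invented):

```python
from collections import defaultdict


def parse_route_desc(route_desc):
    """Split 'catalog_number-direction-variant' into its three parts."""
    catalog_number, direction, variant = route_desc.split("-", 2)
    return catalog_number, direction, variant


def group_by_catalog_number(routes):
    """routes: iterable of (route_id, route_desc) pairs."""
    groups = defaultdict(list)
    for route_id, route_desc in routes:
        catalog_number, _, _ = parse_route_desc(route_desc)
        groups[catalog_number].append(route_id)
    return dict(groups)


sample = [("1001", "56002-1-#"), ("1002", "56002-2-#"), ("2001", "10123-1-0")]
print(group_by_catalog_number(sample))
```

Each key would then correspond to one route_master, with its route_ids as the variants.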

Thanks for this! I did this in CSS and will wrap uncontrolled values in <span class="foreign"></span>. I think there’s much to do now - step-by-step.

PTNA’s GTFS data is stored in an sqlite3 db.

So sqlite3 xxx.db 'SELECT route_id, route_type, route_desc, route_short_name, agency_name, agency_id FROM routes JOIN agency ...;' should do the basics, but …

  • GTFS route_type needs conversion to OSM route=*: starting @ line 3648 in gtfs.php
  • you can leave ‘from’ and ‘to’ in the CSV data ‘blank’
  • GTFS route_desc can be used, as you describe, to group several route_ids to a single CSV entry
  • since an individual GTFS route_short_name (OSM ‘ref’) may appear multiple times in GTFS the CSV ‘operator’ value is of importance
    • in those cases, PTNA compares the CSV ‘operator’ with OSM ‘operator’ values of routes/route_masters to distinguish between e.g. different bus ‘1’ in different areas
    • if this still results in identical CSV entries, CSV ‘from’ and ‘to’ must be set and will be compared with OSM ‘from’ and ‘to’
    • DE-SN-VMS has the bus ‘ref’ = ‘A’ 11 times in different cities/villages, some of them with identical ‘operator’ values; it took a while to handle that correctly in PTNA

Edit: fix SELECT …
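
To try the idea without a real PTNA database, here is a self-contained sketch against an in-memory sqlite3 db. The table layout simply mirrors the GTFS files (routes.txt, agency.txt) with the GTFS column name route_desc and is an assumption about the schema; the route_type mapping is a simplified stand-in, not the conversion in gtfs.php, and the sample rows are invented.

```python
import sqlite3

# Simplified GTFS route_type -> OSM route=* mapping (assumption, not PTNA's table).
GTFS_TO_OSM_ROUTE = {
    0: "tram", 1: "subway", 2: "train", 3: "bus",
    4: "ferry", 7: "funicular", 11: "trolleybus", 12: "monorail",
}


def routes_with_agency(conn):
    """Join routes with agency so that the operator name is available per route."""
    query = """
        SELECT r.route_id, r.route_type, r.route_desc, r.route_short_name,
               a.agency_name, a.agency_id
        FROM routes AS r
        JOIN agency AS a ON a.agency_id = r.agency_id
        ORDER BY r.route_id
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agency (agency_id TEXT, agency_name TEXT);
    CREATE TABLE routes (route_id TEXT, agency_id TEXT, route_short_name TEXT,
                         route_desc TEXT, route_type INTEGER);
    INSERT INTO agency VALUES ('3', 'Example Operator');
    INSERT INTO routes VALUES ('1001', '3', '2', '56002-1-#', 3);
""")

rows = routes_with_agency(conn)
osm_routes = [GTFS_TO_OSM_ROUTE.get(row[1], "unknown") for row in rows]
print(rows, osm_routes)
```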

Do note the warning in the MDN page: Warning: This property is intended for Document Type Definition (DTD) designers. Web designers and similar authors should not override it.

I’m not too sure what this means, but it sounds like the other two options are better. If you think about it, this has more to do with markup than with styling, so it shouldn’t be done through CSS. You’re going to have to change the HTML anyway, might as well add dir="auto", no?

That said, do it however you want!

Right… Well, I never used php before, plus I don’t know where to even start in terms of integrating with your workflow/database/etc. So I think if I do code this up I’m probably going to do it from scratch. And probably in Python :stuck_out_tongue: Edit: but if you provide a starting point (i.e. “YOUR CODE GOES HERE”) I will prefer that. I’m willing to learn php for this… despite the horror stories I’ve heard.

You know, that reminds me. You’re not going to like this…

In the whole country there is one funicular, and one gondola. They are both in Haifa. However, the MOT in their infinite wisdom list both of them with route_type 5 = cable tram, which is wrong for both of them.

I should maybe ask them why they did it this way… But I don’t expect them to have a good answer. I extra don’t expect them to fix it. There are probably a bunch of systems relying on this data by now.

Ahh, I got it! I’ll take <span class="foreign" dir="auto"> ... </span> and will currently not add any CSS ‘.foreign’ or ‘:dir’

No need to do that. Python is pretty fine. It’ll be a stand-alone script which takes some command line parameters like ‘DB-name’, ‘template location’, … and writes the CSV data to STDOUT.
You can add the script to the ‘bin’ folder of the ‘gtfs’ repo @ GitHub.
I’ll integrate that at an appropriate place during the GTFS import workflow. Will the script create 15 independent CSV data sets for the 15 sub-districts (by calling it 15 times with different parameters) or a single one which can be split into 15 pieces using ‘csplit’?

PTNA could take care of uploading that to the OSM wiki using ptna-wiki-page.pl

Good to know: PTNA’s GTFS import has a post-processing step, individual for each feed. We can fix that using SQL statements.

This could also be a good place to start the GTFS-to-CSV conversion and upload the results to the OSM wiki.

Okay, tell you what: I’ll write it on my end by parsing the GTFS files from scratch. Then when I get satisfactory output, we can strip out the GTFS parsing code and replace it with the appropriate SQL queries.

I think it’ll be called 15 times with different parameters.

Only thing I’m slightly concerned about is how to decide which routes belong to each region. Essentially I think this should be done with an Overpass query to select stops in that area, then select only the routes that pass through these stops. Overpass’s area is a true life-saver!

My import has already filled in gtfs:stop_id:IL-MOT on all stops in the country… but while I intend my import updates to keep going for years, it might be better to use ref and compare with stop_code as a safer bet.

E.g. the following query returns a list of refs (stop_codes) in Haifa:

[out:csv(ref)];
area[wikidata=Q5423411];
node(area)[highway=bus_stop][ref];
out geom;

Then it’s just a matter of processing the GTFS data to connect the stop_codes to routes, and output those routes according to some template.
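
A sketch of the two ends of that step, without the network part: building the query for a given area and reading the refs from the CSV response. It uses the `ref;false` form of `out:csv` to suppress the header row; the function names are made up, and actually sending the query to an Overpass endpoint is left out on purpose.

```python
def build_stop_code_query(area_wikidata_id):
    """Overpass QL query returning one bus stop ref per line, without a header."""
    return (
        "[out:csv(ref;false)];"
        f"area[wikidata={area_wikidata_id}];"
        "node(area)[highway=bus_stop][ref];"
        "out;"
    )


def parse_stop_codes(response_text):
    """Turn the CSV response into a sorted list of unique stop_codes."""
    return sorted({line.strip() for line in response_text.splitlines() if line.strip()})


print(build_stop_code_query("Q5423411"))
# Invented response text, only to show the parsing.
print(parse_stop_codes("40001\n40002\n\n40001\n"))
```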

Should be fun :slight_smile:

Edit:

For what it’s worth:
agency_id 20 (route_desc 93001-*-0) is the funicular
agency_id 33 (route_desc 57010-*-0) is the cable car (I think it’s a gondola but don’t quote me on that)
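
If the feed-specific post-processing is the place to correct this, the SQL could look roughly like the sketch below. GTFS defines route_type 6 as aerial lift and 7 as funicular; the table layout and the agency_id being stored as text are assumptions, and whether agency 33 really is a gondola is unconfirmed, as said above.

```python
import sqlite3


def fix_haifa_route_types(conn):
    """Re-type the two Haifa lines that the feed lists as route_type 5."""
    # agency_id 20: funicular -> GTFS route_type 7
    conn.execute("UPDATE routes SET route_type = 7 WHERE agency_id = '20' AND route_type = 5")
    # agency_id 33: cable car, assumed to be an aerial lift -> GTFS route_type 6
    conn.execute("UPDATE routes SET route_type = 6 WHERE agency_id = '33' AND route_type = 5")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE routes (route_id TEXT, agency_id TEXT, route_desc TEXT, route_type INTEGER);
    INSERT INTO routes VALUES ('a', '20', '93001-1-0', 5);
    INSERT INTO routes VALUES ('b', '33', '57010-1-0', 5);
    INSERT INTO routes VALUES ('c', '3', '56002-1-#', 3);
""")
fix_haifa_route_types(conn)
result = conn.execute("SELECT route_id, route_type FROM routes ORDER BY route_id").fetchall()
print(result)
```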

I’m fine with that. That’ll be a learning phase anyway. You’re the first to try it.

That’s the crucial question.

Yeah, that should work. Don’t forget the trains, light_rails, funicular, cable cars, …

Right, I will have to check if they even have the data needed for a match in OSM. But that can wait for later, first I want to see if I get buses right. The first draft doesn’t need to be complete.

Yay, I like to break new ground :slight_smile:

Yeah, actually this would be based on what has already been mapped in OSM, leaving out the missing ones and those which do not have ‘ref’ set yet.

The approach should be

  1. based on stops that exist in GTFS
  2. filter those which are in the area of interest (Haifa, …)
  3. find the GTFS trip_ids which stop there
  4. find the route_ids of the trip_ids

Step #2 is the tricky one, and it is performance-critical.
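
The four steps can be sketched on plain in-memory data like this. The data shapes (dicts and tuples standing in for stops.txt, stop_times.txt and trips.txt) and the in_area callback are assumptions for illustration; the sample uses a crude bounding box where a real district boundary test would go.

```python
def route_ids_for_area(stops, stop_times, trips, in_area):
    """stops: {stop_id: (lat, lon)}; stop_times: iterable of (trip_id, stop_id);
    trips: {trip_id: route_id}; in_area: callable (lat, lon) -> bool."""
    # Steps 1 + 2: GTFS stops filtered by the area of interest.
    stop_ids = {sid for sid, (lat, lon) in stops.items() if in_area(lat, lon)}
    # Step 3: trips that call at any of those stops.
    trip_ids = {tid for tid, sid in stop_times if sid in stop_ids}
    # Step 4: routes of those trips.
    return {trips[tid] for tid in trip_ids if tid in trips}


# Invented sample data.
stops = {"s1": (32.80, 35.00), "s2": (31.77, 35.21)}
stop_times = [("t1", "s1"), ("t2", "s2"), ("t3", "s1")]
trips = {"t1": "r1", "t2": "r2", "t3": "r3"}


def roughly_haifa(lat, lon):
    return 32.7 < lat < 32.9 and 34.9 < lon < 35.1


print(route_ids_for_area(stops, stop_times, trips, roughly_haifa))
```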

Yeah, it’s a pain. That’s exactly why I said: “Overpass’s area is a true life-saver!”
But you’re right, and it may not be out of reach. A quick search shows that Python libraries exist to check if a point is within a polygon. Perhaps we can convert the OSM district boundaries into a format those libraries can take, and then do it in Python.
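
For reference, the point-in-polygon test itself is small enough to write without a library. This is a plain ray-casting sketch that treats the polygon as a simple ring of (lon, lat) vertices; real district boundaries (multipolygons, holes) would need more care, which is where a library would earn its keep.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray casting: polygon is a list of (lon, lat) vertices, not necessarily closed."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon(2, 2, square), point_in_polygon(5, 2, square))
```

Running this for every stop against every district is the slow part, so a bounding-box pre-check per district would probably be worth adding.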

It seems this will be necessary for every type of stop except bus stops. They aren’t consistently mapped with ref in OSM.

This will all wait though - for now I’m going to do just bus stops and I’m using Overpass as a crutch like I said. I’m more interested in getting the PTNA CSV output right.

Edit: I forgot to say I started work on it, you can see it on gtfs2osm-il/ptnaGenerate.py at ptna-import - NeatNit/gtfs2osm-il - Codeberg.org

Nothing much to show yet. The output as of right now:

Running Overpass query: [out:csv(ref;false)];area[wikidata=Q5423411];node(area)[highway=bus_stop][ref]; out;
2490 stop_codes out of get_stop_codes
2548 stop_ids out of stop_ids_from_stop_codes
49472 trip_ids out of trip_ids_from_stop_ids
784 route_ids out of route_ids_from_trip_ids
784 routes out of routes_from_route_ids
[('אגד', 490), ('סופרבוס', 117), ('נתיב אקספרס', 82), ('נסיעות ותיירות', 36), ('קווים', 30), ('גי.בי.טורס', 25), ('מטרופולין', 4)]
Grouped by 308 unique catalog numbers

OSM boundaries and Polygons could become your friends: Haifa

I’m going to release better support for right-to-left character writing (Arabic, Farsi, Hebrew, …)

Don’t be surprised or shocked!

This will result in a huge number of changes in the ‘diff’ report (e.g. DE-NW-WT > 150k) although the report itself might not show many differences.

Background: data with origin “OSM” (name, ref, network, operator, from, via, …) will be embedded in special HTML tags e.g. <span class="foreign" dir="auto">...</span> so that browsers can display (mixed) left-to-right and right-to-left writing correctly. Thanks to @NeatNit for the hint.

I just added public-transport analysis to ptna for the remaining 9 sub-districts in IL-M-* and IL-Z-*: PTNA - Results

No Route-Relations were found for IL-Z-Kinneret ( מחוז הצפון, נפת כנרת / Kinneret Subdistrict, North District) and IL-Z-Safed ( מחוז הצפון, נפת צפת / Safed Subdistrict, North District)

@NeatNit