Turns out I slightly misunderstood how argparse works, so positional arguments are out and named arguments are the way to go.
I think it’s done.
./gtfsCatalog.py -s districts/Tel-Aviv.geojson -g israel-public-transportation -o catalog.json
./ptnaGenerate.py -r catalog.json -t IL-TA-Tel-Aviv-Template.txt -o telaviv.wikitext
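For reference, a minimal sketch of how those named arguments might be declared with argparse (only the short options -s/-g/-o are taken from the commands above; the long names and help texts are assumptions, the actual scripts may differ):

#!/usr/bin/env python3
# Illustrative argparse setup, not the actual gtfsCatalog.py.
import argparse

parser = argparse.ArgumentParser(description="Build a route catalog from a GTFS feed")
parser.add_argument("-s", "--shape", required=True, help="GeoJSON file with the area of interest")
parser.add_argument("-g", "--gtfs", required=True, help="path to the GTFS feed")
parser.add_argument("-o", "--output", required=True, help="output catalog JSON file")
args = parser.parse_args()
print(args.shape, args.gtfs, args.output)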
Grab the scripts here: https://codeberg.org/NeatNit/ptna-gtfs-import
Go ahead and integrate them into the GTFS post-processing. I’ll edit the templates on the wiki so it has something to work with. Hope to see some results in the next few days.
Edit: wait, some important notes:
- The shebang in both of the scripts is something I like to use myself. You’d probably want to change it to either the common #!/usr/bin/env python3, or remove it entirely and only allow the scripts to be run with python3 script.py.
- You need shapely for the gtfs->json script: pip install shapely (see the sketch after this list)
- As I said, script names are placeholders, feel free to change them
- I imagine these scripts will now live somewhere in your GitHub repositories, and those will be their ‘canonical’ version. Let me know where exactly that ends up being.
- Uhh… Code license? Is GPL okay? (Don’t let this question stall you from using it ASAP)
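As for why shapely is needed (an assumption based on the district GeoJSON argument, not taken from the script itself): the typical use would be a point-in-polygon test of GTFS stops against the area, roughly like this:

# Illustrative only; the real gtfs->json script may do this differently.
import json
from shapely.geometry import shape, Point

with open("districts/Tel-Aviv.geojson") as f:
    district = shape(json.load(f)["features"][0]["geometry"])  # assumes a FeatureCollection

stop = Point(34.7818, 32.0853)  # GTFS stop_lon, stop_lat
print(district.contains(stop))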
Before I forget:
It seems problematic to me that each PTNA CSV wiki page (example) includes full instructions for the file syntax. It takes up a huge chunk at the top of the file, which makes it annoying to edit the data underneath, and it runs the risk of going out of date; for example, right now they are all out of date because they make no mention of @ being a special character.
I say: instead of this repeated information, each CSV page should link to a single instructions page, let’s say: https://wiki.openstreetmap.org/wiki/Public_Transport_Network_Analysis/Syntax, and that will cover everything and be more easily kept up to date. It would also be able to use standard wiki formatting to make the documentation more readable.
Only question is, how to add a link without introducing syntax errors to PTNA?
Thanks NeatNit,
I’ll copy the scripts to my GitHub repos, which are all under “GNU General Public License v3.0”.
“shapely” isn’t installed on the FOSSGIS server I’m using, so I’ll install that in a venv.
Let’s see how to integrate them.
Good idea, it is time to start that out-sourcing of syntax info.
For a mass-manipulation of the (> 300) CSV-pages, I’ll use “ptna-network.sh -gmp”, which gets the data, manipulates it and pushes it back to the OSM Wiki.
The data in the OSM-Wiki is enclosed in <pre> … </pre>. Anything after the </pre> is ignored - sometimes there is a [[Category:PTNA]] or so, but that’ll appear at the end of the page.
At any place (between <pre> and </pre>) we can insert [[Public_Transport_Network_Analysis/Syntax|Syntax for this file]] as normal text with a Wiki-link.
Is everything before <pre> ignored as well? I was thinking:
This page follows the PTNA CSV format. For more information, see [[Public Transport Network Analysis/Syntax]].
<pre>
# ... all the usual stuff ...
# ...
</pre>
[[Category:PTNA]]
Edit: to clarify, this is because I want the link to appear as a link on the wiki itself, for whoever is editing it.
Good point, but no: CSV data for most of the FR-IDF-* are in GitHub and do not need/have <pre>.
But: I’ll change the code of PTNA to also ignore anything before <pre> if the data has been read from / is located in the OSM-Wiki.
Currently: if PTNA finds an opening <pre> and then a closing </pre>, it stops reading, that’s all. <pre> and </pre> are used only to prevent the OSM-Wiki from displaying the data as a Wiki page.
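PTNA itself is written in Perl; purely to illustrate the behaviour described above, a rough Python equivalent of “ignore everything before <pre>, read until </pre>, ignore the rest” could look like:

# Sketch of the described behaviour, not PTNA's actual (Perl) code.
def extract_csv_block(wiki_text: str) -> str:
    start = wiki_text.find("<pre>")
    if start == -1:
        return ""
    end = wiki_text.find("</pre>", start)
    if end == -1:
        return ""
    return wiki_text[start + len("<pre>"):end]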
Well, in those cases you can just put the URL in a comment:
# This page follows the PTNA CSV format. For more information, see https://wiki.openstreetmap.org/wiki/Public_Transport_Network_Analysis/Syntax
Great, that should do it.
I just noticed that Israel is missing from this page PTNA - GTFS Analysis which is linked from the GTFS wiki page.
I copied the header of a CSV OSM wiki page as-is to the new Syntax page. I can work on a properly Wiki-structured documentation later on.
Great, I might fiddle with it too
I am currently in the process of overthinking how to automatically import train routes from GTFS. The data is really bad to work with, but I might have a way.
It won’t be foolproof, but it just might work.
Edit: to clarify, this only applies to Israel. I might take a look at other GTFS feeds if anyone wants, after Israel’s automation is working.
I finished a mass-edit of the OSM-Wiki routes data (the CSV files), replacing the lengthy comment section about the syntax with a link to the OSM-Wiki documentation.
@NeatNit suggested this and has started with the Wiki based documentation in the “en” version - work in progress … (NeatNit: a big thanks for that).
Please wait for the “en” version being in a stable state before starting translations.
P.S.: there’s more to come, provided by NeatNit: import of CSV data directly from GTFS, updated with every new GTFS version … you’ve probably seen the discussions between NeatNit and me during the last weeks.
Very happy to do it
I’m sure there will be a lot of room for improvement when I’m done - my writing tends to be lengthy and convoluted - but I want to get the basics in there so improving the documentation later is easier.
Regarding the GTFS->CSV import, I’m still making minor improvements and touch-ups to the code, but as I said it is ready to be deployed for Israel. Any timeline estimate? Technically there’s no rush, but I am a bit excited to see the results!
Edit: Is there a best example of a PTNA CSV, or do you have a favorite? There’s so many countries and so many regions for each country, I wouldn’t know where to look. Having examples to point to in the syntax page would be very useful, especially if we can also link to the resulting analysis page. I’m going to guess that DE will have the best examples, but which one should I use? Although, an English-language analysis page would be preferable. Ideally somewhere without many lines.
I really don’t want to use Israel for examples, mostly because it is so badly mapped right now that it’s not a good representation of PTNA’s normal output.
I did already copy and rename the code, will do so again. The changes are minor:
- gtfsCatalog.py has been renamed to gtfsAreaCatalog.py, reflecting that an area needs to be provided
- ptnaGenerate.py has been renamed to ptnaGenerateCsv.py
  - its shebang is now #!/usr/bin/env python3; it does not need shapely and therefore no venv
- for gtfsAreaCatalog.py:
  - the shebang is now #!/usr/bin/env python3 as well
  - in create-venv.sh, line 8: test -f "${HERE}/gtfsAreaCatalog.py"
No, but my aim is to finish that ASAP, at least by end of the week. I’ve got the code in mind, tricky part is the transfer from brain to code.
Yeah, English-language:
- US-MA-LRTA Analysis and US-MA-LRTA GTFS are quite small and up-to-date, but have only buses
- US-WA-KCM Analysis is somewhat bigger, with bus and ferry, and US-WA-KCM GTFS is OK but was released at the end of December 2024
Is there a known issue when adding multiple wikilinks in a single line? In my earlier testing, having multiple links in the same line caused a broken result. This diff fixed the issue: Israel/Public Transport/PTNA/IL-TA-Tel-Aviv-Routes: Difference between revisions - OpenStreetMap Wiki
I don’t know if I’ve missed some obvious problem, or there’s a bug with the parser.
Can you add PTNA analysis for all train routes (and only trains) in Israel? Might be handy.
I don’t have a list of routes yet, but it’s coming.
There may indeed be a bug in the parser. I’ll check tomorrow.
Yes, no problem.
I think I’m done editing Syntax of CSV data for the time being. I think I got all the important parts in there. Hope you or anyone else can read it and improve what needs to be improved. I’m sure there’s a lot.
One line I’m particularly not pleased with is this: “If fewer than 9 fields are provided, the rest are assumed to be empty.” I just can’t think of a good way to say you only have to specify all the fields up until the last one that has a non-empty value. Why is that so hard to express? Bah!
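One way to illustrate it with made-up values (only the semicolon separation and the 9-field maximum are taken from the page): the following two lines are read identically, because the four omitted trailing fields are treated as empty:

5;bus;a;b;c
5;bus;a;b;c;;;;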
I also wrote documentation for the import filters. Give that a read as well. Hope it’s clear, if not, it’s a wiki for a reason!
I still want to add more/better examples for import filters.
About the import though, as I was writing it I wondered - what does PTNA do if ref and type uniquely identify a route, but operator/from/to don’t match? Will it match the route and complain about the unmatched fields, or will it put it in the “Other Public Transport Lines” section? Depending on the answer, it might warrant changing the behavior of the import script.
Thanks for your work on the Syntax of CSV data. I’ll have a look at it.
If the tuple “ref;type” appears only once, then the rest (from;to;operator) is of no interest and will not be used.
If the tuple “ref;type” appears more than once, 1. operator, and if this does not help, 2. from/to are used. If there is no match, they will be listed in “Not clearly assigned routes”.
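Restated as a sketch (illustrative Python; PTNA itself is Perl and its real data structures differ, so the dictionary shapes here are assumptions):

# Matching precedence as described above: a unique ref;type wins outright,
# otherwise operator, otherwise from/to, otherwise "Not clearly assigned routes".
def match_route(csv_entry, osm_routes):
    candidates = [r for r in osm_routes
                  if r["ref"] == csv_entry["ref"] and r["type"] == csv_entry["type"]]
    if len(candidates) == 1:
        return candidates[0]  # from/to/operator are not used at all
    narrowed = [r for r in candidates if r.get("operator") == csv_entry.get("operator")]
    if len(narrowed) == 1:
        return narrowed[0]
    narrowed = [r for r in (narrowed or candidates)
                if r.get("from") == csv_entry.get("from") and r.get("to") == csv_entry.get("to")]
    if len(narrowed) == 1:
        return narrowed[0]
    return None  # would be listed under "Not clearly assigned routes"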
Need to stop now: the bug in the parser keeps me busy and frustrated
while ( $text =~ m/xxx/g ) { }
while ( $text =~ m/yyy/g ) { }
This ignores the first ‘yyy’ in $text = “yyy xxx yyy”, because searching in the 2nd while-loop starts after ‘xxx’, and I have not yet figured out how to reset the start position back to 0.
I like Perl and Regex, but this is driving me nuts.
There is certainly an easy solution for 2 while-loops, but for 8 while-loops with very complex regexes?