Israel GTFS release

This could also be an option: simply move them to a single position.
This way we would not lose any information, and we could import again, skipping all existing stops that already have refs.

I have been working on the GTFS route data, and now I have a .osm file with a large number of ways, each representing a bus route. Here is a screenshot in JOSM:

The question is how to incorporate this data into OSM. A bus route in OSM consists of a relation which contains all the street segments the bus travels on. However, my data is not relations but ways. Furthermore, we presumably want to use our pre-existing ways, rather than the ways I’m uploading, to represent streets. (Though we could update our ways if necessary to reflect information in my ways.)

Right now it looks like I should upload all my data as-is to OSM. It will not show on the rendered map (the only tags in each way are ref,name,operator,source). But when we edit, we will see a number of ways overlaid on our streets. One by one, these can then be manually deleted and relations created to hold the equivalent information.

That is a labor-intensive process. Also, we need to find a reliable way of uploading this 300MB file (35MB when bzipped). My relatively new computer basically grinds to a halt when the file is opened in JOSM. But there must be some smaller script that can do the upload, perhaps by breaking the file into pieces.

Any thoughts, or alternative suggestions?

I would suggest splitting it into smaller chunks (by operator/area/city/…).
Another option might be to create many small files, so that we can take a small piece of data, open it in JOSM, and create a normal relation without uploading temporary data and then deleting it.
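A minimal sketch of the per-operator split, assuming the .osm file is plain OSM XML with top-level node and way elements and that every way carries the operator tag mentioned earlier. The function name split_by_operator is made up for illustration. It loads the whole tree into memory, so a 300MB file would probably need a streaming parser instead.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def split_by_operator(osm_root):
    """Group the ways of an OSM XML tree by their 'operator' tag.

    Returns {operator: <osm> element holding those ways plus the nodes
    they reference}. Ways without an operator tag go under 'unknown'.
    """
    nodes = {n.get("id"): n for n in osm_root.findall("node")}

    ways_by_op = defaultdict(list)
    for way in osm_root.findall("way"):
        operator = "unknown"
        for tag in way.findall("tag"):
            if tag.get("k") == "operator":
                operator = tag.get("v")
        ways_by_op[operator].append(way)

    chunks = {}
    for operator, ways in ways_by_op.items():
        chunk = ET.Element("osm", dict(osm_root.attrib))
        seen = set()
        # Nodes first, then ways, as in a normal .osm file.
        for way in ways:
            for nd in way.findall("nd"):
                ref = nd.get("ref")
                if ref in nodes and ref not in seen:
                    seen.add(ref)
                    chunk.append(nodes[ref])
        for way in ways:
            chunk.append(way)
        chunks[operator] = chunk
    return chunks
```

Each chunk could then be saved with ET.ElementTree(chunk).write(path, encoding="utf-8", xml_declaration=True) and opened in JOSM on its own. The same grouping would work per route by keying on the ref tag instead.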

eric, can you share an example? Any single route?

I agree with yrtimiD: let’s use smaller chunks and try not to upload everything only to delete it later.
It’s better to do it slowly but with a focus on finishing it together. You are welcome to add all routes in Israel by yourself :slight_smile:

You could prepare several OSM files from the data you have, split into the smallest chunks we can manage, maybe bus route by bus route.
Then make them available for download. Everybody who wants to help would download an OSM file and add their name to the list next to the route.
This way we would add route by route. It will take a long time, but that is better than “spamming” OSM with data that needs to be deleted afterwards.

Adding a bus route to OSM is a very tedious process, as a lot of streets need to be cut into very small pieces to build the correct relation.
Doing it slowly and systematically would really help.

Good ideas :slight_smile:

I started separating the data by operator, and when looking more closely in JOSM, I realized a portion of the data is wrong. Not sure yet if the problem is in my script or their data. But I’m glad I noticed before uploading…

I just found two interesting slides.

http://www.slideshare.net/shelef/the-advantages-of-gtfs-in-israel
http://www.slideshare.net/shelef/transit-information-services-in-israel-update-2012-v4

It is very interesting that there is not a word about OpenStreetMap.
Maybe we should tell him what a great step this was for OpenStreetMap.

What do you think?

I was wondering why our imported bus stop refs always differ from the ID printed on the real bus stop sign.
The original text files have two fields: stop_id and stop_code. As I understand it, we used stop_code for the ref field.
I think it’s better to use the actual ID printed on the bus stop sign, so it will be easier to confirm existing stops.

Does anybody have a different opinion?

I noticed that, too, but did not say anything because I am not familiar with the implementation details.

I agree; if stop_id is really what’s printed on the signs, then it would make much more sense to use those for physical cross-reference.

I just finished uploading a changeset (http://www.openstreetmap.org/browse/changeset/14265835).
The new values are as follows:


stop_id -> gtfs:id
stop_code -> ref
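
A minimal sketch of how an import script might apply this mapping, assuming the standard GTFS stops.txt column names (stop_id, stop_code, stop_name, stop_lat, stop_lon). The function names are made up for illustration and are not taken from the actual import scripts.

```python
import csv


def stop_to_osm_tags(row):
    """Map one GTFS stops.txt row (a dict) to OSM tags for a bus stop node."""
    tags = {"highway": "bus_stop", "gtfs:id": row["stop_id"]}
    # stop_code is optional in GTFS, so only set ref when it is present.
    if row.get("stop_code"):
        tags["ref"] = row["stop_code"]
    if row.get("stop_name"):
        tags["name"] = row["stop_name"]
    return tags


def read_stops(stops_txt):
    """Yield (lat, lon, tags) for each row of an open stops.txt file object."""
    for row in csv.DictReader(stops_txt):
        yield float(row["stop_lat"]), float(row["stop_lon"]), stop_to_osm_tags(row)
```

For a real file, something like open("stops.txt", encoding="utf-8-sig", newline="") should cope with a leading byte-order mark, if the feed has one.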

eric22, an important note for you: if you are going to do further imports, please be sure to change your scripts.

I have chosen gtfs:id as the tag name, but if somebody has a better idea for the name, it’s not a problem to change it.

Also, I found 93 missing bus stops. Possibly they were recently added to the GTFS files, or were deleted by users.
IMHO, for future route imports we should not delete stops which do not exist physically, but rather mark them, for example, as proposed:


highway=proposed
proposed=bus_stop

I’m ready to upload this set of missing stops at any time.

I have no current plans to update the bus stops, but I’ve made a note to myself about this.

Feel free to upload the missing stops.

Guys, I talked with the owners of the Moovit application http://www.moovit.co.il/.
It’s a very nice public transport application which also uses data from GTFS and our map.

I asked them to work together with us on fixing and updating the current GTFS data. They appear very cooperative and will be happy to sync bus stop info both ways.

Here is a part of their answer:

Where can I find more information on specific GTFS IDs?

See my http://www.openstreetmap.org/browse/changeset/14370035 from today.
I added some more information on the Hatzor bus terminal. There were some six stations scattered around it that seem to have been related. I’m not sure what they all mean.

What kind of information do you need? You can find explanations of the whole GTFS data at http://tinyurl.com/chhb6wn
For now, you can confirm or move the actual position of a bus stop by comparing its ref value with the actual code printed on the bus stop label.

What is the current agreement on what to do with bus stops imported from GTFS if they do not exist or have different names/numbers from existing stops?

For example, I found a bunch of bus_stop POIs with different ‘ref’ values close to each other, but in reality there is a single bus stop at that place, with a number that differs from all of them.
Should I try to find a bus_stop POI with that existing number and move it to the real position? Or change the ref value (and name value) of an existing POI?
How should I merge several bus_stop POIs into one if there is only one stop in reality?

No replies; it seems I’m a couple of years late for the main Israel OSM activity :slight_smile:

In the meantime, I downloaded the current GTFS data and found that it’s much more accurate (at least in the places where I could check the real situation) than the data uploaded to OSM in 2012.

I managed to run the GO_Sync app mentioned in the first posts here, solving the encoding problem with the -Dfile.encoding parameter:

java -Xmx1200M -Dfile.encoding=UTF-8 -jar gtfs-osm-sync-1.0-SNAPSHOT-jar-with-dependencies.jar

(I added -Xmx1200M because it crashed with an out-of-memory error after analyzing all stops.)
But as the current OSM data has no GO_Sync-specific tags, GO_Sync can’t help much with updating it.

Perhaps all current bus_stop POIs that are sourced from GTFS should be removed from OSM and new POIs uploaded in a GO_Sync-compatible manner? After that, we would be able to update the stops with GO_Sync in the future.
On the other hand, if some of the POIs were fixed manually, we would lose that work.

I also played a little with the stops CSV file and extracted data from the GTFS description field into separate columns (city, address, floor, platform); these can be put into separate tags as well.
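
A rough sketch of that kind of extraction, assuming the description is a run of "label: value" pairs. The labels below are placeholders named after the columns above; the real feed may use different labels or a different layout, and split_description is a made-up name.

```python
import re

# Hypothetical labels; replace them with whatever the feed actually uses.
FIELDS = ["address", "city", "floor", "platform"]


def split_description(desc, fields=FIELDS):
    """Split 'label: value label: value ...' text into a dict keyed by label.

    Labels with an empty value are left out of the result.
    """
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, fields)) + r")\s*:")
    matches = list(pattern.finditer(desc))
    result = {}
    for i, match in enumerate(matches):
        # A value runs from the end of its label to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(desc)
        value = desc[match.end():end].strip()
        if value:
            result[match.group(1)] = value
    return result
```

Each key of the returned dict could then become its own column or tag. Which OSM tag keys to use for them is a separate decision that this sketch does not make.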

I’ve never made such big uploads to OSM, so I need advice from more experienced mappers.

The current OSM bus stops are 5 years old and are significantly different from the most up to date Israel GTFS in some areas. I am looking into ways of updating this in a way that guarantees easy periodic syncing.

OSM statistics:

  • 33,519 total bus stops

  • 1,622 have neither gtfs:id nor ref

  • 31,932 have either gtfs:id or ref

  • 31,930 have ref

  • 2 bus stops have gtfs:id but not ref

  • 35 bus stops have ref but not gtfs:id
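
Counts like these can be produced with a few lines of Python. The sketch below assumes the bus stops were fetched as Overpass API JSON, that is, a list of elements each with an optional tags dict; tag_stats and the counter keys are made-up names.

```python
from collections import Counter


def tag_stats(elements):
    """Count bus stops by presence of the gtfs:id and ref tags."""
    stats = Counter()
    for element in elements:
        tags = element.get("tags", {})
        has_id = "gtfs:id" in tags
        has_ref = "ref" in tags
        stats["total"] += 1
        if has_ref:
            stats["ref"] += 1
        if has_id or has_ref:
            stats["either"] += 1
        else:
            stats["neither"] += 1
        if has_id and not has_ref:
            stats["gtfs_id_only"] += 1
        if has_ref and not has_id:
            stats["ref_only"] += 1
    return stats
```

Called as tag_stats(data["elements"]) on the parsed response, it returns one counter holding all six figures.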

SwiftFast, the last time I did a bulk update the rule was simple: gtfs:id is the internal ID for back-reference to the GTFS data, and ref is the actual stop number written on the yellow signs. Bus stops without either of these IDs are usually old and were added manually; in many cases they can be merged with the GTFS ones.

Thanks. So correlation appears to be quite simple, but I think merging should be a bit more sophisticated.

The simple way to merge is to just remove all existing nodes and re-add. I think this is not suitable for frequent periodic updates. (It destroys history, manually added tags like “sheltered”, and needlessly causes the entire IHM to re-render).

My initial plan (subject to many changes) is roughly like this:

  • Turn the GTFS into a JOSM bus_stop layer using GO_Sync

  • Grab the current OSM bus_stops to a different layer using Overpass API.

  • Merge the two layers

  • Use the JOSM Scripting plugin to detect and merge duplicate nodes (based on ref/gtfs:id and possibly distance), preserving node IDs and manual tags.

  • Somehow delete the bus stops that are no longer in the GTFS.
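
The matching step in this plan could look roughly like the sketch below. It is plain Python rather than a JOSM Scripting plugin script, and the data shapes, the function names and the 30 m threshold are all assumptions for illustration. Stops are matched on gtfs:id first, then on ref, then on the nearest untagged node within the threshold.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    radius = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def merge_stops(osm_nodes, gtfs_stops, max_dist_m=30.0):
    """Match GTFS stops to existing OSM nodes.

    osm_nodes:  dicts with 'id', 'lat', 'lon', 'tags'
    gtfs_stops: dicts with 'lat', 'lon', 'tags' (tags include gtfs:id / ref)
    Returns (updated, created, stale).
    """
    by_id = {n["tags"]["gtfs:id"]: n for n in osm_nodes if "gtfs:id" in n["tags"]}
    by_ref = {n["tags"]["ref"]: n for n in osm_nodes if "ref" in n["tags"]}
    untagged = [n for n in osm_nodes
                if "gtfs:id" not in n["tags"] and "ref" not in n["tags"]]
    used = set()
    updated, created = [], []

    for stop in gtfs_stops:
        tags = stop["tags"]
        node = by_id.get(tags.get("gtfs:id")) or by_ref.get(tags.get("ref"))
        if node is not None and node["id"] in used:
            node = None
        if node is None:
            # Fall back to the nearest manually added node within range.
            near = [(haversine_m(stop["lat"], stop["lon"], n["lat"], n["lon"]), n)
                    for n in untagged if n["id"] not in used]
            near = [pair for pair in near if pair[0] <= max_dist_m]
            if near:
                node = min(near, key=lambda pair: pair[0])[1]
        if node is None:
            created.append(stop)
            continue
        used.add(node["id"])
        # Keep the node ID and manual tags; GTFS wins on the keys it provides.
        updated.append({"id": node["id"], "lat": stop["lat"], "lon": stop["lon"],
                        "tags": {**node["tags"], **tags}})

    # GTFS-sourced nodes that no current stop matched are deletion candidates.
    stale = [n for n in osm_nodes if "gtfs:id" in n["tags"] and n["id"] not in used]
    return updated, created, stale
```

The stale list is only a set of candidates. Whether to delete those nodes or retag them is still a manual decision.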

The script should be reusable, allowing it to be shared with the community. Also, before starting I need to make sure nobody has already solved this before.

Also, I might be underestimating GO-Sync since I have not tried it. It might save me the trouble of writing a script, if it can perform syncs in a smart way.