I’ve got a query I’m posting on behalf of a friend who isn’t on this forum and is interested in seeing bus routes on an OSM map.
His query is as follows:
“I notice that people add bus routes to OSM and assume that rather than riding round all day with a logger that they get them out of the Bus Open Data Set (BODS). Sometimes the ones in OSM are rather out of date, so do you know of anyone who has a way of parsing TransXChange files to .gpx or something to get me there?”
When the topic of Stagecoach’s open data came up I got the impression that the bus data provided to and released by the government had downstream attribution requirements.
The OSMF Licence Compatibility Page appears to me to say that sources requiring specific downstream attribution are not compatible with OSM’s requirements without much in the way of wiggle room.
Although, as I say in that and a previous post in that thread, I’m about as far from a lawyer as you can get, and it’s entirely possible I’ve got not just the wrong end of the stick but have picked up the wrong stick entirely.
bustimes.org has a very clear page about bus open data, and “attribution” requirements. I don’t see an obstacle to using this data in OSM (we have certainly imported NAPTAN data).
That said, I don’t think bus routes really belong in OSM. bustimes.org seems a much better approach, where the data is overlaid on an OSM-based map.
bustimes.org is one of my go-to sources of bus data as it clearly attributes and links to the government open bus data. I’ve downloaded some of the BODS data, which I believe is compatible with OSM. It is in XML format and looks a real pain to extract, so why duplicate it when bustimes.org has done a tremendous job? Parsing the route data could also be problematic: while the roads and lengths are given by BODS, collating them with OSM ways could be very difficult.
I like the bus routes in OSM as it gives a map view of route coverage - helps plan days out.
Maintaining bus routes is a problem. I focus on south Lancashire, where the county council publishes its route changes every month, which lets me attempt to keep routes in my area up to date.
Have you seen the Transport layer on the OSM base map page? There is also https://www.%C3%B6pnvkarte.de (ÖPNVKarte).
I actually agree with that, but playing devil’s advocate for a second:
I can see that someone might want to present the data slightly differently. For example, here is bustimes.org outside York Station and here is one map based on OSM data, and here is another. Each of those three maps shows data that the other two don’t show.
Someone may want to show some combination of those capabilities, or even something different again. If they try to reconcile multiple datasets (OSM plus others, or even just the others) they’ll have lots of challenges, since different open bus datasets contradict each other (something that EdLoach on #osm-gb IRC has provided a lot of detail on). Bustimes.org has apparently spent some time working out which feeds (of the contradicting ones) to use to get the best answer in each case.
This reconciliation is tricky - notice how the eastern stops have moved to the south. At least some of the “official” data doesn’t know about this yet (including the software driving the announcements on the buses). Bustimes.org is still using this data, and has the stops in the wrong place. Google Maps has stops in almost the right place, so they have done that reconciliation with data (including locations and pictures of some of the new bus stops) that they have obtained themselves and the government open data feeds they use.
information has been taken from the bus open data digital service internet site
while the Secretary of State strives to preserve the integrity and quality of information on the bus open data digital service internet site, they cannot warrant the accuracy or quality of the information on the site
bustimes.org does not have the endorsement, affiliation, support or approval of the Secretary of State
If a user of the data downstream of OSM is “required … to state” those things because of where OSM got the data, then it is in conflict with the OSMF guidance saying that additional downstream attribution is incompatible with the ODBL.
I’ve checked with my friend and actually he doesn’t intend to import this data into OSM or update bus routes, he just wants to overlay them in a mapping application he has for his own use.
However he also thinks this data should be in OSM saying that the bus line data is “pretty out of date round here” [probably referring to the Derby area] and “exhort[s] the OSM community to investigate this further because it is a distinct Achilles heel in OSM”.
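For the personal-overlay use case, once a route is available as a list of coordinates, writing it out as a GPX track is fairly mechanical. The following is only a minimal sketch using the Python standard library; the function name `coords_to_gpx`, the `creator` string and the sample coordinates are made up for illustration and do not come from any of the tools mentioned in this thread.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"


def coords_to_gpx(name, coords):
    """Build a GPX 1.1 document with one track from (lon, lat) pairs.

    `coords` is assumed to be WGS84 longitude/latitude in route order.
    """
    # The namespace is set as a plain attribute to keep the sketch simple.
    gpx = ET.Element(
        "gpx",
        {"version": "1.1", "creator": "txc-sketch", "xmlns": GPX_NS},
    )
    trk = ET.SubElement(gpx, "trk")
    ET.SubElement(trk, "name").text = name
    seg = ET.SubElement(trk, "trkseg")
    for lon, lat in coords:
        # GPX stores latitude and longitude as attributes of each point.
        ET.SubElement(seg, "trkpt", {"lat": f"{lat:.6f}", "lon": f"{lon:.6f}"})
    body = ET.tostring(gpx, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical coordinates, roughly in the Derby area.
    print(coords_to_gpx("Example route", [(-1.4746, 52.9225), (-1.4760, 52.9231)]))
```

Note that GeoJSON orders coordinates as longitude then latitude, whereas GPX names them explicitly, so the two are easy to swap by accident when converting between the formats.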
That is a statement from bustimes.org based on the data that they are using in their app.
The relevant guidance for BODS is in the paragraphs below here.
The way that the licensing for the NaPTAN data was handled is described here; and it actually looks very similar. Much (perhaps all) of what’s in BODS but not NaPTAN may actually be out of scope for OSM (e.g. how late will the bus due at 17:30 be?).
My understanding was that OGL V3 is compatible with the ODBL and I have added lots of data e.g. right of way or adopted highways data that has been provided under OGL.
But the licence does state the users must “acknowledge the source of the Information in your product or application by including or linking to any attribution statement specified by the Information Provider(s)”. It’s hard to see that a source statement in the changeset really satisfies this requirement.
Perhaps we need a list of “attribution statements” for UK OGL V3 data sets that have been included in OSM?
The Attribution of data sources subheading looks to be a more verbose version of what bustimes says (although referring to BDS and DfT instead) with an additional “Data consumers will be prevented from further using the service if they do not state this in their product, application or service.”
The OSM Wiki page linked for NaPTAN says that the data is OGL. I think the regime for that stuff predates BODS?
Is there something that explicitly says that BODS data is OGL? There’s rather a lot of text on these sites so I may just have missed that but IIRC the BODS specific stuff all just seems to reference the relevant bus open data regs.
The NAPTAN data is certainly OGL (see bottom of page). It may be that the data provided by bus companies can’t be simply “re-stamped” with an OGL licence, hence the ambiguities on the BODS data.
My more general concern is whether we should be more explicit about acknowledging the source of OGL data.
Good! But there are gaps. My mistake was thinking that noting in the changeset source that the data was OGL was sufficient. Any OGL dataset should be added to this Contributors page along with any specific attribution text.
It contains a Python script that can convert simple TransXChange into GeoJSON, and a Flask web server that generates GeoJSON on the fly and displays it overlaid on an OSM map in Leaflet (with a link to download said GeoJSON). If your friend only needs to convert a small amount then the CLI will probably be easier; the web UI is better for dealing with routes that are split over multiple files or for handling one or more zip files of TransXChange files.
Please note it is very hacky and mainly created around TNDS data in the East Midlands region, so some assumptions I made might cause it to crash with some companies’ BODS data (I did briefly check that it works with major companies’ BODS data though). Also, since it was made for TNDS, it only draws lines between bus stops and doesn’t use the more detailed tracks data that BODS gives.
Hopefully it is helpful and if you have any problems let me know.
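For anyone who wants to use the more detailed track data, here is a minimal sketch of how track points could be pulled out of a TransXChange file and turned into GeoJSON. It is not the script described above. It assumes each `RouteLink` carries a `Track` whose `Location` elements contain `Longitude` and `Latitude`; files that only give `Easting`/`Northing`, or no track at all, are simply skipped. The function name, the stop references and the coordinates in the sample are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

TXC = "http://www.transxchange.org.uk/"
NS = {"txc": TXC}

# Tiny hand-written sample; real files are far larger and more varied.
SAMPLE = """
<TransXChange xmlns="http://www.transxchange.org.uk/">
  <RouteSections>
    <RouteSection id="RS1">
      <RouteLink id="RL1">
        <From><StopPointRef>STOP_A</StopPointRef></From>
        <To><StopPointRef>STOP_B</StopPointRef></To>
        <Track>
          <Mapping>
            <Location><Longitude>-1.4746</Longitude><Latitude>52.9225</Latitude></Location>
            <Location><Longitude>-1.4760</Longitude><Latitude>52.9231</Latitude></Location>
          </Mapping>
        </Track>
      </RouteLink>
    </RouteSection>
  </RouteSections>
</TransXChange>
"""


def route_links_to_geojson(root):
    """Return a GeoJSON FeatureCollection with one LineString per RouteLink."""
    features = []
    for link in root.iter(f"{{{TXC}}}RouteLink"):
        coords = []
        for track in link.findall("txc:Track", NS):
            for loc in track.iter(f"{{{TXC}}}Location"):
                lon = loc.findtext(".//txc:Longitude", namespaces=NS)
                lat = loc.findtext(".//txc:Latitude", namespaces=NS)
                if lon is None or lat is None:
                    continue  # e.g. Easting/Northing only; not handled here
                coords.append([float(lon), float(lat)])  # GeoJSON is lon, lat
        if len(coords) < 2:
            continue  # a LineString needs at least two points
        features.append({
            "type": "Feature",
            "properties": {
                "from": link.findtext("txc:From/txc:StopPointRef", namespaces=NS),
                "to": link.findtext("txc:To/txc:StopPointRef", namespaces=NS),
            },
            "geometry": {"type": "LineString", "coordinates": coords},
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    print(json.dumps(route_links_to_geojson(ET.fromstring(SAMPLE)), indent=2))
```

For a real file, `ET.parse(path).getroot()` would replace `ET.fromstring(SAMPLE)`. Each link becomes its own feature here, so stitching links into whole routes in the right order is left out of this sketch.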