Railway station ID confusion (uic_ref)

What's the issue?

Railway stations in Europe are tagged with uic_ref, which should refer to the ID assigned by the International Union of Railways (UIC) (released with the MERITS dataset).

In the past these have been wrongly treated as equivalent to the IDs used by the distribution system of Deutsche Bahn and its timetable information system, which uses IBNR (Interne Bahnhofsnummer) codes internally.
While both codes start with the same country code, the assignment differs and for most countries the two are not related.
As a result, many uic_ref tags wrongly contain IBNR numbers; conversely, anyone who expects every uic_ref to be an IBNR will also get wrong values.

An example:

The UIC code of Hamburg Hbf is 8001071, the IBNR is 8002549 (both codes have other values for other parts of the station).
The uic_ref on the OSM node currently says 8002549. Wikidata already has both IDs and correctly separates IBNR and UIC codes.

How to test which ID is a UIC code and which is an IBNR?

IBNRs can be looked up here and validated by typing the station code into the trip search of bahn.de or fahrplan.oebb.at.

The only source I have found for validating UIC codes is the Eurail timetable search. If you know any other way to collect or validate the IDs, please add it here.

Trainline also offers a list of UIC codes and IBNRs (labeled as db_id) (please see comment for link list).

How could we solve it?

To avoid further confusion, I propose to make use of namespace-specific (ref:) ID tags:

  • ref:ibnr → For real IBNR codes (already used about 3k times)

  • ref:uic → For real UIC codes (not used yet, besides a small test in Czechia)

  • uic_ref should be removed as soon as its value has been applied to the correct reference.

This way we ensure consistency and make use of the correct namespacing.
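A minimal sketch of how the retagging could work for a single station, assuming a trusted set of verified codes is available for each ID type. The function name `retag_station` and the two lookup sets are hypothetical; the Hamburg Hbf values are the ones from the example above.

```python
def retag_station(tags, known_uic, known_ibnr):
    """Return a copy of `tags` with uic_ref moved to ref:uic or ref:ibnr.

    known_uic and known_ibnr are hypothetical sets of verified codes
    (for example taken from MERITS and from the DB timetable system).
    If the value cannot be classified unambiguously, the tags are
    returned unchanged so that a human can review the station.
    """
    value = tags.get("uic_ref")
    if value is None:
        return dict(tags)

    is_uic = value in known_uic
    is_ibnr = value in known_ibnr
    if is_uic == is_ibnr:
        # Either unknown in both lists or present in both: do not guess.
        return dict(tags)

    new_tags = {k: v for k, v in tags.items() if k != "uic_ref"}
    new_tags["ref:uic" if is_uic else "ref:ibnr"] = value
    return new_tags


hamburg = {"name": "Hamburg Hbf", "railway": "station", "uic_ref": "8002549"}
result = retag_station(hamburg, known_uic={"8001071"}, known_ibnr={"8002549"})
print(result)
# {'name': 'Hamburg Hbf', 'railway': 'station', 'ref:ibnr': '8002549'}
```

The sketch only moves the existing value; adding the missing ref:uic (8001071 for Hamburg Hbf) would be a separate step against the verified list.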

As I have a list of all European stations with all assigned codes ready, a semi-automatic import might be possible in the future.

If this thread is conclusive, I would update the OSM wiki to provide more detailed information about the issue and how to tackle it.

What do you think about the proposal? Do you know any systems that currently rely on the uic_ref tag?

Related issues

Besides clarifying the use of IDs, many stations are actually split up into substations by IBNR or UIC reference. This should be handled correctly while mapping, without attaching several IDs of the same type (e.g. a uic_ref_2 tag) to the same station. While we clear up the confusion between IBNR and UIC codes, please look out for such related issues.

This is also a widespread issue in Sweden, see: Omtaggning/borttagning av uic_ref vid svenska stationer

I do however see little point in replacing uic_ref as it is currently supported by multiple pieces of software which would otherwise need to be changed.

Would you then suggest adding ref:uic and/or ref:ibnr alongside uic_ref?

As they would be expected to have the same value, I see no point in having both uic_ref and ref:uic, but note:uic_ref could be used to keep track of which stations and stop areas have had their uic_ref checked against MERITS.

ref:ibnr would still have a purpose, so I’d keep that.

From my point of view, if we advise using note:uic_ref we would need to specify a pattern, which would require people to educate themselves about this issue before tagging to avoid the same confusion. The content of note:uic_ref would also need to be as standardized as possible to allow automatic linting. Replacing one poorly defined tag with two new well-defined ones seems clearer to me, but I also fear some systems breaking because of this.

How would ref:uic have a clearer definition than uic_ref? The definition of uic_ref could be improved a bit, but they describe the same thing. To me it seems it would run into the same issue eventually.

I don’t believe anyone sees uic_ref and thinks the IBNR fits there; they probably just think the IBNR is a UIC station code since they have the same format. Changing the key would not solve that.

note: might not be ideal for computer processing so something else might have to do.

It might also be worth adding ref:era for ERA primary location codes, which to my knowledge align with UIC codes but use ISO country codes instead: https://teleref.era.europa.eu/

Sounds good to also include those. As far as I can check, these IDs align with neither UIC nor IBNR. For Hamburg Hbf it is DE14393.

I still believe there is some value in migrating uic_ref to ref:uic, as most stations currently have uic_ref, so it’s more about clarifying what’s actually there while aligning with a namespaced key schema. The alternative would be to add another key or rely on ref:ibnr as an indicator that both IDs have been checked correctly.

As a next step I would now add a ref:ibnr key to some major stations in Germany and adjust uic_ref to the actual UIC code to see if anyone complains.
Additionally, I would update the wiki article to address the difference between IBNR and UIC correctly.

I noticed you updated the wiki page, which is good, although I think the formatting is a bit rough. The German page is considerably cleaner, with its own section for the UIC/IBNR distinction, which I also believe has a better header. I also don’t see any issue with the use of the tag on stop areas, especially since this seems quite widespread.

I would do this myself if I had access to a computer.

Might also be worth noting that there apparently are specific reservation codes too.

- stationUIC = station codes as used in UIC leaflet 108.1 for open tickets
- stationUICReservation = station codes as used in Reservation leaflets 918.1 and 108.2

From UIC GitHub uic Barcode repo

Thanks for the remark! Will update the formatting on the English wiki page and look into examples for codes related to both documents.

@marudor (the person behind bahn.expert and part of DB InfraGo) mentioned in the #opentransportmeetup that the ID referred to as IBNR is actually mostly an EVA (Elektronische Verkehrs Auskunft) code. As the EVA code is used as THE internal identifier at DB, I propose to update all “IBNRs” (ref:ibnr) to ref:eva and also use this key when replacing uic_ref with the UIC ID.

The IBNR is a 5-digit ID that is mostly equivalent to the EVA number (which adds a country code in some scenarios; the country code padding is not mandatory). The IBNR Wikipedia article also seems to be misleading.

The EVA number is a 7-digit ID, which also covers bus stations in Germany (mostly without a country code). As this code is the one used by bahn.de, it is easy to validate and is mostly already present on the map, but under the wrong keys (uic_ref or ref:ibnr).
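To illustrate the relation described above, here is a small sketch that splits a 7-digit EVA number into a 2-digit prefix and a 5-digit remainder. It assumes the first two digits are the country code, which holds for codes like the Hamburg Hbf example but, as noted, not for every EVA number (bus stops often have no country code). The function name `split_eva` is made up for illustration.

```python
def split_eva(eva):
    """Split a 7-digit EVA number into (prefix, five_digit_part).

    Assumption: the first two digits are a country code and the
    remaining five digits correspond to the IBNR. This is only a
    format check; it does not prove that the number is a valid EVA.
    """
    if len(eva) != 7 or not (eva.isascii() and eva.isdigit()):
        raise ValueError(f"expected 7 digits, got {eva!r}")
    return eva[:2], eva[2:]


print(split_eva("8002549"))
# ('80', '02549')
```

Keeping the value as a string matters here, since the 5-digit part can start with a zero.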

@TheNewCivilian how far into the process of moving uic_ref to ref:ibnr are you? For multiple internal projects I have a table of uic_ref and ref:railway and tried adding the node point, and about 30 stations didn’t match the UIC code that they had about 3 months ago. Is that all that needed a change, or is your process still ongoing?

K

Hi K!
There are actually many more. In my experience, pretty much all stations in DE, DK and AT, and some in SK, PL, CZ and SE, are labeled incorrectly. I am currently preparing a review tool based on a Deutsche Bahn API to allow people to correct the stations piece by piece. After talking to DB officials I would also now propose to use ref:eva instead of ref:ibnr, as those seem to be similar but not necessarily the same.
Does this migration cause much trouble on your side? Would a list of all official UIC codes help your project?

Hey, thanks for your reply.

As I said, we’re talking about 30 stations that you seem to have changed since (out of ~2700 that I was using before), so that’s just an hour’s manual work to find the “right” information again. For now, I will use the IBNR/EVA ref for my project, as I directly query the bahn website with it for ticket purposes. When “everything” is said and done (whenever that might be) I’ll just rework the projects a little - no real trouble.

Please do let me know when your review tool is ready (DM or @-me here), I’m happy to help!

Kai

Hey all,

I’ve just recently imported UIC codes for all stations in Finland, so at least there uic_ref refers to valid numbers. Where multiple UIC codes refer to parts of a larger station, those parts are mapped individually.

In Finland the UIC Codes are available through the open data and also several publications. For example, this link lists currently active stations in JSON format: https://rata.digitraffic.fi/api/v1/metadata/stations
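As a rough sketch of how such a JSON list could be turned into a lookup table: the field names `stationShortCode`, `stationUICCode` and `passengerTraffic` are my assumption about the response format and should be checked against the live endpoint. The sample payload below is invented for illustration and is not real station data.

```python
import json

# Invented sample in the assumed response shape (not real data).
SAMPLE = """
[
  {"stationName": "Example A", "stationShortCode": "EXA",
   "stationUICCode": 101, "passengerTraffic": true},
  {"stationName": "Example B", "stationShortCode": "EXB",
   "stationUICCode": 102, "passengerTraffic": false},
  {"stationName": "Example C", "stationShortCode": "EXC",
   "stationUICCode": 103, "passengerTraffic": true}
]
"""


def uic_codes_by_short_code(payload, passenger_only=True):
    """Map station short code -> UIC code from a JSON station list."""
    stations = json.loads(payload)
    return {
        s["stationShortCode"]: s["stationUICCode"]
        for s in stations
        if s.get("passengerTraffic") or not passenger_only
    }


codes = uic_codes_by_short_code(SAMPLE)
print(codes)
# {'EXA': 101, 'EXC': 103}
```

In practice the payload would be downloaded from the URL above and the resulting table compared with the uic_ref values in OSM.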

Trafikverket in Sweden also has open data that includes Primary Location Codes for their stations. The data is behind registration. I’ve not yet compared them with anything to see if the numbers are the same as UIC codes.

I did look into that data not too long ago and found that within Sweden the PLC mostly aligns with UIC. I did, however, find some signs that there are more UIC codes than PLCs, mostly related to workshops, iirc.

The coordinates are also in SWEREF 99, which has made comparing them to OSM a bit harder for me. It’s probably worth noting, however, that the Trafikverket open API also has this data in the stations endpoint, but without any stated license.

Yes, the license is somewhat hidden, but it is CC0 1.0 Universal. They have a link to the license on this page: Hämta öppen data från Trafikverket - www.trafikverket.se

Hi everyone,

thanks for raising this important topic. I fully agree that uic_ref should be replaced with a more consistent namespaced tagging approach.
In my view, ref:UIC would be preferable to ref:uic, as it makes the meaning clearer and highlights that it is an abbreviation. This detail can of course be discussed further, since there is currently no strict consistency in how ref:* namespaces are capitalized.

This mechanical edit aims to improve tagging consistency but does not attempt to resolve the separate issue of uic_ref values that are not actual UIC reference numbers. That data problem will need to be resolved separately.

Using ref:UIC also aligns with other reference tags already in use or that could be used in the future, such as ref:IBNR (Interne Bahnhofsnummer, used in Germany) and ref:RFI (station/location references from the Italian railway infrastructure operator).

I have created a proposal for a one-time mechanical edit that converts:

  • uic_ref → ref:UIC

  • uic_name → name:UIC

  • source (when it refers to the UIC code) → source:UIC
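The key renames listed above could be sketched roughly as follows. This is only an illustration, not the actual edit script: the names `KEY_RENAMES` and `rename_keys` are made up, and the source → source:UIC case is left out because deciding whether a source value refers to the UIC code needs a manual check.

```python
# Hypothetical mapping for the two unconditional renames.
KEY_RENAMES = {"uic_ref": "ref:UIC", "uic_name": "name:UIC"}


def rename_keys(tags):
    """Return a copy of `tags` with the old keys renamed.

    If the target key already exists on the object, the old key is
    kept as-is so that no value is silently overwritten.
    """
    renamed = {}
    for key, value in tags.items():
        new_key = KEY_RENAMES.get(key, key)
        if new_key != key and new_key in tags:
            new_key = key  # conflict: leave for manual review
        renamed[new_key] = value
    return renamed


print(rename_keys({"name": "Example", "uic_ref": "1234567", "uic_name": "Example"}))
# {'name': 'Example', 'ref:UIC': '1234567', 'name:UIC': 'Example'}
```

Values are copied unchanged, so this does nothing about uic_ref values that are not real UIC codes.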

You can find the mechanical edit plan here:

Feedback is very welcome!

Hard disagree from my side. This proposal erodes the clear distinction between uic_ref signifying possibly conflated values and ref:uic/ref:eva signifying deconflated values.

Also a point worth considering is that these subkeys are usually capitalised, so written name:UIC/ref:UIC/ref:EVA.

@hlfan Thanks for your comment. My main intention was simply to replace uic_ref with ref:UIC. I agree that a mechanical edit would lose the distinction between verified and unverified data. I’m fine with doing this manually and using ref:UIC only for checked values.

The wiki page should also be updated to explain this. Can I go ahead and do that?

I was under the distinct impression (and have gone through a discussion on the Polish OSM Discord to this effect already) that keys are to be all lowercase, with just a few accepted exceptions. This wiki article supports this → Any tags you like - OpenStreetMap Wiki → “Ideally, a key is one word, in lowercase, using British English if possible.” (emphasis on “in lowercase” is mine).

So while I’m not necessarily bothered by the accuracy of these IDs as they are hardly ever included in my area to begin with, I would like to highlight that, as far as I see it, ref:uic is more in line with OSM practice and guidelines than ref:UIC.
