Mappers will never accept semi-colons appearing in rendered names on osm-carto. Unless that problem is solved, semi-colon delimiters in the name tag are a non-starter.
So is tagging for the renderer.
Tagging for the renderer is absolutely an accepted practice (hence the space and slash delimited names).
Let’s not mix that up with mis-tagging for the renderer.
It is mistagging for the renderer – a particular renderer at the expense of others. It’s one thing when a multilingual name is specified to have a slash in reality, such as “Aoraki / Mount Cook” in New Zealand, or when a monolingual name contains a slash, as in “Parkview/Woodbrook” in Baltimore. The point of a semicolon delimiter is to be able to distinguish cases like these from cases where a different delimiter wouldn’t be a misspelling.
To the extent that other delimiters are an accepted practice, I don’t agree that it’s an accepted practice in the U.S. Many POIs in the U.S. use a semicolon as a delimiter for multiple names.
I’m surprised that you didn’t bring up the example of Londonderry/Derry! Of course, that’s a case where there are two alternatives to the name, depending on national/political affiliation, and picking one to go in name and the other to go in alt_name would be untenable.
While I (to be clear) agree that semi-colon separation would be the right answer from a data model perspective, the fact that a semi-colon would appear on the osm-carto map makes this a utopian, ivory tower discussion that would be unacceptable to the mapping public.
However, if osm-carto were to treat semi-colons like spaces or line breaks, this would be a 100% workable solution. If osm-carto already does this, please stop me now…
This can go one of two ways:
- Multiple slash-separated values in the name tag are “mis-tagging for the renderer” or “lying to the renderer” and should be called out as a tagging error, despite the practice being widespread.
- The Semi-colon value separator guidance should be revised to state that either a semi-colon or a slash is an acceptable delimiter, both with or without spaces surrounding them.
I don’t particularly care either way, as long as data consumers can be convinced to get on board. Currently some data consumers support semi-colon delimited values, and I’m not aware of any that support slash delimited values. So the semi-colon route seems more likely. Currently " " (space), "-" (hyphen), and "#" are called out on the wiki as old separators that should be replaced.
I mean… it’s obviously not lying to the renderer. Those slash-separated names are in fact names used for the thing. The harshest criticism we can really give this practice is something like “formatting data to look nice on certain renderers while making things hard for other data consumers”. This isn’t the same as tagging a flowerbed as landuse=commercial because it renders in pink on osm-carto.
This is the United States subcategory, and the place in question is in the U.S. There’s no global standard for a particular delimiter in place names – far from it.
In the case of Derry/Londonderry, mappers have had to add a semicolon-delimited alt_name just so Nominatim will find it by both names.
I disagree with the implication that replacing the semicolon with another delimiter or truncating the result at the semicolon is a blue-sky idea. To the contrary, styles based on the Mapbox Streets source replace the semicolon in this way with an em dash.
When GraphHopper tells you to turn onto this road, it replaces the semicolon with a comma:
The Mapbox Navigation SDK puts the names on different rows and says just one of the names aloud for brevity.
This is within the realm of possibility: since 2014, openstreetmap-carto has been replacing semicolons in ref with a newline. However, the maintainers declined to pretty-print semicolons in name based on the opinion that name should not have multiple values in the first place. We can see that this decision has backfired, encouraging mappers to add multiple values anyways but in unstructured form.
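As a rough illustration of the pretty-printing described above, here is a minimal Python sketch of how a data consumer might swap the semicolon for whatever separator suits its output. The function name `format_names` and its `separator` parameter are made up for this example; this is not code from openstreetmap-carto, Mapbox, or GraphHopper.

```python
def format_names(name_value, separator="\n"):
    """Split a semicolon-delimited name value and rejoin it for display.

    Hypothetical helper: the function name and the default separator
    are assumptions for illustration only.
    """
    # Split on the semicolon and drop surrounding whitespace and empty pieces.
    parts = [part.strip() for part in name_value.split(";")]
    parts = [part for part in parts if part]
    return separator.join(parts)


value = "Oneida;kanaˀalóhaleˀ"
print(format_names(value))               # one name per line, as a map label might
print(format_names(value, " \u2014 "))   # joined with an em dash
print(format_names(value, ", "))         # joined with a comma, as in a turn instruction
```

A value with no semicolon passes through unchanged, so a renderer could apply this to every name tag without special-casing.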
For sure, it would be a monumental task to persuade mapping communities in multilingual regions to change their agreed-upon name delimiters, but that’s off-topic here. I’m not aware of any entrenched standard in the U.S. This is an opportunity to establish the semicolon as the standard delimiter for the relatively few cases where name needs multiple values in this country.
The only reason why a slash has become so prevalent in U.S. place names over the past month or so is that one mapper unilaterally applied it across the country. They’ve also been chided by Canadian mappers for the same practice.
It’s a minor lie for sure, but I do consider it a lie under the current guidelines*.
name=Oneida / kanaˀalóhaleˀ asserts to a data consumer that the one primary name for this feature is “Oneida / kanaˀalóhaleˀ”. This is incorrect since that string contains two different names.
name=Oneida;kanaˀalóhaleˀ asserts that the two co-equal primary names for this feature are “Oneida” and “kanaˀalóhaleˀ”. This would be correct in a case where neither name is more primary than the other. In this particular example, that is probably not true as @Minh_Nguyen noted earlier, so just take this as a hypothetical example.
*If the guidelines were to change to say that slashes are acceptable delimiters and data consumers should treat them accordingly (including spaces around them), then I would no longer consider it a lie. However, that would mean that slashes would need to be escaped in names that truly include them. The semi-colon was chosen as a delimiter for good reason.
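To make the difference concrete, here is a small Python sketch of how a data consumer following the current semicolon guidance might read the name tag. The helper `primary_names` is hypothetical, and it deliberately ignores any escaping of literal semicolons.

```python
def primary_names(tags):
    """Return the list of primary names asserted by the name tag.

    Hypothetical helper: only the semicolon is treated as a delimiter,
    and escaping of literal semicolons is not handled.
    """
    value = tags.get("name", "")
    return [part.strip() for part in value.split(";") if part.strip()]


# Two co-equal names, tagged with the semicolon delimiter.
print(primary_names({"name": "Oneida;kanaˀalóhaleˀ"}))

# Slash-delimited: the consumer sees a single name containing a slash.
print(primary_names({"name": "Oneida / kanaˀalóhaleˀ"}))

# Names that really do contain a slash stay intact, which is the point.
print(primary_names({"name": "Aoraki / Mount Cook"}))
print(primary_names({"name": "Parkview/Woodbrook"}))
```

With only the semicolon as a delimiter, the second and third cases look identical to the consumer, even though one is two names and the other is one.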
Here is a test case in the US that I modified to use the semi-colon delimiter.
In addition to render support, editor support will also be needed (example from JOSM):
I believe this is also an issue in osmose.
12 posts were merged into an existing topic: Multiple delimited names in the name tag
Thanks, bidirectional text is a great argument for a formal delimiter. Computers have a lot of difficulty deciding whether mixed-direction text should be reversed or not, and a language-neutral punctuation character doesn’t help at all. In Unicode plain text, displaying bidirectional text in the correct order sometimes requires the use of invisible control characters. I think most mappers would prefer to use a semicolon that would be parsed out more literally.
Osmose has been flagging the slash-delimited name anyways. I asked the developers to exempt any double-barreled name that’s a concatenation of multiple other name tags:
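For anyone wondering what such an exemption could look like, here is a minimal Python sketch of the check. It is not Osmose’s actual code; the function name, the choice of which keys count as “other name tags” (keys starting with name: or ending in _name), and the name:xy placeholder language code are all assumptions for illustration.

```python
import re


def is_concatenation_of_other_names(tags):
    """Check whether a slash-delimited name is built from other name tags.

    Hypothetical sketch: which keys count as "other name tags" is an
    assumption, not a rule taken from Osmose or the wiki.
    """
    name = tags.get("name", "")
    # Split on a slash with optional surrounding whitespace.
    parts = [part.strip() for part in re.split(r"\s*/\s*", name) if part.strip()]
    if len(parts) < 2:
        return False  # no slash, so nothing to exempt
    other_names = {
        value
        for key, value in tags.items()
        if key != "name" and (key.startswith("name:") or key.endswith("_name"))
    }
    # Exempt only if every piece is itself the value of another name tag.
    return all(part in other_names for part in parts)


tags = {
    "name": "Oneida / kanaˀalóhaleˀ",
    "name:en": "Oneida",
    "name:xy": "kanaˀalóhaleˀ",  # "xy" is a placeholder language code
}
print(is_concatenation_of_other_names(tags))
print(is_concatenation_of_other_names({"name": "Parkview/Woodbrook"}))
```

A name like “Parkview/Woodbrook” with no matching name:* tags would not be exempted by this check, so it would still need whatever handling the validator applies to slashes in general.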
Good discussion - thank you!
Generally I’ve seen the slash as a political compromise when a place really does have two equally used names. As alluded upthread, “Londonderry / Derry” is the obvious one, but there’s also “Bruxelles - Brussel” (with a dash rather than a slash… sigh).
In the UK, and Wales in particular, we follow the general principle (which I think is sensible) that the name= tag reflects the dominant language spoken locally. So it’s Swansea rather than Abertawe, but Y Bala rather than Bala. There are a few rare examples where the slash is used because there isn’t one dominant language (Aberteifi / Cardigan, and Ynys Môn / Isle of Anglesey).
But I’m aware that this isn’t a universal practice, and we did have one rather awkward episode a few years back where a mapper from another country came to study in Wales, and assumed that the “slashes at every opportunity” approach from his home country would also apply there.
It seems logical to me that applying the Wales principle in the Oneida case would result in removing " / kanaˀalóhaleˀ" from the name tag, given @Minh_Nguyen’s stat of a 1% population. But I’ll leave it to you locals to have that fight.
In the meantime I’ll do a little bit of work on my Lua rendering profile for North America to filter out this sort of thing.
To try and keep this topic focused on Native American place names in the US, I’ve split off some posts to a new topic in #general. Let’s continue any further globally focused discussion there.
I’ve also seen the ‘/’ practice in New York State on the Seneca reservation. Take the city of Salamanca as an example (which is within the reservation). I’m in support of it because it is a bilingual area with active Seneca language restoration and awareness. Many signs erected by the state in the area are bilingual.
The only reason for any type of separator character is to be able to join two or more names. If the tribe has one official name for their land, then there should be no need for a separator.
After a bit of warmth / heat (and thankfully, eventually light) at some talk-ca posts about someone rather aggressively (and insensitively) re-naming some boundary=aboriginal_lands (multi)polygons in Canada, the (USA Native American Indian) section of United States admin level - OpenStreetMap Wiki has expanded a bit to state that these are emerging along a “case-by-case” basis. There are a number of sensitivities and possibilities here, and so the reality is that “many options are possible” on these entities. You might have place=* nodes with a value of village, you might not. You might have a boundary=aboriginal_lands tag along with an admin_level=* tag (not the-usually-associated-with-this-tag of boundary=administrative). You might have tagging in “the local language” (as an adjunct tagging convention, or even the only method by which things are named). You might find that some of these have further subdivisions “inside of” the boundary=aboriginal_lands polygon.
What that wiki says (in essence) is to, largely speaking, “leave it up to the people upon the land” to describe this / these attributes, and use their conventions as OSM’s best practices for their land (and people). The net result is that parsing these data / figuring out what to do with them for any given use case is going to be rather open-ended in its necessity to cope with a wide variety of data here. Such “parsing difficulties on the back end of our data” come as part of the (higher) cost of “case by case.”
As far as I can tell, most of these names were not introduced with the level of care that you’re recommending, but rather indiscriminately and unilaterally by one mapper, not only in the U.S. but all across the Western hemisphere.
While I appreciate the need for indigenous languages’ representation in the database, Oneida’s slash-delimited name seems to be a case of overzealousness. It isn’t located inside a reservation, though the Oneida Nation does run a casino outside town. Judging from their changeset comments, their general approach is apparently to systematically copy names out of online dictionaries. But these dictionaries only say what the name of the place is in a given language, not that it’s one of the main languages there.
As for the choice of delimiter, it seems to be a personal whim rather than a matter of principle. So far no one has claimed that the tribal authorities have a strong preference for the slash over other punctuation that a data consumer would apply if the standard semicolon were tagged (which might well be a slash anyways). A “case by case” approach sounds like a tradeoff in favor of something theoretical at the expense of something real.
I reflect the consensus from the talk-ca thread. “Case by case” seems to be the realpolitik of the situation. This literally means “practical rather than moral or ideological considerations,” but actually practicality AND ideological considerations are important to consider, given the truly sensitive nature and histories of the subject.
My “introduction” towards these names (and conventions) is forward-looking in light of the talk-ca discussion, and only partly based upon the data I saw already (and do see now) in our map (data). I have, though, been looking at these data for many years and was (I hope) somewhat helpful in the transition away from protect_class (numerical values) and towards the (more harmonious, at least since its approval in 2019) boundary=aboriginal_lands.
I realize “slash-separated” is the subject here, but I decided to zoom out to a broader sense of how truly wide these data can go, as I believe the project as a whole, worldwide, benefits from this wider perspective on the topic.
A “so far recap:” (yes, I’ve skipped some, hopefully minor things, feel free to re-inflate them)
It’s a lone mapper thing and de facto accepted practice, at least from a USA/Canada/North American (México, too? unclear) perspective (I think that’s “about” what Zeke meant). I’ll toss in that adding México and/or saying “North America” couldn’t hurt (add Spanish and some other languages, which is already done with French and some other languages with Canada). There is a trend towards “all of North America” when two of USA, Canada and México are discussed in terms of “how we do things ‘around here’”.
These (native names using / slash characters) are added by non-locals, largely one, who doesn’t seem to get a lot of input “from the people of the land,” now widely noticed, thanks to Minh for pulling tight focus on this. “Engaging local tribal communities in mapping their own lands” is an expressed ideal. The “sense of stewardship” in our map “as ours to present, with honor” now lies before us. We’ll want to tag our best to map our best.
name:lang tags seem a solid, established method, though sometimes getting which dialect is correct can challenge. I’ll toss in this might be where a scholar (see the recent talk-ca thread [Talk-ca] First Nations reserve naming) or those with “already some overlap” here (these topics, these sorts of conversations) could step in and help people chat together…which ISO codes of this or that dialect are correct, reducing cultural friction perhaps, kind of stuff. Someone listening on the OSM side to structure that into a “start here” approach can work. Really, the more who listen, the more the ball can roll towards good motion. It may be three or four people who can nod heads together, it may be a whole band of people take a vote in a meeting, “we’ll see.”
Sometimes alt_name is used; downstream users of our data should know that. We might want to say that descriptively and take pause to invent something prescriptive, or not, and remain in “observe and learn” mode. There are a lot of possible directions we could go in (and already are going in), and this really is a highly fluid kind of design, exceedingly difficult if not impossible to predict well. We really must be (at least partly) very much in listening mode.
Semicolons can work when “a name tag does have more than one name in it” and the slash character / is a tagging error. “Semicolons seem here to stay.” The characters solidus /, space, hyphen - and octothorpe # have been called out as “do not use these in names” (they are old separators we want to replace). Many downstream data users “cope with” semicolons reasonably predictably well. Establishing (as a line of “the paint has dried hard”) that the semicolon is a separator as we are describing here causes some downstream ripple in things like JOSM and Osmose. Unicode / mixed directional text and other difficulties (perhaps invisible or quoted characters are necessary in some cases / contexts) complicate this.
In some cases (e.g. rendering text for map tiles) some of this can be done with tweaks to Lua profiles. The slash character / is used in political and cultural distinctions on multiple continents and for multiple reasons.
A “best initial seed case” scenario might be a name:xy=* tag in the local preferred language, plus a name=* tag in “English” (that might be Canadian English, US English, Australian English…) as is OSM’s convention to tag a name=* in English, if possible and appropriate. Other (any?) languages are welcome to be added. Slash characters are to be avoided (along with hyphen, spaces and octothorpe) going forward. Let’s think about how we might communicate intended changes (like deprecating slashes) to downstream users and bump it around.
It seems like there is some fairly wide understanding about this (hundreds of views of this topic). Can someone else take the baton away from me about now and keep running this towards a finish line? It may actually turn into something like a mechanical edit (to filter out slash characters…) as an earlier step. Maybe a middle step.